PDFMathTranslate
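MCP Server · Free
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats. AI-powered full-text bilingual translation of PDF documents that fully preserves the layout; supports services such as Google, DeepL, Ollama, and OpenAI, and provides CLI, GUI, MCP, Docker, and Zotero interfaces.
Capabilities (14 decomposed)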
layout-preserving pdf translation with structural reconstruction
Medium confidence
Translates scientific PDF documents while maintaining the original layout, columns, spacing, and positioning through a multi-stage pipeline: PDF parsing via PDFConverterEx/PDFPageInterpreterEx for structure detection, content classification (text, formula, figure, table), AI-powered translation with caching, and document reconstruction via PyMuPDF with font injection. Font pattern matching detects mathematical formulas and keeps them out of translation, which prevents corruption of equations and special symbols.
Uses font pattern matching in PDFConverterEx to detect mathematical formulas and preserve them as untranslatable elements, combined with BabelDOC backend for intelligent content classification and PyMuPDF-based reconstruction that maintains precise spatial positioning and multi-column layouts — most competitors either lose formatting or fail on math-heavy documents
Outperforms generic PDF translators (Google Translate, Microsoft Translator) by preserving mathematical formulas and complex layouts; outperforms academic-focused tools by supporting 24+ translation services and local LLMs instead of single-provider lock-in
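One common way to keep formulas untouched while the surrounding text is translated is to swap each formula for a placeholder token and restore it afterwards. The sketch below illustrates that idea only; the `{v0}`-style token format and the function names `mask_formulas` and `unmask_formulas` are assumptions made for this example and are not taken from the project's code.

```python
import re


def mask_formulas(spans):
    """Join (text, is_formula) spans, replacing each formula with {vN}.

    Hypothetical helper: the placeholder format is an assumption.
    """
    parts, formulas = [], []
    for text, is_formula in spans:
        if is_formula:
            parts.append(f"{{v{len(formulas)}}}")
            formulas.append(text)
        else:
            parts.append(text)
    return "".join(parts), formulas


def unmask_formulas(translated, formulas):
    """Put the original formulas back in place of their placeholders."""

    def restore(match):
        index = int(match.group(1))
        # Leave unknown placeholders as-is rather than failing.
        return formulas[index] if index < len(formulas) else match.group(0)

    return re.sub(r"\{v(\d+)\}", restore, translated)
```

In use, the masked string is what gets sent to the translation service, and `unmask_formulas` runs on the returned text before the page is rebuilt.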
multi-service translation engine with intelligent caching
Medium confidence
Abstracts 24+ translation services (Google Translate, DeepL, OpenAI, Anthropic, Ollama, etc.) behind a unified BaseTranslator interface, routing requests based on configuration and cost optimization. Implements an SQLite-based translation cache that stores previously translated segments, reducing redundant API calls and costs. Supports custom prompts per service and batch processing via thread pools for parallel translation of document segments.
Implements BaseTranslator subclass pattern with pluggable service adapters (Google, DeepL, OpenAI, Anthropic, Ollama) plus SQLite-based segment caching that tracks translation history and cost per service — enables cost-aware routing and provider fallback without reprocessing cached content
More flexible than single-provider solutions (Google Translate API, DeepL API) by supporting local LLMs and caching; more cost-effective than cloud-only services by reducing redundant API calls through intelligent caching
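The pluggable-adapter pattern described here can be sketched as an abstract base class plus a registry keyed by service name. This is a minimal illustration under stated assumptions: the constructor arguments, the `name` attribute, the `REGISTRY` dictionary, and the two toy backends are hypothetical and stand in for real service adapters; the project's actual BaseTranslator may look different.

```python
from abc import ABC, abstractmethod


class BaseTranslator(ABC):
    """Illustrative unified interface; the real class may differ."""

    name = "base"

    def __init__(self, lang_in: str, lang_out: str):
        self.lang_in = lang_in
        self.lang_out = lang_out

    @abstractmethod
    def translate(self, text: str) -> str:
        """Translate one segment from lang_in to lang_out."""


class UppercaseTranslator(BaseTranslator):
    """Toy backend for the example: 'translates' by upper-casing."""

    name = "upper"

    def translate(self, text: str) -> str:
        return text.upper()


class ReverseTranslator(BaseTranslator):
    """Second toy backend: 'translates' by reversing the string."""

    name = "reverse"

    def translate(self, text: str) -> str:
        return text[::-1]


# Registry keyed by service name, so configuration can select a backend.
REGISTRY = {cls.name: cls for cls in (UppercaseTranslator, ReverseTranslator)}


def make_translator(service: str, lang_in: str, lang_out: str) -> BaseTranslator:
    """Instantiate the backend registered under `service`."""
    try:
        return REGISTRY[service](lang_in, lang_out)
    except KeyError:
        raise ValueError(f"unknown translation service: {service!r}") from None
```

Callers only depend on `translate()`, so switching providers is a configuration change rather than a code change.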
intelligent translation caching with segment deduplication
Medium confidence
An SQLite-based translation cache (TranslationCache class) stores previously translated segments with metadata (source text, target language, service, timestamp). Implements exact-match deduplication to avoid re-translating identical phrases, reducing API costs and improving performance. The cache is persistent across sessions and supports cache invalidation, statistics tracking, and cost analysis per service.
TranslationCache class in pdf2zh/cache.py uses SQLite with segment hashing for exact-match deduplication, tracking cost per service and enabling cache statistics — enables cost-aware translation routing and audit trails without external dependencies
More cost-effective than stateless translation by eliminating redundant API calls; more auditable than in-memory caches by persisting to SQLite with metadata
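Exact-match caching with segment hashing can be sketched with the standard-library `sqlite3` and `hashlib` modules. The class name `SegmentCache`, the table schema, and the key layout below are assumptions for illustration; they are not the project's TranslationCache.

```python
import hashlib
import sqlite3


class SegmentCache:
    """Illustrative exact-match cache backed by SQLite."""

    def __init__(self, path: str = ":memory:"):
        # Pass a file path instead of ":memory:" to persist across sessions.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, "
            "translation TEXT NOT NULL, "
            "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
        )
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(service: str, lang_out: str, text: str) -> str:
        # Hash service, target language, and source text together so the
        # same sentence is cached separately per service and language.
        raw = "\x1f".join((service, lang_out, text)).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, service: str, lang_out: str, text: str):
        row = self.conn.execute(
            "SELECT translation FROM cache WHERE key = ?",
            (self._key(service, lang_out, text),),
        ).fetchone()
        if row is None:
            self.misses += 1
            return None
        self.hits += 1
        return row[0]

    def put(self, service: str, lang_out: str, text: str, translation: str):
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, translation) VALUES (?, ?)",
            (self._key(service, lang_out, text), translation),
        )
        self.conn.commit()
```

Because the key is a hash of the exact source text, even a one-character difference produces a miss, which matches the exact-match limitation noted for this capability.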
pdf parsing with layout-aware content extraction
Medium confidence
The PDFConverterEx and PDFPageInterpreterEx classes parse PDF structure to extract text with precise spatial coordinates, column detection, and reading order inference. Uses PyMuPDF's layout analysis to identify text blocks, figures, tables, and headers/footers, enabling content-aware translation that respects document structure. Handles complex layouts (multi-column, rotated text, overlapping elements) through geometric analysis.
PDFConverterEx and PDFPageInterpreterEx in pdf2zh/pdf_parser.py use PyMuPDF's layout analysis to extract text with precise coordinates and infer reading order through geometric analysis — enables column-aware translation and layout-preserving reconstruction
More layout-aware than simple text extraction (pdfplumber, PyPDF2) by using geometric analysis; more accurate than regex-based column detection by leveraging PDF structure
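Reading-order inference from coordinates can be illustrated with a deliberately simple two-column heuristic: assign each block to the left or right column by its horizontal center, then read each column top to bottom. The `Block` type and `reading_order` function are hypothetical, the coordinate system is assumed to have its origin at the top-left with y increasing downward, and full-width blocks such as titles are not handled; a real parser needs considerably more than this.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """A text block with its bounding box (top-left origin assumed)."""

    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def reading_order(blocks: List[Block], page_width: float) -> List[Block]:
    """Order blocks for a two-column page: left column first, then right."""
    mid = page_width / 2

    def key(block: Block):
        center = (block.x0 + block.x1) / 2
        column = 0 if center < mid else 1
        # Within a column, read top to bottom, then left to right.
        return (column, block.y0, block.x0)

    return sorted(blocks, key=key)
```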
exception handling and error recovery with fallback strategies
Medium confidence
Implements exception handling throughout the translation pipeline with automatic fallback strategies: if the primary translation service fails, it retries with a secondary service; if PDF parsing fails, it attempts alternative parsing methods; if font embedding fails, it falls back to system fonts. Logs detailed error context for debugging and provides user-friendly error messages.
Exception handling in pdf2zh/exceptions.py implements multi-level fallback: service failure → retry with backoff → fallback to secondary service → skip segment with warning — enables graceful degradation without stopping entire translation pipeline
More resilient than fail-fast approaches by implementing automatic fallback; more transparent than silent error suppression by logging detailed context
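The fallback chain (retry with backoff, then a secondary service, then skip the segment with a warning) can be sketched as below. The function name `translate_with_fallback`, its parameters, and the choice to return the untranslated source text when every service fails are assumptions for this example, not the project's actual behavior.

```python
import logging
import time

logger = logging.getLogger(__name__)


def translate_with_fallback(text, services, retries=2, base_delay=0.5,
                            sleep=time.sleep):
    """Try each service in order, retrying with exponential backoff.

    `services` is a list of callables taking and returning a string.
    If every service fails, the segment is skipped: a warning is logged
    and the source text is returned unchanged.
    """
    for service in services:
        for attempt in range(retries + 1):
            try:
                return service(text)
            except Exception as exc:  # broad on purpose for the sketch
                logger.debug("attempt %d failed: %s", attempt + 1, exc)
                if attempt < retries:
                    # Delays grow as base_delay, 2*base_delay, 4*base_delay...
                    sleep(base_delay * 2 ** attempt)
    logger.warning("all services failed; leaving segment untranslated")
    return text
```

The `sleep` parameter is injectable so the backoff can be disabled in tests.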
configuration management with environment variable and file-based settings
Medium confidence
A centralized configuration system (pdf2zh/config.py) supports YAML/JSON configuration files, environment variables, and command-line arguments with hierarchical precedence. Users can configure translation services, custom prompts, font paths, cache settings, thread pool size, and logging without modifying code. Configuration is validated on load, with helpful error messages for invalid settings.
Configuration system in pdf2zh/config.py supports hierarchical precedence (CLI args > env vars > config file > defaults) with YAML/JSON parsing and validation — enables flexible deployment across environments without code changes
More flexible than hardcoded settings by supporting multiple configuration sources; more user-friendly than CLI-only configuration by supporting configuration files
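The stated precedence (CLI args > env vars > config file > defaults) maps naturally onto `collections.ChainMap`, which looks keys up in its maps from left to right. In this sketch the setting names, the `PDF2ZH_` environment prefix, and the `resolve_config` function are hypothetical, and the file configuration is assumed to have been parsed into a dictionary already.

```python
from collections import ChainMap

# Hypothetical settings and defaults, used only for this example.
DEFAULTS = {"service": "google", "threads": 4, "lang_out": "zh"}
ENV_PREFIX = "PDF2ZH_"  # assumed prefix, not confirmed by the source


def resolve_config(cli_args: dict, environ: dict, file_cfg: dict) -> dict:
    """Merge settings with precedence CLI > env > file > defaults."""
    # CLI options left unset (None) must not shadow lower layers.
    cli = {k: v for k, v in cli_args.items() if v is not None}

    env = {}
    for key, default in DEFAULTS.items():
        raw = environ.get(ENV_PREFIX + key.upper())
        if raw is not None:
            # Environment values are strings; coerce to the default's type.
            env[key] = type(default)(raw)

    unknown = (set(cli) | set(file_cfg)) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")

    return dict(ChainMap(cli, env, file_cfg, DEFAULTS))
```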
content-aware classification and preservation system
Medium confidence
Classifies PDF content into four categories (text, mathematical formulas, figures, tables) using font pattern matching and layout heuristics, then applies category-specific handling: text is translated, while formulas, figures, and tables are preserved as-is or minimally modified. Uses the TranslateConverter class with font exception handling to detect mathematical notation (subscripts, superscripts, special Unicode ranges) and prevent translation of non-translatable elements.
Uses font pattern matching in TranslateConverter to detect mathematical notation by analyzing font properties (subscript/superscript flags, Unicode ranges for mathematical alphanumeric symbols U+1D400-U+1D7FF) rather than regex over the text content, which helps preserve formulas with fewer false positives
More accurate than regex-based formula detection used by some competitors; more efficient than OCR-based approaches by leveraging PDF font metadata directly
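Detection based on font names and Unicode ranges can be sketched as follows. The font-name pattern here is an assumption that covers a few Computer Modern and AMS math font families (CMMI, CMSY, CMEX, MSAM, MSBM); it is not the pattern the project uses, and the function names are hypothetical.

```python
import re

# Assumed pattern: matches common TeX math font names, including names
# with a subset prefix such as "ABCDEF+CMMI10".
MATH_FONT_RE = re.compile(r"(CMMI|CMSY|CMEX|MSAM|MSBM|Math)")


def is_math_char(ch: str) -> bool:
    """True for characters in two Unicode blocks used mostly for math."""
    cp = ord(ch)
    return (
        0x1D400 <= cp <= 0x1D7FF  # Mathematical Alphanumeric Symbols
        or 0x2200 <= cp <= 0x22FF  # Mathematical Operators
    )


def is_formula_span(font_name: str, text: str) -> bool:
    """Treat a span as formula if its font or any character looks mathematical."""
    if MATH_FONT_RE.search(font_name):
        return True
    return any(is_math_char(ch) for ch in text)
```

Checking the font first means plain Latin letters set in a math font (for example a variable `x`) are still classified as formula content.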
mcp server interface for llm-native document translation
Medium confidence
Exposes PDFMathTranslate as a Model Context Protocol (MCP) server via pdf2zh/mcp.py, allowing LLM applications (Claude, ChatGPT with MCP support) to invoke translation operations as native tools. Implements MCP resource and tool schemas for document upload, translation configuration, and result retrieval, enabling integration into agentic workflows without custom API wrappers.
Implements full MCP server protocol (pdf2zh/mcp.py) with resource and tool schemas, allowing LLMs to treat PDF translation as a native capability rather than external API — enables agentic workflows where document translation is a first-class operation alongside reasoning and planning
More integrated than REST API approaches by leveraging MCP's native LLM tool calling; more flexible than single-LLM plugins by supporting any MCP-compatible application
multi-interface deployment with cli, gui, api, and docker
Medium confidence
Provides five distinct entry points for the same translation engine: a CLI (pdf2zh/__main__.py) for batch scripting, a Gradio-based web GUI (pdf2zh/gui.py) for interactive use, a Flask HTTP API (pdf2zh/api.py) for service integration, a Python API for programmatic access, and Docker containers for containerized deployment. All interfaces share the same core translate() and translate_stream() functions, enabling consistent behavior across deployment models.
Implements five independent entry points (CLI, Gradio GUI, Flask API, Python API, Docker) all delegating to shared translate() and translate_stream() core functions in pdf2zh/high_level.py — enables single codebase to serve CLI users, web users, API consumers, and containerized deployments without duplication
More accessible than API-only solutions by providing GUI and CLI; more flexible than single-interface tools by supporting both interactive and batch workflows; more deployable than desktop-only tools by supporting containerization
zotero plugin integration for bibliography-aware translation
Medium confidence
Provides a Zotero plugin that intercepts PDF imports and automatically translates documents while preserving bibliography metadata, citations, and reference formatting. Integrates with Zotero's document management system to store both original and translated PDFs, so researchers can maintain bilingual reference libraries without manual file management.
Zotero plugin (pdf2zh/zotero_plugin.py) hooks into Zotero's document import pipeline to automatically trigger translation while preserving bibliography metadata and maintaining bilingual library structure — enables seamless workflow integration without requiring researchers to manually invoke translation tools
More integrated than manual translation workflows; more bibliography-aware than generic PDF translators that ignore citation metadata
streaming translation with progressive pdf reconstruction
Medium confidence
Implements a translate_stream() function that yields translated segments progressively rather than buffering the entire document, enabling real-time progress feedback and memory-efficient processing of large PDFs. Reconstructs the PDF incrementally as segments complete translation, so users can see partial results before the full document finishes processing.
translate_stream() generator in pdf2zh/high_level.py yields translation results segment-by-segment while incrementally reconstructing PDF via PyMuPDF, enabling real-time progress UI and memory-efficient processing — most competitors buffer entire documents before reconstruction
More responsive than batch-only approaches by providing real-time feedback; more memory-efficient than buffering entire documents; more suitable for web applications requiring streaming responses
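Progressive output of this kind is typically built on a Python generator, which produces each result only when the consumer asks for it. The sketch below shows the pattern with a hypothetical `stream_translate` function and `Progress` record; it does not reproduce the signature of the project's translate_stream() and leaves out PDF reconstruction entirely.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Progress(NamedTuple):
    """One progress update: segments done, total segments, latest result."""

    done: int
    total: int
    translated: str


def stream_translate(segments: Iterable[str],
                     translate_fn: Callable[[str], str]) -> Iterator[Progress]:
    """Yield one Progress per segment as soon as it has been translated."""
    segments = list(segments)  # needed up front to report the total
    total = len(segments)
    for index, segment in enumerate(segments, start=1):
        # Work happens lazily: nothing is translated until the caller
        # requests the next item.
        yield Progress(index, total, translate_fn(segment))
```

A caller such as a web UI can loop over the generator and update a progress bar from `done / total` after each item.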
font management and multilingual character support
Medium confidence
Manages font substitution and injection for target languages, detecting missing glyphs and automatically selecting appropriate fonts from the system or a bundled font library. Supports CJK (Chinese, Japanese, Korean), Cyrillic, Arabic, and other scripts by embedding fonts into reconstructed PDFs, so translated documents render correctly regardless of system font availability.
Font management system in pdf2zh/font_manager.py detects missing glyphs for target language, selects appropriate fonts from system or bundled library, and embeds them into reconstructed PDFs — enables correct rendering of CJK, Cyrillic, and other scripts without requiring target language fonts on user's system
More robust than solutions relying on system fonts (which may be unavailable); more comprehensive than single-script solutions by supporting CJK, Cyrillic, Arabic, and other scripts
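Glyph-coverage checking can be illustrated without loading real font files by modeling each font as a set of code point ranges. Everything in this sketch is hypothetical: the font names, the coverage table, and the `missing_glyphs` and `pick_font` helpers. A real implementation would read coverage from the font's character map instead.

```python
# Hypothetical coverage table: font name -> inclusive code point ranges.
FONT_COVERAGE = {
    "LatinSans": [(0x0020, 0x024F)],
    "CJKSans": [(0x0020, 0x007E), (0x3040, 0x30FF), (0x4E00, 0x9FFF)],
}


def missing_glyphs(text, ranges):
    """Return the set of characters in `text` not covered by `ranges`."""
    return {
        ch for ch in text
        if not any(lo <= ord(ch) <= hi for lo, hi in ranges)
    }


def pick_font(text, coverage=FONT_COVERAGE):
    """Return the first font covering every character, or None."""
    for name, ranges in coverage.items():
        if not missing_glyphs(text, ranges):
            return name
    return None
```

When `pick_font` returns `None`, a caller would need another strategy, such as splitting the text across several fonts.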
custom prompt engineering per translation service
Medium confidence
Allows users to define custom prompts for each translation service (OpenAI, Anthropic, Ollama, etc.) to control translation style, terminology, and domain-specific handling. Prompts are stored in configuration files and applied per segment, giving fine-grained control over translation quality without modifying code. Supports prompt templating with variables for context (document title, language pair, segment number).
Configuration-driven prompt system in pdf2zh/config.py allows per-service custom prompts with variable templating (document context, language pair, segment metadata) — enables domain-specific translation tuning without code changes or service-specific API wrappers
More flexible than fixed-prompt solutions by allowing customization per service; more accessible than code-based prompt engineering by using configuration files
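Per-service prompt templates with variables can be sketched using `string.Template` from the standard library. The template wording, the variable names (`lang_in`, `lang_out`, `text`), and the `build_prompt` helper are assumptions for illustration; the project's own template syntax is not specified in this listing.

```python
from string import Template

# Fallback used when a service has no custom prompt (assumed wording).
DEFAULT_PROMPT = Template(
    "Translate the following text from ${lang_in} to ${lang_out}. "
    "Keep formula placeholders unchanged.\n\n${text}"
)

# Hypothetical per-service overrides, as they might be loaded from config.
SERVICE_PROMPTS = {
    "openai": Template(
        "You are a professional ${lang_out} translator of scientific papers.\n"
        "Source language: ${lang_in}\n\n${text}"
    ),
}


def build_prompt(service: str, **variables: str) -> str:
    """Fill the service's template; raises KeyError if a variable is missing."""
    template = SERVICE_PROMPTS.get(service, DEFAULT_PROMPT)
    return template.substitute(variables)
```

`substitute` only expands `$` markers in the template itself, so a `$` inside the source text passes through unchanged.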
batch processing with thread pool parallelization
Medium confidence
Implements multi-threaded translation via a thread pool in the translation engine, allowing document segments to be translated in parallel. A configurable thread count balances parallelism against API rate limits and memory usage. Handles thread-safe access to the translation cache and manages concurrent API requests to avoid rate limiting.
Thread pool implementation in pdf2zh/translate.py with configurable worker count and thread-safe cache access enables parallel segment translation while respecting API rate limits — balances throughput against rate limit constraints better than sequential processing
Faster than sequential translation for multi-segment documents; more rate-limit-aware than naive parallelization by implementing backoff and queue management
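Parallel segment translation with a shared, lock-protected cache can be sketched with `concurrent.futures.ThreadPoolExecutor`. The function name `translate_parallel` and the in-memory dictionary cache are assumptions for this example; rate limiting and backoff are left out.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def translate_parallel(segments, translate_fn, max_workers=4):
    """Translate segments concurrently, returning results in input order."""
    cache = {}
    lock = threading.Lock()

    def work(segment):
        with lock:
            if segment in cache:
                return cache[segment]
        # The slow call runs outside the lock so threads do not serialize.
        # Two threads may occasionally translate the same new segment at
        # once; setdefault keeps whichever result was stored first.
        result = translate_fn(segment)
        with lock:
            return cache.setdefault(segment, result)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input segments.
        return list(pool.map(work, segments))
```

Threads suit this workload because each call spends most of its time waiting on network I/O; `max_workers` is the knob to lower when a provider's rate limit is hit.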
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with PDFMathTranslate, ranked by overlap. Discovered automatically through the match graph.
Immersive Translate
Bilingual side-by-side webpage translation extension.
X-doc AI
The most accurate AI translator
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation (SeamlessM4T)
Genius PDF
Transform PDFs with AI: comprehend, translate, store...
Immersive Translate
Revolutionize your web experience with seamless, customizable, bilingual translations across...
Llama 3.1 405B
Largest open-weight model at 405B parameters.
Best For
- ✓ researchers translating scientific papers across languages
- ✓ academic institutions processing bulk document translations
- ✓ teams requiring bilingual document archives with preserved formatting
- ✓ teams managing translation costs across multiple documents
- ✓ organizations with privacy requirements (using local LLM backends)
- ✓ developers building translation-as-a-service platforms
- ✓ researchers comparing translation quality across providers
- ✓ organizations translating multiple documents with overlapping content
Known Limitations
- ⚠ Complex handwritten annotations may not translate accurately
- ⚠ OCR-dependent PDFs (scanned documents) require additional preprocessing
- ⚠ PDFs that mix several languages may have inconsistent translation quality
- ⚠ Font substitution may occur if target-language fonts are unavailable on the system
- ⚠ Cache hits only work for exact phrase matches; paraphrased content bypasses the cache
- ⚠ Thread pool adds ~50-200 ms of overhead per batch, depending on segment count
Repository Details
Last commit: Apr 20, 2026