json-repair
Model | Free
Repair JSON! A Java library for fixing JSON anomalies generated by LLMs.
Capabilities (11 decomposed)
antlr-based malformed json structural repair with strategy pattern orchestration
Medium confidence
Repairs syntactically broken JSON by using an ANTLR parser to identify structural errors (missing braces, brackets, parentheses) and applying configurable repair strategies (SimpleRepairStrategy, CorrectRepairStrategy) to fix them. The JSONRepair orchestrator class manages the repair pipeline, attempting fixes iteratively up to a configurable limit, with error context tracked via the Expecting class to identify which tokens are missing at failure points.
Uses ANTLR-based syntax-aware parsing with strategy pattern for multi-pass repair attempts, rather than regex-based string manipulation; tracks error context via Expecting class to understand what tokens are missing at specific parse failure points, enabling targeted repairs instead of blind string patching
More structurally aware than regex-based JSON repair tools because it parses the full token stream and understands nesting depth, allowing it to correctly repair complex nested structures where simpler tools would fail or produce invalid output
json content extraction from mixed text with fallback repair
Medium confidence
Extracts valid JSON objects or arrays from larger text blocks (e.g., LLM responses with explanatory text before/after JSON) using SimpleExtractStrategy, which scans for JSON delimiters and isolates contiguous JSON content. Extracted JSON is then passed through the repair pipeline if it contains anomalies, enabling end-to-end recovery of structured data from unstructured LLM outputs.
Combines extraction (SimpleExtractStrategy) with repair in a single pipeline, so extracted JSON that is malformed is automatically repaired; most tools extract OR repair, not both in sequence
Handles the full end-to-end workflow of extracting JSON from noisy LLM text and fixing it in one call, whereas regex-based extractors require separate repair steps and often fail on partially-formed JSON
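The delimiter scan described above can be sketched in plain Java. This is an illustrative, dependency-free approximation and not the library's SimpleExtractStrategy: the class name `JsonExtractSketch` and the method `extract` are hypothetical, and the real strategy may handle more cases.

```java
import java.util.Optional;

// Hypothetical sketch of delimiter-based extraction; not the library's code.
final class JsonExtractSketch {

    /** Returns the first contiguous {...} or [...] candidate found in text, if any. */
    static Optional<String> extract(String text) {
        // Find the first opening delimiter.
        int start = -1;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '{' || c == '[') {
                start = i;
                break;
            }
        }
        if (start < 0) {
            return Optional.empty();
        }

        // Walk forward, tracking nesting depth and skipping string contents.
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
                if (depth == 0) {
                    return Optional.of(text.substring(start, i + 1));
                }
            }
        }
        // Never balanced: return the unterminated tail so a repair step can close it.
        return Optional.of(text.substring(start));
    }
}
```

Returning the unterminated tail instead of giving up mirrors the extract-then-repair flow: a truncated candidate is still worth handing to the repair pipeline.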
integration test suite with diverse json anomaly scenarios
Medium confidence
Includes comprehensive integration tests (IntegrationTests class) covering a wide range of JSON anomalies produced by LLMs: missing braces/brackets, unquoted keys/values, trailing commas, missing outer delimiters, and nested structure errors. Tests are organized by anomaly type and include both positive cases (repair succeeds) and negative cases (repair fails gracefully), providing confidence in repair behavior across different LLM output patterns.
Organizes tests by JSON anomaly type with explicit test cases for each repair strategy, providing clear visibility into what anomalies are handled and which are not; most JSON repair tools lack comprehensive test documentation
Provides explicit test coverage for different LLM output anomalies, enabling developers to understand repair behavior and limitations before integrating into production systems
configurable multi-pass repair attempt strategy with iteration limits
Medium confidence
Implements a configurable repair pipeline via JSONRepairConfig that allows developers to set maximum repair attempt counts and extraction modes. The JSONRepair orchestrator applies repair strategies iteratively, re-parsing after each fix attempt until either the JSON is valid or the attempt limit is reached. This prevents infinite loops while allowing heuristic-based repairs to converge on valid output through multiple passes.
Exposes repair attempt limits and extraction mode as first-class configuration parameters via JSONRepairConfig, allowing developers to tune repair behavior without modifying code; most JSON repair tools have fixed repair logic with no tuning surface
Provides explicit control over repair aggressiveness and resource consumption, whereas most JSON repair libraries apply a fixed set of heuristics with no way to adjust behavior for different LLM output characteristics
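A bounded multi-pass loop of this kind might look like the sketch below. It does not use the JSONRepair or JSONRepairConfig API: `RepairLoopSketch`, its `repair` method, and the `isValid` predicate are hypothetical stand-ins, with the validity check and the strategies supplied by the caller.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical sketch of an attempt-limited repair loop; not the library's orchestrator.
final class RepairLoopSketch {

    /**
     * Applies the first strategy that changes the input, re-validates, and repeats
     * until the input is valid, no strategy makes progress, or maxAttempts is reached.
     */
    static Optional<String> repair(String input,
                                   Predicate<String> isValid,
                                   List<UnaryOperator<String>> strategies,
                                   int maxAttempts) {
        String current = input;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isValid.test(current)) {
                return Optional.of(current);
            }
            String next = current;
            for (UnaryOperator<String> strategy : strategies) {
                next = strategy.apply(current);
                if (!next.equals(current)) {
                    break; // this strategy made progress; re-validate before trying others
                }
            }
            if (next.equals(current)) {
                return Optional.empty(); // no strategy changed anything, so stop early
            }
            current = next;
        }
        // The last pass may have produced valid output right at the limit.
        return isValid.test(current) ? Optional.of(current) : Optional.empty();
    }
}
```

The attempt limit bounds the work per input, and the early exit on "no progress" avoids spending the remaining attempts on a string that no strategy can change.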
error context tracking and missing token identification via expecting class
Medium confidence
Tracks parse error context through the Expecting class, which records what tokens the parser expected at the point of failure (e.g., 'expected }' or 'expected ]'). This error context is used by repair strategies to make targeted fixes rather than blind string manipulation. When ANTLR parsing fails, the Expecting object captures the expected token type and position, enabling the repair strategy to insert the correct missing delimiter at the right location.
Uses ANTLR error listener integration to capture expected token context at parse failure points, enabling context-aware repairs; most JSON repair tools use simple regex or string-based heuristics without understanding what the parser expected
Provides semantic understanding of parse failures through token expectations, allowing repairs to be targeted and correct, whereas blind string manipulation approaches often produce invalid JSON or incorrect repairs
unquoted key and value normalization with automatic quote insertion
Medium confidence
Repairs JSON where keys or values lack quotation marks (e.g., {f:v} instead of {"f":"v"}) by detecting unquoted identifiers and automatically inserting quotes around them. This is handled as part of the SimpleRepairStrategy, which identifies tokens that should be strings but lack delimiters and wraps them in quotes during the repair pass.
Integrates quote insertion into the ANTLR-based repair pipeline, so unquoted keys/values are identified during parsing and fixed in context, rather than using post-hoc regex replacement which can miss edge cases
More accurate than regex-based quote insertion because it understands JSON structure and nesting, avoiding false positives in edge cases like unquoted values in nested objects
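As a rough, dependency-free illustration of quoting bare tokens, the sketch below uses a hand-written scanner instead of the library's ANTLR grammar. The class `QuoteBareTokensSketch` and its methods are hypothetical, and it assumes a bare token contains no whitespace.

```java
// Hypothetical sketch of quote insertion for bare tokens; not the library's code.
final class QuoteBareTokensSketch {

    /** Wraps bare tokens in double quotes, leaving strings, numbers, true/false/null alone. */
    static String quoteBareTokens(String json) {
        StringBuilder out = new StringBuilder(json.length() + 16);
        int i = 0;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '"') {
                // Copy an existing string literal verbatim, honoring escapes.
                int end = i + 1;
                while (end < json.length() && json.charAt(end) != '"') {
                    if (json.charAt(end) == '\\') {
                        end++;
                    }
                    end++;
                }
                end = Math.min(end + 1, json.length());
                out.append(json, i, end);
                i = end;
            } else if (isStructural(c) || Character.isWhitespace(c)) {
                out.append(c);
                i++;
            } else {
                // Read a bare token up to the next structural character, quote, or whitespace.
                int end = i;
                while (end < json.length()
                        && !isStructural(json.charAt(end))
                        && !Character.isWhitespace(json.charAt(end))
                        && json.charAt(end) != '"') {
                    end++;
                }
                String token = json.substring(i, end);
                out.append(isLiteral(token) ? token : "\"" + token + "\"");
                i = end;
            }
        }
        return out.toString();
    }

    static boolean isStructural(char c) {
        return "{}[]:,".indexOf(c) >= 0;
    }

    static boolean isLiteral(String token) {
        return token.equals("true") || token.equals("false") || token.equals("null")
                || token.matches("-?\\d+(\\.\\d+)?([eE][+-]?\\d+)?");
    }
}
```

A bare value with spaces (for example `hello world`) would be split into two quoted tokens here, which is one reason a grammar-based approach can do better than a character scan.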
redundant comma removal and array/object cleanup
Medium confidence
Removes redundant or trailing commas in JSON arrays and objects (e.g., [1,2,] becomes [1,2]) as part of the SimpleRepairStrategy. The repair logic detects comma tokens that appear before closing brackets or braces and removes them, producing valid JSON that conforms to the JSON specification, which disallows trailing commas.
Integrates comma removal into the ANTLR-based repair pipeline with token-level awareness, so commas are removed only when they appear before closing delimiters, avoiding false positives in string values or nested structures
More precise than regex-based comma removal because it understands JSON token boundaries and nesting, avoiding accidental removal of commas in string values or nested arrays
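The difference from a plain regex can be shown with a small string-aware scan. This sketch is an assumption-laden stand-in for the token-level logic described above: `TrailingCommaSketch` and `removeTrailingCommas` are hypothetical names, and only a single comma before each closing delimiter is removed.

```java
// Hypothetical sketch of string-aware trailing comma removal; not the library's code.
final class TrailingCommaSketch {

    /** Drops a comma that directly precedes '}' or ']' (ignoring whitespace), outside strings. */
    static String removeTrailingCommas(String json) {
        StringBuilder out = new StringBuilder(json.length());
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c);
                if (c == '\\' && i + 1 < json.length()) {
                    out.append(json.charAt(++i)); // copy the escaped character as-is
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            }
            if (c == '}' || c == ']') {
                // Look back past whitespace; a comma here can only be a structural comma,
                // because any string in the output so far ends with a closing quote.
                int j = out.length() - 1;
                while (j >= 0 && Character.isWhitespace(out.charAt(j))) {
                    j--;
                }
                if (j >= 0 && out.charAt(j) == ',') {
                    out.deleteCharAt(j);
                }
            }
            out.append(c);
        }
        return out.toString();
    }
}
```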
missing outer brace/bracket completion for partial json structures
Medium confidence
Automatically adds missing outermost braces or brackets to convert partial JSON fragments into valid JSON objects or arrays. For example, converts [1,2,3 to [1,2,3] or {"key":"value" to {"key":"value"}. This is implemented in SimpleRepairStrategy by detecting unclosed top-level delimiters and inserting the corresponding closing delimiter at the end of the input.
Detects unclosed top-level delimiters via ANTLR parsing and adds the corresponding closing delimiter, rather than using heuristic string matching; this ensures the added delimiter is correct for the structure type
More reliable than simple string-based approaches (e.g., appending '}' if input starts with '{') because it understands nesting depth and can correctly close nested structures
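Why nesting depth matters can be seen in a stack-based sketch: each opener pushes the closer it needs, and whatever is left on the stack at the end is appended in reverse order of opening. This is a dependency-free illustration, not the ANTLR-based implementation; `DelimiterCloserSketch` and `closeOpenDelimiters` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of stack-based delimiter completion; not the library's code.
final class DelimiterCloserSketch {

    /** Appends the closing delimiters (and a closing quote, if needed) that the input lacks. */
    static String closeOpenDelimiters(String json) {
        Deque<Character> pendingClosers = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                pendingClosers.push('}');
            } else if (c == '[') {
                pendingClosers.push(']');
            } else if ((c == '}' || c == ']')
                    && !pendingClosers.isEmpty() && pendingClosers.peek() == c) {
                pendingClosers.pop();
            }
        }
        StringBuilder out = new StringBuilder(json);
        if (inString) {
            out.append('"'); // the input ended inside a string literal
        }
        while (!pendingClosers.isEmpty()) {
            out.append(pendingClosers.pop()); // innermost container closes first
        }
        return out.toString();
    }
}
```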
null value insertion for missing object properties and array elements
Medium confidence
Fills missing values in JSON objects and arrays with null when a key is present but has no value, or when an array element is missing. For example, converts {"key":} to {"key":null} or [1,,3] to [1,null,3]. This is part of SimpleRepairStrategy and ensures that all keys have values and all array positions are filled, producing valid JSON that can be parsed without type errors.
Integrates null insertion into the ANTLR-based repair pipeline with awareness of JSON structure (objects vs. arrays), so null is inserted in the correct context rather than blindly replacing missing values
More context-aware than simple string replacement because it understands whether a missing value is in an object property or array element, and inserts null in the correct syntactic position
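The object-versus-array distinction can be illustrated with a scanner that keeps a stack of open containers. The sketch below is hypothetical (`NullFillSketch` and `fillMissingValues` are not library names) and covers only two cases: a colon with no value after it, and an empty slot between array commas.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of context-aware null insertion; not the library's code.
final class NullFillSketch {

    /** Inserts null after a dangling ':' and into empty array slots such as [1,,3]. */
    static String fillMissingValues(String json) {
        StringBuilder out = new StringBuilder(json.length() + 8);
        Deque<Character> containers = new ArrayDeque<>(); // '{' or '[' per open container
        boolean inString = false;
        char prev = 0; // last non-whitespace character seen outside a string
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c);
                if (c == '\\' && i + 1 < json.length()) {
                    out.append(json.charAt(++i)); // copy the escaped character as-is
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (Character.isWhitespace(c)) {
                out.append(c);
                continue;
            }
            boolean inArray = !containers.isEmpty() && containers.peek() == '[';
            boolean valueMissingAfterColon = prev == ':' && (c == ',' || c == '}');
            boolean emptyArraySlot = inArray && c == ',' && (prev == ',' || prev == '[');
            if (valueMissingAfterColon || emptyArraySlot) {
                out.append("null");
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                containers.push(c);
            } else if ((c == '}' || c == ']') && !containers.isEmpty()) {
                containers.pop();
            }
            out.append(c);
            prev = c;
        }
        return out.toString();
    }
}
```

A comma directly before a closing bracket is deliberately left alone here, since that case reads as a trailing comma to remove, not a missing element.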
jmh-based performance benchmarking and latency profiling
Medium confidence
Includes built-in JMH (Java Microbenchmark Harness) benchmarks via BenchmarkTests class that measure repair latency for different JSON complexity levels (simple objects, arrays with missing brackets, nested structures, unquoted keys). Benchmarks are executed as part of the test suite and generate detailed performance reports (benchmark_0.2.2.json) showing nanosecond-level timing for each repair operation, enabling developers to understand repair overhead and optimize for their use case.
Includes JMH-based benchmarks as part of the library itself, providing reproducible performance measurements for different JSON repair scenarios; most JSON repair tools do not include built-in benchmarking infrastructure
Enables developers to measure repair latency directly in their environment using industry-standard JMH framework, rather than relying on external benchmarking tools or documentation
maven-based build and dependency management with antlr code generation
Medium confidence
Uses Maven as the build system with the ANTLR code generation plugin (antlr4-maven-plugin), which automatically generates parser and lexer classes from ANTLR grammar files during the build. The pom.xml configuration manages dependencies (ANTLR runtime, JSON libraries, testing frameworks) and build phases, enabling reproducible builds and easy integration into Java projects via the Maven Central repository.
Integrates ANTLR code generation into the Maven build lifecycle via antlr4-maven-plugin, ensuring parser code is always in sync with grammar files; most JSON repair tools either use pre-generated parsers or require manual code generation
Provides seamless integration into Maven-based Java projects with automatic dependency resolution and ANTLR code generation, reducing setup friction compared to tools that require manual parser generation or custom build steps
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with json-repair, ranked by overlap. Discovered automatically through the match graph.
partial-json
Parse partial JSON generated by LLM
Mistral Large (123B)
Mistral Large — powerful reasoning and instruction-following
OpenAI: GPT-4 Turbo Preview
The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while...
Mistral: Mistral Medium 3.1
Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances...
Qwen: Qwen3 235B A22B Instruct 2507
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...
Anthropic: Claude 3.5 Haiku
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic...
Best For
- ✓ Java backend developers integrating LLM outputs into production systems
- ✓ Teams building LLM-powered APIs that need post-processing guardrails
- ✓ Developers working with streaming JSON responses that may be truncated
- ✓ Developers parsing LLM chat completions that mix natural language with JSON
- ✓ Teams building chatbot backends that need to extract structured data from conversational responses
- ✓ Builders of multi-step LLM pipelines where intermediate outputs are JSON embedded in text
- ✓ Teams evaluating json-repair for production use
- ✓ Developers extending repair strategies and needing regression test coverage
Known Limitations
- ⚠ ANTLR parsing adds roughly 0.04-0.09 ms (40-90 µs) per repair operation; this overhead can matter on hot paths with microsecond-scale latency budgets
- ⚠ Repair strategies are heuristic-based and may produce semantically incorrect JSON if the original intent is ambiguous
- ⚠ No support for repairing deeply nested structures with multiple simultaneous errors; repairs are applied iteratively with diminishing returns
- ⚠ Limited to JSON format only; cannot repair JSONL, NDJSON, or other streaming JSON variants
- ⚠ Extraction assumes JSON is contiguous; cannot handle interleaved JSON and text (e.g., 'key: {nested' followed by explanation then '}')
- ⚠ Extraction heuristic is delimiter-based and may incorrectly identify JSON-like structures in code blocks or examples
UnfragileRank
UnfragileRank is computed from adoption signals, documentation quality, ecosystem connectivity, match graph feedback, and freshness. No artifact can pay for a higher rank.
Repository Details
Last commit: Feb 14, 2026
About
Repair JSON! A Java library for fixing JSON anomalies generated by LLMs.