{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"xcodeeval","slug":"xcodeeval","name":"xCodeEval","type":"benchmark","url":"https://github.com/ntunlp/xCodeEval","page_url":"https://unfragile.ai/xcodeeval","categories":["testing-quality","rag-knowledge"],"tags":[],"pricing":{"model":"free","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"xcodeeval__cap_0","uri":"capability://data.processing.analysis.multilingual.code.generation.benchmarking.across.17.languages.with.execution.based.validation","name":"multilingual code generation benchmarking across 17 languages with execution-based validation","description":"Provides a standardized evaluation framework for code generation models that accepts generated code in 17 programming languages (C, C++, C#, Java, Kotlin, Go, Rust, Python, Ruby, PHP, JavaScript, Perl, Haskell, OCaml, Scala, D, Pascal) and validates correctness through actual execution against unit tests via the ExecEval Docker-based execution engine. Uses a centralized problem definition model with src_uid foreign keys linking generated code to shared problem descriptions and unittest_db.json, enabling consistent evaluation across language variants of the same problem.","intents":["Evaluate code generation models on multilingual benchmarks with execution-based pass@k metrics","Compare model performance across programming languages using standardized problem sets","Validate generated code correctness by running it against curated unit tests","Train multilingual code generation models on 25M examples with consistent evaluation methodology"],"best_for":["ML researchers evaluating multilingual code LLMs","Teams building cross-language code generation systems","Organizations benchmarking code model performance at scale"],"limitations":["ExecEval execution engine requires Docker — cannot evaluate without containerization","Evaluation latency depends on compilation and test execution time per language","Limited to 17 predefined languages; adding new languages requires compiler integration","Unit test coverage varies by problem; some edge cases may not be caught"],"requires":["Python 3.7+","Hugging Face datasets library (latest)","Docker (latest) for ExecEval execution engine","Git LFS 2.0+ for manual data downloads","16GB+ RAM for processing full dataset","~100GB disk space for complete dataset"],"input_types":["generated code (string, any of 17 languages)","problem_id (src_uid reference)","language identifier"],"output_types":["execution outcome (pass/fail/timeout/error)","pass@k metrics (pass@1, pass@10, etc.)","execution logs and error messages"],"categories":["data-processing-analysis","testing-quality"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_1","uri":"capability://data.processing.analysis.src.uid.based.cross.task.dataset.linking.and.problem.normalization","name":"src_uid-based cross-task dataset linking and problem normalization","description":"Implements a foreign key linking system where all task-specific datasets (program synthesis, code translation, APR, retrieval) reference shared problem definitions via src_uid identifiers. Problem descriptions and unit tests are stored once in centralized problem_descriptions.jsonl and unittest_db.json files, then linked by src_uid to avoid duplication. The Hugging Face datasets API automatically resolves these links during data loading, returning enriched DatasetDict objects with problem context pre-joined to task examples.","intents":["Load task datasets with automatically resolved problem descriptions and unit tests","Ensure consistency across all tasks by using single source of truth for problem definitions","Reduce dataset size and storage overhead through normalization","Navigate between different task views of the same underlying problem"],"best_for":["Researchers working across multiple task types on same problems","Teams building multi-task code understanding systems","Data engineers optimizing storage and consistency"],"limitations":["Manual src_uid linking required when using Git LFS download method (no automatic resolution)","Requires understanding of src_uid schema to perform custom joins","Changes to problem definitions require careful migration to maintain referential integrity"],"requires":["Hugging Face datasets library (for automatic linking) OR","Manual JSON parsing and join logic (for Git LFS method)","Understanding of src_uid field structure and problem_descriptions.jsonl schema"],"input_types":["task dataset (JSONL format)","problem_descriptions.jsonl","unittest_db.json"],"output_types":["DatasetDict with linked problem context","enriched task examples with problem description and test cases"],"categories":["data-processing-analysis","memory-knowledge"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_10","uri":"capability://data.processing.analysis.hugging.face.datasets.api.integration.with.automatic.src.uid.resolution","name":"hugging face datasets api integration with automatic src_uid resolution","description":"Provides a Python API for loading xCodeEval datasets from Hugging Face Hub (NTU-NLP-sg/xCodeEval) with automatic src_uid-based linking between task datasets and shared problem definitions. The datasets library handles data downloading, caching, and streaming, while the xCodeEval integration automatically joins task examples with problem_descriptions.jsonl and unittest_db.json using src_uid foreign keys. Returns DatasetDict objects with enriched examples ready for model training or evaluation.","intents":["Load xCodeEval datasets with minimal setup using Hugging Face API","Automatically resolve src_uid links to problem descriptions and unit tests","Stream large datasets without downloading entire files","Integrate xCodeEval into standard Hugging Face training pipelines"],"best_for":["ML researchers using Hugging Face ecosystem","Teams training models with standard HF training scripts","Organizations wanting minimal setup overhead"],"limitations":["Requires Python 3.7+ and Hugging Face datasets library","Initial download may be slow for full dataset (100GB+)","Streaming mode may have latency for random access patterns","Automatic linking only works with Hugging Face API; Git LFS requires manual joins"],"requires":["Python 3.7+","Hugging Face datasets library (latest)","Internet connection for downloading from Hugging Face Hub","Hugging Face account (optional, for private datasets)"],"input_types":["task name (string: 'program_synthesis', 'code_translation', etc.)","split (string: 'train', 'test', 'validation')"],"output_types":["DatasetDict with linked examples","enriched examples with problem_description and unit_tests fields","dataset metadata and statistics"],"categories":["data-processing-analysis","tool-use-integration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_11","uri":"capability://data.processing.analysis.git.lfs.manual.dataset.download.with.selective.file.access","name":"git lfs manual dataset download with selective file access","description":"Provides an alternative data access method using Git LFS for users who prefer direct file access or need selective dataset downloads. Supports cloning the repository with LFS disabled, then pulling specific task files or problem definitions on demand. Useful for custom processing pipelines or environments where Python/Hugging Face is not available, though requires manual src_uid linking to join task examples with problem definitions.","intents":["Download xCodeEval datasets without Python or Hugging Face dependencies","Selectively download specific task files or languages to reduce storage","Integrate xCodeEval into custom data processing pipelines","Access raw JSONL files for direct manipulation"],"best_for":["Teams with custom data processing pipelines","Organizations without Python/Hugging Face infrastructure","Users needing selective dataset downloads"],"limitations":["Manual src_uid linking required; no automatic join logic","Requires understanding of JSONL format and src_uid schema","No streaming support; must download entire files","Git LFS bandwidth may be limited depending on account","More complex setup than Hugging Face API"],"requires":["Git 2.0+","Git LFS 2.0+","Command-line access","~100GB disk space for full dataset (or less for selective download)"],"input_types":["task name (string)","language identifier (optional, for selective download)"],"output_types":["JSONL files (task examples)","problem_descriptions.jsonl","unittest_db.json"],"categories":["data-processing-analysis","tool-use-integration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_12","uri":"capability://automation.workflow.multi.task.evaluation.pipeline.with.three.phase.execution.model","name":"multi-task evaluation pipeline with three-phase execution model","description":"Implements a standardized three-phase evaluation pipeline (Phase 1: Generation, Phase 2: Execution, Phase 3: Metrics) that applies consistently across all 7 tasks (program synthesis, code translation, APR, tag classification, code compilation, NL-code retrieval, code-code retrieval). Phase 1 generates or retrieves code, Phase 2 executes it via ExecEval or computes retrieval metrics, and Phase 3 aggregates results into pass@k, MRR, NDCG, or other task-specific metrics. Enables direct comparison of model performance across tasks.","intents":["Evaluate models on multiple code understanding tasks with consistent methodology","Compare model performance across generation, translation, repair, and retrieval tasks","Aggregate results into unified metrics for multi-task benchmarking","Build evaluation pipelines that support all 7 xCodeEval tasks"],"best_for":["Teams evaluating multi-task code understanding models","Researchers studying generalization across code tasks","Organizations building comprehensive code model benchmarks"],"limitations":["Phase 2 execution latency depends on code complexity and language","Metrics are task-specific; direct comparison across tasks is not always meaningful","Some tasks (retrieval) use different metrics (MRR) than others (pass@k)","Evaluation time scales linearly with number of examples and samples per example"],"requires":["Generated or retrieved code (Phase 1 output)","ExecEval execution engine (for generation/translation/APR tasks)","Unit test definitions (unittest_db.json)","Task-specific metric computation logic","Docker for execution-based tasks"],"input_types":["task type (string: 'program_synthesis', 'code_translation', etc.)","generated/retrieved code (string)","problem ID (src_uid)","language identifier"],"output_types":["task-specific metrics (pass@k, MRR, NDCG, F1, etc.)","per-example results (pass/fail, rank, etc.)","aggregated statistics and confidence intervals"],"categories":["automation-workflow","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_2","uri":"capability://code.generation.editing.program.synthesis.task.generation.and.evaluation.with.pass.k.metrics","name":"program synthesis task generation and evaluation with pass@k metrics","description":"Evaluates code generation models on the program synthesis task by accepting natural language problem descriptions and generating code solutions in any of 17 languages. The evaluation pipeline (Phase 1: Generation, Phase 2: Execution, Phase 3: Metrics) runs generated code against unit tests via ExecEval, computing pass@k metrics (pass@1, pass@10, etc.) that measure the probability of finding a correct solution within k samples. Supports both single-solution and multi-sample evaluation modes for assessing model reliability.","intents":["Evaluate code generation models on natural language to code synthesis tasks","Measure model performance using pass@k metrics across multiple samples","Compare generation quality across programming languages on identical problems","Train models on 25M program synthesis examples with standardized evaluation"],"best_for":["ML researchers evaluating code LLM generation capabilities","Teams fine-tuning models on program synthesis tasks","Benchmarking studies comparing generation quality across models"],"limitations":["Pass@k metrics require multiple samples per problem, increasing evaluation time","Execution-based evaluation cannot detect subtle logic errors that pass unit tests","Problem difficulty varies significantly; aggregate metrics may mask performance on hard problems","Timeout limits on execution may cause false negatives for slow-running solutions"],"requires":["Generated code samples (string format)","Problem ID (src_uid) for test case lookup","Target programming language identifier","Docker for ExecEval execution","Configured timeout thresholds per language"],"input_types":["natural language problem description (string)","generated code (string, any of 17 languages)","k value for pass@k computation"],"output_types":["pass@k metric (float 0.0-1.0)","individual execution outcomes per sample","execution logs and error traces"],"categories":["code-generation-editing","testing-quality"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_3","uri":"capability://code.generation.editing.code.translation.task.evaluation.with.language.pair.validation","name":"code translation task evaluation with language-pair validation","description":"Evaluates code translation models by accepting source code in one language and generated translations in a target language, then validating functional equivalence through execution against shared unit tests. The translation evaluation pipeline compiles and executes both source and translated code against the same unittest_db.json test cases, comparing outputs to detect translation errors. Supports all 17 language pairs (though not all pairs may have training data) and uses language-specific compiler mappings to handle syntax differences.","intents":["Evaluate code translation models on cross-language translation tasks","Validate that translated code maintains functional equivalence with source","Measure translation quality across all 17 language pairs","Train models on code translation examples with execution-based validation"],"best_for":["Teams building code migration or modernization tools","Researchers studying cross-language code understanding","Organizations evaluating translation model quality"],"limitations":["Functional equivalence validation requires identical test outputs; semantic differences may be missed","Not all 17 language pairs have equal training data coverage","Language-specific idioms and performance characteristics may differ even with correct translations","Compilation errors in target language may indicate translation issues or language limitations"],"requires":["Source code (string, any of 17 languages)","Generated translated code (string, target language)","Problem ID (src_uid) for test case lookup","Language pair identifiers (source and target)","Docker with compilers for both languages","Language & Compiler Mappings configuration"],"input_types":["source code (string)","translated code (string)","source language identifier","target language identifier"],"output_types":["translation correctness (pass/fail)","execution outcome comparison (source vs translation)","compilation errors per language","test output diffs"],"categories":["code-generation-editing","testing-quality"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_4","uri":"capability://code.generation.editing.automatic.program.repair.apr.task.generation.and.evaluation","name":"automatic program repair (apr) task generation and evaluation","description":"Evaluates program repair models by providing buggy code snippets and expecting corrected versions that pass unit tests. The APR evaluation pipeline executes repaired code against unittest_db.json test cases, measuring whether the repair successfully fixes the bug without introducing new failures. Supports repairs across all 17 languages and uses the same execution-based validation as program synthesis, enabling direct comparison of repair quality.","intents":["Evaluate program repair models on buggy code correction tasks","Measure repair success by validating against unit tests","Train models on APR examples with execution-based feedback","Compare repair quality across programming languages"],"best_for":["Teams building code debugging or automated repair tools","Researchers studying program repair techniques","Organizations evaluating repair model performance"],"limitations":["Repair validation only checks unit test pass/fail; may miss partial fixes or performance regressions","Bug types and difficulty vary significantly across problems","Some bugs may have multiple valid repairs; metrics only measure one correct solution","Execution-based validation cannot detect repairs that pass tests but introduce subtle logic errors"],"requires":["Buggy code snippet (string, any of 17 languages)","Repaired code (string, same language)","Problem ID (src_uid) for test case lookup","Docker for ExecEval execution","Language compiler configuration"],"input_types":["buggy code (string)","repaired code (string)","language identifier"],"output_types":["repair success (pass/fail)","execution outcome (before/after repair)","test case results","compilation errors"],"categories":["code-generation-editing","testing-quality"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_5","uri":"capability://search.retrieval.natural.language.to.code.retrieval.with.semantic.matching","name":"natural language to code retrieval with semantic matching","description":"Provides a retrieval task where models must find the correct code implementation given a natural language problem description, using a corpus of 7,500 unique code solutions across 17 languages. The retrieval evaluation uses semantic matching against a retrieval corpus (stored separately from task datasets) to measure ranking quality via metrics like MRR (Mean Reciprocal Rank) or NDCG. Supports both single-language and cross-language retrieval scenarios.","intents":["Evaluate code retrieval models on NL-to-code matching tasks","Measure semantic understanding by ranking code solutions by relevance to descriptions","Train embedding models on NL-code pairs for semantic search","Build code search systems that understand natural language queries"],"best_for":["Teams building code search or code recommendation systems","Researchers studying semantic code understanding","Organizations evaluating code retrieval model quality"],"limitations":["Retrieval metrics (MRR, NDCG) measure ranking quality, not functional correctness","Single correct answer assumption may miss semantically equivalent solutions","Corpus size (7,500 problems) is small compared to real code repositories","Language-specific retrieval may perform differently due to uneven training data distribution"],"requires":["Natural language problem description (string)","Retrieval corpus (code solutions, indexed)","Embedding model or retrieval ranker","Retrieval corpus metadata (language, problem_id)"],"input_types":["natural language query (string)","code corpus (strings, any of 17 languages)"],"output_types":["ranked list of code solutions","retrieval metrics (MRR, NDCG, recall@k)","relevance scores per solution"],"categories":["search-retrieval","memory-knowledge"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_6","uri":"capability://search.retrieval.code.to.code.retrieval.with.structural.similarity.matching","name":"code-to-code retrieval with structural similarity matching","description":"Evaluates code retrieval models on finding semantically similar code implementations given a query code snippet, using structural and semantic matching against the retrieval corpus. Unlike NL-code retrieval, this task measures code-to-code similarity across language variants of the same problem or functionally equivalent solutions in different languages. Supports both same-language and cross-language code matching.","intents":["Evaluate code clone detection and similarity models","Find functionally equivalent code across programming languages","Measure code understanding by ranking similar implementations","Build code deduplication or code recommendation systems"],"best_for":["Teams building code clone detection systems","Researchers studying cross-language code similarity","Organizations evaluating code matching model quality"],"limitations":["Code similarity is subjective; multiple valid similar solutions may exist","Structural differences (variable names, formatting) may affect matching","Cross-language matching is harder due to language-specific idioms","Corpus size (7,500 problems) limits diversity of similarity patterns"],"requires":["Query code snippet (string, any of 17 languages)","Code retrieval corpus (indexed)","Code similarity metric or embedding model","Language identifiers for query and corpus"],"input_types":["query code (string)","code corpus (strings, any of 17 languages)"],"output_types":["ranked list of similar code snippets","retrieval metrics (MRR, NDCG, recall@k)","similarity scores per match"],"categories":["search-retrieval","memory-knowledge"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_7","uri":"capability://data.processing.analysis.tag.classification.for.code.understanding.and.categorization","name":"tag classification for code understanding and categorization","description":"Provides a code understanding task where models classify code snippets with semantic tags (e.g., algorithm type, data structure, complexity class). The tag classification dataset includes code examples with associated tags across all 17 languages, enabling evaluation of whether models understand code semantics beyond syntax. Uses standard multi-label classification metrics to measure tagging accuracy.","intents":["Evaluate code understanding models on semantic classification tasks","Measure whether models can identify algorithm types and patterns in code","Train models to understand code semantics and categorization","Build code analysis systems that automatically tag code with semantic labels"],"best_for":["Teams building code analysis or documentation systems","Researchers studying code semantic understanding","Organizations evaluating code classification model quality"],"limitations":["Tag definitions may be ambiguous; some code may fit multiple categories","Tag distribution may be imbalanced across problem types","Language-specific idioms may affect tag applicability","Multi-label classification metrics may not capture partial correctness"],"requires":["Code snippet (string, any of 17 languages)","Tag vocabulary (predefined set of semantic labels)","Classification model or embedding-based ranker"],"input_types":["code snippet (string)","tag vocabulary (list of strings)"],"output_types":["predicted tags (list of strings)","tag confidence scores","classification metrics (precision, recall, F1)"],"categories":["data-processing-analysis","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_8","uri":"capability://code.generation.editing.code.compilation.and.syntax.validation.across.17.languages","name":"code compilation and syntax validation across 17 languages","description":"Provides a code compilation task that validates whether generated or translated code compiles successfully in its target language, using language-specific compiler mappings and configurations. The compilation evaluation is integrated into the ExecEval execution engine, which handles compiler invocation, error capture, and timeout management for each of the 17 supported languages. Returns detailed compilation errors and warnings for debugging.","intents":["Validate that generated code is syntactically correct before execution","Measure code generation quality by compilation success rate","Debug code generation failures by capturing compiler error messages","Train models to generate syntactically valid code in multiple languages"],"best_for":["Teams evaluating code generation model quality","Researchers studying syntax error patterns in generated code","Organizations building code generation systems"],"limitations":["Compilation success does not guarantee functional correctness","Compiler error messages vary significantly across languages","Some languages (Python, Ruby) have no compile step; validation is runtime-only","Timeout limits may cause false negatives for slow compilation"],"requires":["Generated code (string, any of 17 languages)","Language identifier","Docker with language-specific compilers installed","Language & Compiler Mappings configuration","Timeout thresholds per language"],"input_types":["code snippet (string)","language identifier"],"output_types":["compilation status (success/failure)","compiler error messages (string)","compilation time (milliseconds)","warnings (if applicable)"],"categories":["code-generation-editing","testing-quality"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__cap_9","uri":"capability://automation.workflow.execeval.docker.based.execution.engine.with.language.specific.isolation","name":"execeval docker-based execution engine with language-specific isolation","description":"Provides a containerized execution environment (ExecEval) that safely runs generated code in isolated Docker containers, with language-specific compiler and runtime configurations. The engine handles compilation, execution, timeout management, and output capture for all 17 languages, returning structured execution outcomes (pass/fail/timeout/error). Supports configurable timeout thresholds per language and prevents resource exhaustion through container limits.","intents":["Safely execute untrusted generated code without compromising host system","Run code in language-specific environments with proper compilers and runtimes","Capture execution outputs and errors for debugging","Enforce timeout limits to prevent infinite loops or resource exhaustion"],"best_for":["Teams evaluating code generation models at scale","Researchers running large-scale code benchmarks","Organizations building code execution platforms"],"limitations":["Docker requirement adds infrastructure overhead and setup complexity","Execution latency depends on container startup time and code runtime","Network access and file I/O are restricted in containers for security","Language-specific runtime differences may affect execution behavior","Timeout limits are fixed per language; some legitimate code may timeout"],"requires":["Docker (latest version)","ExecEval setup and configuration (see ExecEval Setup documentation)","Language-specific compiler/runtime Docker images","Configured timeout thresholds per language","Unit test definitions (unittest_db.json)"],"input_types":["code snippet (string, any of 17 languages)","unit tests (JSON format)","language identifier","timeout threshold (milliseconds)"],"output_types":["execution outcome (pass/fail/timeout/error)","execution logs (stdout/stderr)","test results (passed/failed test count)","execution time (milliseconds)"],"categories":["automation-workflow","tool-use-integration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"xcodeeval__headline","uri":"capability://testing.quality.multilingual.code.evaluation.benchmark","name":"multilingual code evaluation benchmark","description":"xCodeEval is a comprehensive multilingual benchmark for evaluating code intelligence across 17 programming languages, facilitating tasks like code generation, translation, and understanding, making it essential for developers assessing model performance in diverse coding environments.","intents":["best multilingual code evaluation benchmark","code evaluation framework for multiple programming languages","top tools for assessing code generation models","cross-lingual code intelligence assessment solutions","best benchmarks for code understanding tasks"],"best_for":["researchers in NLP","developers working with multiple programming languages"],"limitations":[],"requires":["Python 3.7+","Docker for execution"],"input_types":["code snippets","programming tasks"],"output_types":["evaluation metrics","execution results"],"categories":["testing-quality","rag-knowledge"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":64,"verified":false,"data_access_risk":"low","permissions":["Python 3.7+","Hugging Face datasets library (latest)","Docker (latest) for ExecEval execution engine","Git LFS 2.0+ for manual data downloads","16GB+ RAM for processing full dataset","~100GB disk space for complete dataset","Hugging Face datasets library (for automatic linking) OR","Manual JSON parsing and join logic (for Git LFS method)","Understanding of src_uid field structure and problem_descriptions.jsonl schema","Internet connection for downloading from Hugging Face Hub"],"failure_modes":["ExecEval execution engine requires Docker — cannot evaluate without containerization","Evaluation latency depends on compilation and test execution time per language","Limited to 17 predefined languages; adding new languages requires compiler integration","Unit test coverage varies by problem; some edge cases may not be caught","Manual src_uid linking required when using Git LFS download method (no automatic resolution)","Requires understanding of src_uid schema to perform custom joins","Changes to problem definitions require careful migration to maintain referential integrity","Requires Python 3.7+ and Hugging Face datasets library","Initial download may be slow for full dataset (100GB+)","Streaming mode may have latency for random access patterns","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.7,"quality":0.9,"ecosystem":0.49999999999999994,"match_graph":0.25,"freshness":0.52,"weights":{"adoption":0.25,"quality":0.35,"ecosystem":0.15,"match_graph":0.2,"freshness":0.05}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-06-17T09:51:05.297Z","last_scraped_at":null,"last_commit":null},"community":{"stars":null,"forks":null,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=xcodeeval","compare_url":"https://unfragile.ai/compare?artifact=xcodeeval"}},"signature":"q/gTrbXHb4463m4P07DeNmmmeTltoswEPsQ/I5+0XVyF2fb1gTh/mYUZvumFbAY7uL/aF6MbwN3rWlfmQYXqDw==","signedAt":"2026-06-21T08:43:18.056Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/xcodeeval","artifact":"https://unfragile.ai/xcodeeval","verify":"https://unfragile.ai/api/v1/verify?slug=xcodeeval","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}