{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"vbench","slug":"vbench","name":"VBench","type":"benchmark","url":"https://vchitect.github.io/VBench-project","page_url":"https://unfragile.ai/vbench","categories":["testing-quality"],"tags":[],"pricing":{"model":"free","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"vbench__cap_0","uri":"capability://data.processing.analysis.multi.dimensional.video.generation.quality.scoring","name":"multi-dimensional video generation quality scoring","description":"Evaluates generated videos across 16 hierarchical dimensions (subject consistency, temporal flickering, motion smoothness, aesthetic quality, text-video alignment, and 11 others) using dimension-specific automatic objective evaluation pipelines. Each dimension employs tailored metrics designed to isolate and measure distinct aspects of video quality, with results aggregated into per-dimension scores and an overall quality assessment. The evaluation framework stratifies test cases across diverse prompt categories to ensure comprehensive coverage of video generation scenarios.","intents":["I need to objectively measure how well a text-to-video or image-to-video model generates videos across multiple quality dimensions","I want to compare video generation models using a standardized, multi-faceted evaluation framework rather than single-metric scoring","I need to identify which specific aspects of video quality (consistency, motion, aesthetics, alignment) my model excels or struggles with","I want to benchmark my video generation model against SOTA models using a comprehensive, research-backed evaluation suite"],"best_for":["video generation model developers evaluating text-to-video and image-to-video systems","research teams publishing video generation papers requiring standardized benchmarking","AI labs comparing multiple video generation architectures across quality dimensions","companies assessing video generation model performance before production deployment"],"limitations":["Specific evaluation metrics per dimension not fully documented in public materials — requires consulting full CVPR 2024 paper for implementation details","Exact test set size and composition unknown — documentation states 'diverse prompt categories' but specific category definitions and prompt counts not provided","No public leaderboard or submission mechanism documented — benchmark appears designed for research evaluation rather than continuous model ranking","Human alignment validation methodology unclear — claims results align with human perception but inter-rater agreement coefficients and sample sizes not disclosed","Computational cost and runtime requirements for full benchmark evaluation not specified"],"requires":["Generated video files from a text-to-video or image-to-video model","Original prompts (text or image+text pairs) used to generate videos","Access to VBench evaluation code from GitHub repository (language/framework specifics unknown)","Sufficient computational resources to run automatic evaluation pipelines (exact requirements unknown)"],"input_types":["video files (generated by text-to-video or image-to-video models)","text prompts (descriptions of desired video content)","image files (for image-to-video evaluation in VBench+ variant)"],"output_types":["per-dimension quality scores (16 dimensions)","per-category performance metrics (stratified by prompt category)","overall video generation quality score","structured evaluation report with dimension-specific breakdowns"],"categories":["data-processing-analysis","testing-quality","benchmark"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_1","uri":"capability://data.processing.analysis.subject.consistency.evaluation.across.video.frames","name":"subject consistency evaluation across video frames","description":"Measures whether the primary subject (person, object, character) maintains visual consistency and identity throughout the generated video without morphing, disappearing, or changing appearance. Uses automatic objective evaluation methods (likely CLIP-based embeddings or optical flow analysis, specifics unknown) to quantify frame-to-frame subject stability. Evaluates consistency across diverse prompt categories to ensure the metric generalizes across different subject types and video scenarios.","intents":["I need to measure whether my video generation model maintains consistent character/object identity across all frames","I want to identify if my model is generating videos where subjects morph, flicker, or change appearance unexpectedly","I need to quantify subject stability as a distinct quality dimension separate from overall video quality"],"best_for":["developers of character-driven video generation models","teams evaluating identity preservation in text-to-video systems","researchers studying temporal consistency in generative video"],"limitations":["Specific evaluation method (CLIP similarity, optical flow, face detection, etc.) not documented","Definition of 'subject' and how it handles multi-subject videos unclear","No information on how consistency is measured across videos of different lengths or frame rates"],"requires":["Generated video files with identifiable subjects","Original prompts describing the subject","VBench evaluation pipeline with subject consistency module"],"input_types":["video files (generated videos with subjects)"],"output_types":["subject consistency score (0-1 or 0-100 scale, specifics unknown)","per-frame consistency metrics (if available)"],"categories":["data-processing-analysis","image-visual"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_10","uri":"capability://data.processing.analysis.downloadable.benchmark.dataset.and.test.suite","name":"downloadable benchmark dataset and test suite","description":"Provides downloadable access to the VBench dataset including test prompts, evaluation test cases, and potentially reference videos or annotations. Enables researchers to run local evaluations, conduct custom analysis, and reproduce benchmark results. Dataset availability supports transparency and enables community contributions to benchmark development. Specific dataset composition, size, and format not documented in public materials.","intents":["I need to download the VBench test suite to evaluate my model locally","I want to analyze the benchmark dataset to understand evaluation methodology and test case design","I need to reproduce VBench results or conduct custom analysis on the benchmark data"],"best_for":["researchers conducting detailed benchmark analysis and reproduction","teams running local evaluations with custom infrastructure","developers contributing to benchmark development or proposing improvements"],"limitations":["Dataset size, format, and composition not documented","Download location and access requirements unknown","No information on licensing, usage restrictions, or attribution requirements","Specific contents of downloadable dataset (prompts only, reference videos, annotations, etc.) unclear"],"requires":["Storage capacity for benchmark dataset (size unknown)","Download access to dataset repository (location unknown)","Potentially Huggingface account or other authentication"],"input_types":["none (dataset is output)"],"output_types":["test prompts (text or image+text)","evaluation test cases (format unknown)","potentially reference videos or annotations (specifics unknown)"],"categories":["data-processing-analysis","memory-knowledge"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_11","uri":"capability://memory.knowledge.cvpr.2024.research.paper.with.detailed.methodology","name":"cvpr 2024 research paper with detailed methodology","description":"Provides comprehensive technical documentation of VBench evaluation methodology, dimension definitions, evaluation metrics, human annotation protocol, and experimental results through peer-reviewed CVPR 2024 Highlight paper. Paper serves as authoritative reference for benchmark design, validation methodology, and technical implementation details. Enables researchers to understand and reproduce benchmark methodology with full transparency.","intents":["I need to understand the detailed methodology behind VBench evaluation dimensions and metrics","I want to review the human annotation protocol and validation methodology for benchmark metrics","I need to cite and reference VBench in my research with full technical details"],"best_for":["researchers conducting detailed benchmark analysis or proposing improvements","teams implementing custom evaluation pipelines based on VBench methodology","academics publishing papers using VBench for video generation evaluation"],"limitations":["Paper access may require institutional subscription or preprint availability","Technical details in paper may not cover all implementation specifics (e.g., exact hyperparameters, code optimizations)","Paper publication date (CVPR 2024) means methodology may not reflect latest updates or improvements"],"requires":["Access to CVPR 2024 proceedings or preprint (arXiv, etc.)","Academic background to understand technical content"],"input_types":["none (paper is reference material)"],"output_types":["detailed methodology documentation","dimension definitions and evaluation metrics","human annotation protocol and validation results","experimental results and baseline model performance"],"categories":["memory-knowledge","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_12","uri":"capability://code.generation.editing.github.repository.with.evaluation.code.and.implementation","name":"github repository with evaluation code and implementation","description":"Provides open-source implementation of VBench evaluation pipeline through GitHub repository, enabling researchers to run local evaluations, understand implementation details, and contribute improvements. Repository contains evaluation code, dimension-specific metric implementations, and potentially test data. Open-source availability supports transparency, reproducibility, and community-driven benchmark development.","intents":["I need to run VBench evaluation locally on my video generation models","I want to understand the implementation details of specific evaluation dimensions","I need to contribute improvements or extensions to the VBench evaluation pipeline"],"best_for":["developers implementing local VBench evaluation","researchers extending or modifying benchmark methodology","teams integrating VBench into evaluation pipelines"],"limitations":["Implementation language and framework not specified in public materials","Code documentation and example usage not reviewed","Specific dependencies and version requirements unknown","Computational requirements for running full evaluation not documented"],"requires":["Git and GitHub access","Programming environment matching repository language/framework (unknown)","Dependencies specified in repository (unknown)","Computational resources for evaluation (requirements unknown)"],"input_types":["video files (for evaluation)","prompts and metadata (format unknown)"],"output_types":["evaluation scores and metrics","evaluation reports (format unknown)"],"categories":["code-generation-editing","tool-use-integration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_13","uri":"capability://memory.knowledge.institutional.research.collaboration.framework","name":"institutional research collaboration framework","description":"Represents collaborative research effort across multiple institutions (S-Lab at Nanyang Technological University, Shanghai Artificial Intelligence Laboratory, The Chinese University of Hong Kong, Nanjing University) combining expertise in video generation, evaluation methodology, and benchmark design. Institutional collaboration provides credibility, resources for comprehensive benchmark development, and potential for sustained maintenance and improvement. Enables access to diverse research perspectives and computational resources.","intents":["I need to understand the research institutions and expertise behind VBench development","I want to assess the credibility and sustainability of the benchmark based on institutional backing","I need to contact researchers for questions about benchmark methodology or collaboration"],"best_for":["researchers evaluating benchmark credibility and institutional support","teams considering long-term reliance on VBench for evaluation","academics interested in collaborating with VBench research team"],"limitations":["Specific roles and contributions of each institution not documented","Sustainability and maintenance commitment not explicitly stated","Potential institutional biases or conflicts of interest not addressed"],"requires":["none (informational)"],"input_types":["none"],"output_types":["institutional affiliation information","research team contact information (potentially)"],"categories":["memory-knowledge","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_2","uri":"capability://data.processing.analysis.temporal.flickering.detection.and.quantification","name":"temporal flickering detection and quantification","description":"Detects and quantifies unwanted temporal flickering, jitter, and frame-to-frame instability in generated videos using automatic objective evaluation methods. Measures the degree to which pixel values or object positions oscillate between frames in ways that violate temporal coherence. Stratified evaluation across prompt categories ensures the metric captures flickering across diverse video content types and motion patterns.","intents":["I need to measure temporal flickering artifacts in my generated videos as a distinct quality dimension","I want to identify whether my model produces jittery or unstable videos with frame-to-frame inconsistencies","I need to quantify flickering separately from motion smoothness to diagnose specific temporal artifacts"],"best_for":["video generation model developers optimizing for temporal stability","teams evaluating diffusion-based or autoregressive video models prone to flickering","researchers studying temporal coherence in generative video"],"limitations":["Specific flickering detection method not documented (optical flow variance, pixel-level variance, frequency analysis, etc.)","Threshold for what constitutes 'acceptable' flickering not specified","No information on how flickering is distinguished from intentional motion or camera shake"],"requires":["Generated video files","VBench evaluation pipeline with temporal flickering module"],"input_types":["video files (generated videos)"],"output_types":["temporal flickering score (0-1 or 0-100 scale, specifics unknown)","per-frame or per-region flickering metrics (if available)"],"categories":["data-processing-analysis","image-visual"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_3","uri":"capability://data.processing.analysis.motion.smoothness.and.optical.flow.quality.assessment","name":"motion smoothness and optical flow quality assessment","description":"Evaluates the smoothness and naturalness of motion in generated videos by analyzing optical flow patterns and motion trajectories across frames. Measures whether motion is fluid and physically plausible rather than jerky, unrealistic, or discontinuous. Uses automatic objective evaluation methods (likely optical flow computation and trajectory analysis, specifics unknown) stratified across prompt categories to ensure motion quality is assessed across diverse motion types and speeds.","intents":["I need to measure whether my video generation model produces smooth, natural motion rather than jerky or unrealistic movement","I want to quantify motion quality as a distinct dimension separate from subject consistency or temporal stability","I need to identify if my model struggles with specific motion types (fast motion, complex trajectories, etc.)"],"best_for":["video generation developers optimizing for motion naturalness","teams evaluating action-driven or dynamic scene generation","researchers studying motion realism in generative video"],"limitations":["Specific optical flow method and motion quality metrics not documented","Definition of 'smooth' motion and how it accounts for intentional motion variation unclear","No information on how static scenes (minimal motion) are evaluated"],"requires":["Generated video files with motion","VBench evaluation pipeline with motion smoothness module"],"input_types":["video files (generated videos with motion)"],"output_types":["motion smoothness score (0-1 or 0-100 scale, specifics unknown)","optical flow statistics or trajectory smoothness metrics (if available)"],"categories":["data-processing-analysis","image-visual"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_4","uri":"capability://data.processing.analysis.aesthetic.quality.and.visual.appeal.scoring","name":"aesthetic quality and visual appeal scoring","description":"Measures the aesthetic quality, visual appeal, and production-value of generated videos using automatic objective evaluation methods. Evaluates factors such as color grading, lighting, composition, and overall visual polish. Stratified across prompt categories to ensure aesthetic assessment captures quality across diverse visual styles and content types. Likely uses perceptual quality metrics (BRISQUE, NIQE, or similar) adapted for video, though specific methods unknown.","intents":["I need to measure the aesthetic quality and visual appeal of my generated videos","I want to quantify production-value and visual polish as a distinct quality dimension","I need to identify if my model generates visually appealing content across diverse visual styles"],"best_for":["video generation developers optimizing for visual quality and production value","teams generating content for creative or commercial applications","researchers studying aesthetic quality in generative video"],"limitations":["Specific aesthetic quality metrics not documented (perceptual quality measures, color analysis, composition metrics, etc.)","Definition of 'aesthetic quality' and how it accounts for stylistic variation unclear","Potential cultural or subjective bias in aesthetic evaluation not addressed"],"requires":["Generated video files","VBench evaluation pipeline with aesthetic quality module"],"input_types":["video files (generated videos)"],"output_types":["aesthetic quality score (0-1 or 0-100 scale, specifics unknown)","per-frame or per-region aesthetic metrics (if available)"],"categories":["data-processing-analysis","image-visual"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_5","uri":"capability://data.processing.analysis.text.video.semantic.alignment.evaluation","name":"text-video semantic alignment evaluation","description":"Measures how well generated videos semantically align with and accurately represent the text prompts that guided their generation. Evaluates whether the video content matches the prompt's intent, includes described objects/actions, and captures the semantic meaning of the text. Uses automatic objective evaluation methods (likely CLIP-based text-image/video similarity, specifics unknown) stratified across prompt categories to ensure alignment is assessed across diverse prompt types and content domains.","intents":["I need to measure how well my text-to-video model generates videos that match the input prompts","I want to quantify semantic alignment between prompts and generated videos as a distinct quality dimension","I need to identify if my model struggles with specific prompt types or semantic concepts"],"best_for":["text-to-video model developers optimizing for prompt adherence","teams evaluating instruction-following in video generation","researchers studying semantic alignment in generative video"],"limitations":["Specific text-video alignment metric not documented (CLIP similarity, semantic matching, object detection, etc.)","Definition of 'alignment' and how it handles ambiguous or multi-interpretation prompts unclear","No information on how alignment is measured for complex, multi-part prompts"],"requires":["Generated video files","Original text prompts used to generate videos","VBench evaluation pipeline with text-video alignment module"],"input_types":["video files (generated videos)","text prompts (original prompts used for generation)"],"output_types":["text-video alignment score (0-1 or 0-100 scale, specifics unknown)","per-concept or per-object alignment metrics (if available)"],"categories":["data-processing-analysis","text-generation-language"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_6","uri":"capability://data.processing.analysis.stratified.evaluation.across.diverse.prompt.categories","name":"stratified evaluation across diverse prompt categories","description":"Organizes benchmark evaluation across multiple diverse prompt categories (specific categories unknown) to ensure video generation quality is assessed across different content types, visual styles, and semantic domains. Each of the 16 evaluation dimensions is applied within each category, creating a matrix of dimension × category evaluations. This stratification enables identification of category-specific model strengths and weaknesses rather than relying on aggregate scores that may mask performance variation across content types.","intents":["I need to understand how my video generation model performs across different content types and prompt categories","I want to identify if my model excels in some categories but struggles in others","I need to ensure my model generalizes across diverse video generation scenarios, not just specific content types"],"best_for":["video generation developers evaluating model robustness across content diversity","teams assessing generalization of video generation models","researchers studying category-specific performance in generative video"],"limitations":["Specific prompt categories not documented in public materials — requires consulting full paper for category definitions","Number of prompts per category and total test set size unknown","No information on how categories were selected or whether they represent real-world usage distribution"],"requires":["Generated videos across diverse prompt categories","VBench evaluation pipeline with category stratification"],"input_types":["video files (generated videos from diverse prompt categories)"],"output_types":["per-category evaluation results (dimension scores for each category)","category-specific performance breakdown","aggregate scores across categories"],"categories":["data-processing-analysis","testing-quality"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_7","uri":"capability://data.processing.analysis.human.preference.annotation.and.alignment.validation","name":"human preference annotation and alignment validation","description":"Incorporates human preference annotation of generated videos across evaluation dimensions to validate that automatic evaluation metrics align with human perception and preferences. Annotators evaluate generated videos and provide preference judgments, which are then correlated with automatic metric scores to assess metric validity. Claims that VBench evaluation results are well-aligned with human perceptions, though specific validation methodology, inter-rater agreement, and sample sizes are not documented in public materials.","intents":["I need to verify that VBench's automatic metrics correlate with human preferences and perception","I want to ensure that optimizing for VBench scores actually improves human-perceived video quality","I need confidence that the benchmark captures what humans actually care about in video generation"],"best_for":["researchers validating benchmark metrics against human judgment","teams assessing whether automatic evaluation aligns with real-world user preferences","developers deciding whether to optimize for VBench scores"],"limitations":["Specific human annotation protocol not documented — sample size, inter-rater agreement, annotation guidelines unknown","Correlation coefficients between automatic metrics and human preferences not provided in public materials","No information on potential annotator bias or demographic representation in human evaluation","Validation methodology and statistical significance testing not disclosed"],"requires":["Generated video files for human annotation","Human annotators (number and selection criteria unknown)","Annotation interface and guidelines (specifics unknown)"],"input_types":["video files (generated videos for human evaluation)"],"output_types":["human preference judgments (specifics unknown)","correlation analysis between automatic metrics and human preferences (not publicly disclosed)"],"categories":["data-processing-analysis","safety-moderation"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_8","uri":"capability://data.processing.analysis.vbench.image.to.video.evaluation.with.adaptive.image.suite","name":"vbench+ image-to-video evaluation with adaptive image suite","description":"Extended variant (VBench+) that evaluates image-to-video generation models in addition to text-to-video systems. Introduces an 'adaptive Image Suite' of test cases specifically designed for image-to-video evaluation, enabling assessment of how well models preserve image content while generating coherent video continuations. Applies the same 16-dimensional evaluation framework to image-to-video generation, stratified across prompt categories to ensure comprehensive coverage of image-to-video scenarios.","intents":["I need to benchmark my image-to-video generation model using the same comprehensive evaluation framework as text-to-video","I want to measure how well my model preserves image content while generating coherent video continuations","I need to evaluate image-to-video quality across the same 16 dimensions as text-to-video for comparative analysis"],"best_for":["image-to-video model developers evaluating generation quality","teams comparing text-to-video and image-to-video model performance","researchers studying image-to-video generation quality"],"limitations":["Specific design of 'adaptive Image Suite' not documented — how test images are selected or generated unknown","Differences in evaluation methodology between text-to-video and image-to-video dimensions not specified","No information on how image preservation is weighted relative to video quality dimensions"],"requires":["Generated video files from image-to-video model","Original image inputs and prompts","VBench+ evaluation pipeline (distinct from original VBench)"],"input_types":["video files (generated by image-to-video model)","image files (original image inputs)"],"output_types":["per-dimension quality scores for image-to-video generation","image preservation metrics (specifics unknown)","overall image-to-video quality assessment"],"categories":["data-processing-analysis","image-visual"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__cap_9","uri":"capability://tool.use.integration.huggingface.demo.interface.for.interactive.evaluation","name":"huggingface demo interface for interactive evaluation","description":"Provides a web-based interactive demo interface hosted on Huggingface Spaces, enabling users to upload or generate videos and receive VBench evaluation scores without local setup. The demo abstracts away implementation details and provides a user-friendly interface for accessing the benchmark evaluation pipeline. Enables non-technical users and researchers to evaluate videos without installing dependencies or running code locally.","intents":["I want to evaluate my generated videos using VBench without setting up the code locally","I need a quick way to test video generation quality without installing dependencies","I want to explore VBench evaluation results interactively without command-line usage"],"best_for":["researchers and developers wanting quick VBench evaluation without local setup","non-technical users exploring video generation quality assessment","teams prototyping video generation models and needing rapid evaluation feedback"],"limitations":["Specific demo features and interface design not documented","Computational limitations of web-based demo unknown — may have file size or processing time constraints","No information on whether demo provides full 16-dimensional evaluation or subset of metrics","Availability and uptime of Huggingface-hosted demo not guaranteed"],"requires":["Web browser with internet access","Video files to evaluate (format and size limits unknown)","Huggingface Spaces account (may be optional)"],"input_types":["video files (uploaded or generated within demo)"],"output_types":["VBench evaluation scores (format and completeness unknown)","Interactive visualization of results (specifics unknown)"],"categories":["tool-use-integration","automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"vbench__headline","uri":"capability://testing.quality.video.generation.quality.benchmark","name":"video generation quality benchmark","description":"VBench is a comprehensive benchmark for evaluating video generation quality across 16 critical dimensions, helping developers assess model performance in areas like subject consistency and aesthetic quality.","intents":["best video generation benchmark","video quality evaluation for AI models","how to benchmark video generation","top tools for evaluating video generation quality","video generation assessment framework"],"best_for":["AI researchers","developers in video generation"],"limitations":["does not specify scoring metrics","lacks detailed implementation guidance"],"requires":[],"input_types":[],"output_types":[],"categories":["testing-quality"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":62,"verified":false,"data_access_risk":"high","permissions":["Generated video files from a text-to-video or image-to-video model","Original prompts (text or image+text pairs) used to generate videos","Access to VBench evaluation code from GitHub repository (language/framework specifics unknown)","Sufficient computational resources to run automatic evaluation pipelines (exact requirements unknown)","Generated video files with identifiable subjects","Original prompts describing the subject","VBench evaluation pipeline with subject consistency module","Storage capacity for benchmark dataset (size unknown)","Download access to dataset repository (location unknown)","Potentially Huggingface account or other authentication"],"failure_modes":["Specific evaluation metrics per dimension not fully documented in public materials — requires consulting full CVPR 2024 paper for implementation details","Exact test set size and composition unknown — documentation states 'diverse prompt categories' but specific category definitions and prompt counts not provided","No public leaderboard or submission mechanism documented — benchmark appears designed for research evaluation rather than continuous model ranking","Human alignment validation methodology unclear — claims results align with human perception but inter-rater agreement coefficients and sample sizes not disclosed","Computational cost and runtime requirements for full benchmark evaluation not specified","Specific evaluation method (CLIP similarity, optical flow, face detection, etc.) not documented","Definition of 'subject' and how it handles multi-subject videos unclear","No information on how consistency is measured across videos of different lengths or frame rates","Dataset size, format, and composition not documented","Download location and access requirements unknown","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.7,"quality":0.9,"ecosystem":0.3,"match_graph":0.25,"freshness":0.75,"weights":{"adoption":0.25,"quality":0.35,"ecosystem":0.15,"match_graph":0.2,"freshness":0.05}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-05-24T12:16:34.118Z","last_scraped_at":null,"last_commit":null},"community":{"stars":null,"forks":null,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=vbench","compare_url":"https://unfragile.ai/compare?artifact=vbench"}},"signature":"4ecxG+2kJVSx6y0IrkHlWUZiis5fg1SkfJPMLVsGG+UuxOQxRwt8OtvD7X/ToBmY/yCN2IOgJj3Jt+ZfcFApBw==","signedAt":"2026-06-20T12:07:55.976Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/vbench","artifact":"https://unfragile.ai/vbench","verify":"https://unfragile.ai/api/v1/verify?slug=vbench","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}