Capability
2 artifacts provide this capability.
via “multi-dimensional trustworthiness evaluation across 6 core dimensions”
8-dimension trustworthiness benchmark for LLMs.
Unique: Combines 6 orthogonal trustworthiness dimensions (not just safety or factuality) with 30+ datasets and mixed evaluation strategies (pattern matching, LLM-as-judge, deterministic metrics, external APIs). Supports both online and local model backends with unified configuration, enabling fair comparison across proprietary and open-source models in a single benchmark run.
vs others: More comprehensive than single-dimension benchmarks (e.g., TruthfulQA for truthfulness only) and more accessible than custom evaluation pipelines because it bundles datasets, evaluators, and reporting in one framework.
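As a rough illustration of the "mixed evaluation strategies" idea, the sketch below scores a model per dimension while letting each task choose its own evaluator. Everything in it is an assumption made for illustration: the `Task` class, the `run_benchmark` function, the two evaluators, and the dimension names are hypothetical and are not this benchmark's actual API. The model is treated as a plain callable, which is one way an online or local backend could sit behind the same interface.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical evaluator type: (model response, reference) -> score in [0, 1].
Evaluator = Callable[[str, str], float]


def pattern_match(response: str, reference: str) -> float:
    """Pattern matching: 1.0 if the reference appears as a whole word or phrase."""
    pattern = rf"\b{re.escape(reference)}\b"
    return 1.0 if re.search(pattern, response, re.IGNORECASE) else 0.0


def exact_match(response: str, reference: str) -> float:
    """Deterministic metric: 1.0 if the normalized strings are identical."""
    return 1.0 if response.strip().lower() == reference.strip().lower() else 0.0


@dataclass
class Task:
    dimension: str                    # illustrative name, e.g. "truthfulness"
    evaluator: Evaluator              # strategy used to score this task
    examples: List[Tuple[str, str]]   # (prompt, reference) pairs


def run_benchmark(model: Callable[[str], str], tasks: List[Task]) -> Dict[str, float]:
    """Return the mean score per dimension across all examples of all tasks."""
    totals: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for task in tasks:
        for prompt, reference in task.examples:
            score = task.evaluator(model(prompt), reference)
            totals[task.dimension] = totals.get(task.dimension, 0.0) + score
            counts[task.dimension] = counts.get(task.dimension, 0) + 1
    return {dim: totals[dim] / counts[dim] for dim in totals}
```

An LLM-as-judge or external-API evaluator would fit the same `Evaluator` signature, which is why the strategies can be mixed within a single run.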
via “dimension-based trust evaluation”
Reputation scoring for AI agent wallets on Base L2. Check trust scores (0-100) across 5 dimensions before transacting with autonomous agents. Free tier available.
Unique: Combines qualitative and quantitative metrics across 5 dimensions to produce a 0-100 trust score for each agent wallet, rather than relying on a single signal.
vs others: Gives more detail than alternatives built around a single trust metric, so there is more context for deciding whether to transact.
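A minimal sketch of how a caller might gate a transaction on such scores, assuming the per-dimension scores have already been fetched. The listing does not name the five dimensions or say how they are aggregated, so the placeholder dimension keys, the unweighted mean, and both thresholds below are assumptions, and `overall_score` and `should_transact` are hypothetical helpers rather than part of the service.

```python
from typing import Dict


def overall_score(dimension_scores: Dict[str, float]) -> float:
    """Unweighted mean of per-dimension scores (assumed aggregation), each on a 0-100 scale."""
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    for name, score in dimension_scores.items():
        if not 0 <= score <= 100:
            raise ValueError(f"score for {name!r} is outside 0-100: {score}")
    return sum(dimension_scores.values()) / len(dimension_scores)


def should_transact(
    dimension_scores: Dict[str, float],
    min_overall: float = 70.0,    # assumed threshold, tune to your risk tolerance
    min_dimension: float = 40.0,  # assumed floor so one weak dimension can veto
) -> bool:
    """Allow the transaction only if the mean and every single dimension clear their thresholds."""
    overall = overall_score(dimension_scores)
    weakest = min(dimension_scores.values())
    return overall >= min_overall and weakest >= min_dimension


# Placeholder dimension names; the real ones are not given in the listing.
scores = {"dim_1": 90, "dim_2": 80, "dim_3": 85, "dim_4": 75, "dim_5": 70}
allowed = should_transact(scores)
```

The per-dimension floor is what distinguishes this from a single-metric check: a wallet with a high average can still be rejected if one dimension is very low.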
© 2026 Unfragile. The platform for software for agents.