Capability
2 artifacts provide this capability.
Want a personalized recommendation?
Find the best match →via “professional domain-specific knowledge evaluation (medical, finance, law, administrative)”
ReLE评测:中文AI大模型能力评测(持续更新):目前已囊括374个大模型,覆盖chatgpt、gpt-5.4、谷歌gemini-3.1-pro、Claude-4.6、文心ERNIE-X1.1、ERNIE-5.0、qwen3.6-max、qwen3.6-plus、百川、讯飞星火、商汤senseChat等商用模型, 以及step3.5-flash、kimi-k2.6、ernie4.5、MiniMax-M2.7、deepseek-v4、Qwen3.6、llama4、智谱GLM-5.1、MiMo-V2、LongCat、gemma4、mistral等开源大模型。不仅提供排行榜,也提供规模超200万的大
Unique: Evaluates four professional domains (Medical, Finance, Law, Administrative) using domain-expert-designed test questions with realistic scenarios (medical case studies, financial analysis, legal document interpretation) rather than generic knowledge questions. Incorporates domain-specific scoring rubrics reflecting professional standards and best practices. Enables cross-domain comparison to identify models suitable for professional applications.
vs others: More specialized domain assessment than general benchmarks (MMLU, C-Eval) and realistic professional scenarios vs academic knowledge questions
via “domain-specific knowledge application and reasoning”
Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in...
Unique: Trained on domain-specific corpora and professional standards (financial regulations, medical literature, legal precedents), enabling reasoning that incorporates industry best practices without explicit fine-tuning
vs others: Outperforms general-purpose models on domain-specific tasks due to specialized training data, while maintaining flexibility across multiple domains unlike single-domain specialized models
Building an AI tool with “Professional Domain Specific Knowledge Evaluation Medical Finance Law Administrative”?
Submit your artifact →curl unfragile.ai/agents.md | sh© 2026 Unfragile. The platform for software for agents.