Capability
MMLU Benchmark Performance at 77.8% Accuracy
7 artifacts provide this capability.
Mistral's mixture-of-experts model with 176B total parameters.
Unique: 77.8% MMLU accuracy achieved through a sparse MoE architecture with selective expert activation: a router sends each token to a small subset of experts, so knowledge-specialized experts fire for different subject domains. This gives broad knowledge coverage without engaging full model capacity for every question.
vs others: Competitive with other open-weight models on MMLU; below proprietary models (GPT-4, Claude 3) but above smaller open models (LLaMA 2 13B–34B). Sparse activation delivers this score at lower inference cost than a dense 70B model.
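The selective-activation mechanism described above can be sketched in a few lines. This is a minimal, hypothetical top-k routing layer, not Mixtral's actual implementation: the sizes, the `tanh` "experts", and the random weights are all illustrative assumptions. The point it shows is that only k of n experts run per token, so per-token compute scales with k rather than n.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, top_k = 16, 8, 2  # hypothetical sizes; real MoE layers are far larger

# Each "expert" here is a tiny nonlinear map; in a real MoE each is a full FFN block.
expert_weights = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_experts)]
experts = [lambda x, W=W: np.tanh(x @ W) for W in expert_weights]
gate_w = rng.standard_normal((d, n_experts)) / np.sqrt(d)  # router weights

def moe_layer(x):
    """Route a token vector x to its top-k experts and mix their outputs."""
    logits = x @ gate_w                   # router score per expert
    top = np.argsort(logits)[-top_k:]     # indices of the k highest-scoring experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                          # softmax over the selected experts only
    # Only k of n_experts execute, so compute is roughly k/n of the dense cost.
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

token = rng.standard_normal(d)
out = moe_layer(token)
print(out.shape)  # (16,)
```

Different tokens score differently against `gate_w`, so they activate different expert subsets; that per-token specialization is what lets a sparse model cover many subject domains without paying the full dense-forward cost.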