Capability
Multilingual Safety Classification With Machine-Translated Benchmarks
12 artifacts provide this capability.
Top Matches
via “multilingual safety classification with machine-translated benchmarks”
Meta's LLM safety classifier for content policy enforcement.
Unique: Llama Guard is evaluated against CyberSecEval's machine-translated multilingual benchmark datasets, providing structured coverage of safety risks across languages rather than relying on a single English-trained model applied to translated text.
vs others: More comprehensive than language-agnostic classifiers because it is explicitly tested on multilingual adversarial content. Performance gaps between languages remain, however, because of uneven translation quality and training data imbalance.
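As a rough illustration of what a per-language performance gap means in practice, the sketch below scores a classifier's predictions separately for each language and compares every language against an English baseline. It is a minimal sketch, not Llama Guard's or CyberSecEval's actual evaluation code: the function names, the `(language, gold, predicted)` record layout, the label strings, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def per_language_accuracy(records):
    """Return accuracy per language.

    `records` is an iterable of (language, gold_label, predicted_label)
    tuples. This layout is hypothetical, chosen only for the example.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, gold, pred in records:
        total[lang] += 1
        if gold == pred:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


def gap_to_reference(scores, reference="en"):
    """Return how far each language's score falls below the reference language."""
    base = scores[reference]
    return {lang: base - score for lang, score in scores.items() if lang != reference}


# Toy data (invented): four English prompts and their machine-translated
# French counterparts, with made-up classifier outputs.
records = [
    ("en", "unsafe", "unsafe"),
    ("en", "safe", "safe"),
    ("en", "unsafe", "unsafe"),
    ("en", "safe", "safe"),
    ("fr", "unsafe", "unsafe"),
    ("fr", "safe", "safe"),
    ("fr", "unsafe", "safe"),  # a miss on translated adversarial content
    ("fr", "safe", "safe"),
]

scores = per_language_accuracy(records)
gaps = gap_to_reference(scores)
print(scores)
print(gaps)
```

Accuracy is used here only because it is the simplest metric to read; a real safety evaluation would more likely report recall on unsafe content or a false-positive rate per language, since the two kinds of error carry different costs.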