Meta: Llama Guard 4 12B | Model | 23/100 | via “taxonomy-based unsafe content categorization”
Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...
Unique: Uses instruction fine-tuning on safety-labeled data to produce per-category scores in a single forward pass, rather than training a separate binary classifier for each category or relying on rule-based heuristics. Inherits Llama Guard 3's taxonomy design but extends it with visual understanding.
vs others: Provides granular per-category scores in one API call, enabling policy-based routing. Binary (safe/unsafe) classifiers, by contrast, need downstream logic to determine which violation type occurred, and rule-based systems are brittle to paraphrasing.
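To illustrate the policy-based routing mentioned above, here is a minimal Python sketch. It makes no model call. It assumes the classifier returns a short text verdict, either `safe` or `unsafe` followed by a line of comma-separated category codes such as `S1,S10`, which is the style used in Llama Guard model cards; check the actual output format of your deployment. The `ROUTES` table, the action names, and the helper functions `parse_verdict` and `route` are hypothetical and exist only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical routing table: category code -> action.
# The "S<number>" codes mimic Llama Guard-style taxonomies; the actions are
# invented for illustration and are not part of any Meta API.
ROUTES = {
    "S1": "block",
    "S10": "human_review",
}
DEFAULT_ACTION = "block"  # assumption: unknown categories fail closed


@dataclass
class Verdict:
    safe: bool
    categories: list[str] = field(default_factory=list)


def parse_verdict(text: str) -> Verdict:
    """Parse an assumed 'safe' / 'unsafe\\nS1,S10' style classifier output."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty classifier output")
    head = lines[0].lower()
    if head == "safe":
        return Verdict(safe=True)
    if head != "unsafe":
        raise ValueError(f"unexpected verdict: {lines[0]!r}")
    # The second line, when present, lists the violated category codes.
    cats = []
    if len(lines) > 1:
        cats = [c.strip() for c in lines[1].split(",") if c.strip()]
    return Verdict(safe=False, categories=cats)


def route(verdict: Verdict) -> list[str]:
    """Map a verdict to a sorted, de-duplicated list of actions."""
    if verdict.safe:
        return ["allow"]
    if not verdict.categories:
        return [DEFAULT_ACTION]
    return sorted({ROUTES.get(c, DEFAULT_ACTION) for c in verdict.categories})
```

For example, `route(parse_verdict("unsafe\nS1,S10"))` yields both a block and a human-review action from a single classification result, which is the step a binary safe/unsafe classifier would need extra downstream logic to reproduce.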