Capability
Reinforcement Learning from Human Feedback (RLHF) Training
20 artifacts provide this capability.
Top Matches
via “instruction-following and RLHF-aligned response generation”
text-generation model. 3,681,247 downloads.
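As a rough illustration of how an instruction-tuned text-generation model like this is typically invoked (the model id below is a placeholder, since the listing does not name the artifact), a Hugging Face transformers call might look like:

```python
# Minimal sketch of invoking an instruction-tuned text-generation model.
# The model id is hypothetical; substitute the artifact's actual repository name.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="example-org/rlhf-instruct-120b",  # placeholder, not the real artifact
)

prompt = "Explain reinforcement learning from human feedback in two sentences."
result = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```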
Unique: RLHF training on a 120B-parameter model yields instruction-following quality comparable to GPT-3.5 while remaining fully open source. Alignment training includes explicit refusal behavior for harmful requests, without requiring external content filters.
vs others: Better instruction-following than base Llama 2 70B; comparable to the Mistral 7B instruct model but at significantly larger scale, enabling more complex reasoning and longer-context handling.
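For readers unfamiliar with the training recipe behind this capability: RLHF typically optimizes a KL-regularized reward, keeping the fine-tuned policy close to a frozen reference model while maximizing a learned reward signal. Below is a minimal sketch of the per-sequence objective under that standard formulation; the function name and toy values are illustrative, not taken from this artifact's training code.

```python
import torch

def rlhf_objective(reward, logprob_policy, logprob_ref, beta=0.1):
    """KL-regularized RLHF objective for one sampled response.

    reward:          scalar score from the reward model, r(x, y)
    logprob_policy:  sum of per-token log-probs under the policy pi
    logprob_ref:     sum of per-token log-probs under the frozen reference pi_ref
    beta:            strength of the KL penalty keeping pi near pi_ref
    """
    # Penalize drift from the reference model: r(x, y) - beta * (log pi - log pi_ref).
    kl_penalty = logprob_policy - logprob_ref
    return reward - beta * kl_penalty

# Toy values only; real training averages this over batches of sampled
# prompt/response pairs and maximizes it with a policy-gradient method such as PPO.
print(rlhf_objective(torch.tensor(1.8), torch.tensor(-42.0), torch.tensor(-40.5)))
```

The KL term is what produces the behavior the listing highlights: the policy can learn refusals and instruction-following from the reward model while the penalty prevents it from drifting into degenerate text far from the pretrained distribution.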