Capability

Reinforcement Learning from Human Feedback (RLHF) Training

20 artifacts provide this capability.

Top Matches

via “instruction-following and rlhf-aligned response generation”

Text-generation model. 3,681,247 downloads.

Unique: RLHF training on a 120B-parameter model provides instruction-following quality comparable to GPT-3.5 while remaining fully open-source. The alignment training includes explicit refusal behavior for harmful requests, so no external content filter is required.

vs others: Better instruction-following than base Llama 2 70B; comparable to the Mistral 7B instruction-tuned model but at a significantly larger scale, which enables more complex reasoning and longer-context handling.
