Capability
Cost-Optimized Reasoning Inference at 32B Scale
20 artifacts provide this capability.
Top Matches
via “parameter-efficient reasoning through RL scaling”
Alibaba's 32B reasoning model with chain-of-thought.
Unique: Achieves reasoning performance comparable to 671B-parameter models through RL scaling on a strong foundation model with outcome-based verification, showing that parameter efficiency can come from the training approach rather than architectural compression.
vs others: Matches the reasoning capability of 671B+-parameter models with only 32B parameters, making deployment substantially cheaper in compute and memory than larger models.
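The outcome-based verification mentioned above can be illustrated with a minimal sketch: during RL training, each completion earns a binary reward based only on whether its final answer is verifiably correct, not on a learned reward model's score. The function name and the `\boxed{}` answer convention below are illustrative assumptions, not the model's documented training code.

```python
import re

def outcome_reward(completion: str, ground_truth: str) -> float:
    """Binary outcome-based reward (hypothetical sketch): 1.0 if the
    final boxed answer matches the reference, else 0.0. The intermediate
    chain-of-thought earns no partial credit."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable final answer: treat as incorrect
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

# An RL loop would reinforce whole completions using this verifiable
# signal rather than a scalar preference score.
```

Because the reward is programmatically checkable, it scales cheaply across many rollouts, which is one reason outcome-based RL can extract strong reasoning from a comparatively small base model.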