Capability
Cost-Optimized Reasoning Inference at 32B Scale
20 artifacts provide this capability.
Top Matches
via “parameter-efficient reasoning through RL scaling”
Alibaba's 32B reasoning model with chain-of-thought.
Unique: Achieves reasoning performance comparable to 671B-parameter models through RL scaling on a strong foundation model with outcome-based verification, showing that parameter efficiency can come from the training approach rather than architectural compression.
vs others: Matches the reasoning capability of 671B+-parameter models with only 32B parameters, making deployment substantially cheaper in compute and memory than larger models.
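The outcome-based verification mentioned above can be illustrated with a minimal sketch: during RL training, each completion earns a binary reward based only on whether its final answer is verifiably correct, not on a learned reward model's score. The function name and the `\boxed{}` answer convention below are illustrative assumptions, not the model's documented training code.

```python
import re

def outcome_reward(completion: str, ground_truth: str) -> float:
    """Binary outcome-based reward (hypothetical sketch): 1.0 if the
    final boxed answer matches the reference, else 0.0. The intermediate
    chain-of-thought earns no partial credit."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable final answer: treat as incorrect
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

# An RL loop would reinforce whole completions using this verifiable
# signal rather than a scalar preference score.
```

Because the reward is programmatically checkable, it scales cheaply across many rollouts, which is one reason outcome-based RL can extract strong reasoning from a comparatively small base model.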