Capability
Multi-Model Inference With Jamba Family Variants
3 artifacts provide this capability.
via “parameter-efficient inference with mixture-of-experts-style sparsity”
AI21's family of hybrid Mamba-Transformer models with a 256K-token context window.
Unique: Sparse, MoE-style activation within a hybrid Mamba-Transformer design keeps only 12B-94B parameters active out of 52B-398B total across the family variants (roughly a quarter of the weights per token), reducing inference cost versus dense models while maintaining quality.
vs others: Achieves inference efficiency comparable to quantized or pruned models while keeping weights at full precision, and activates fewer parameters per token than dense alternatives of similar quality.
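
To make the sparse-activation claim concrete, here is a minimal sketch of generic top-k mixture-of-experts routing in PyTorch: each token's router selects k of E expert feed-forward networks, so only those experts' weights participate in that token's forward pass. This is an illustrative toy under stated assumptions (names like `MoELayer`, `n_experts`, and `top_k` are invented here), not Jamba's actual implementation, which interleaves MoE feed-forward layers with Mamba and attention blocks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy sparse MoE layer: each token uses only top_k of n_experts FFNs."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). The router scores every token against every expert.
        logits = self.router(x)                         # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep only k experts per token
        weights = F.softmax(weights, dim=-1)            # renormalize over the chosen k

        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e                              # (tokens, top_k) hits for expert e
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            # Only routed tokens touch this expert's weights; the other experts
            # stay idle for them, which is where the inference savings come from.
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

layer = MoELayer(d_model=64, d_ff=256, n_experts=8, top_k=2)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64]); each token used 2 of 8 experts
```

With top_k=2 of 8 experts in this toy, roughly a quarter of the FFN parameters are touched per token, mirroring the active-to-total ratios quoted above (12B of 52B and 94B of 398B, about 23-24%).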