Capability

Quantization-Aware Adapter Training With Frozen Base Weights

20 artifacts provide this capability.

Top Matches

via “model quantization for memory and latency reduction”

text-generation model. 14,205,413 downloads.

Unique: Supports both post-training quantization (no retraining) via bitsandbytes and quantization-aware training (better accuracy, at the cost of a training run) via torch.quantization. Calibration datasets are selected automatically to minimize accuracy loss.
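
To illustrate the capability named in the page title, here is a minimal pure-Python sketch of adapter training on top of a frozen, quantized base weight matrix. It does not use bitsandbytes or torch.quantization and is not the listed model's implementation. The helper names (quantize_absmax, train_adapter), the 4-bit absmax scheme, the low-rank adapter shape, and the toy data are all assumptions chosen for illustration. The base weights are quantized once and never updated; only the adapter matrices A and B receive gradient updates, and because every forward pass runs through the quantized base, the adapter is trained with the quantization error in view.

```python
import random

def quantize_absmax(weights, bits=4):
    """Symmetric absmax quantization (assumed scheme): integer codes plus one scale."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    absmax = max(abs(w) for row in weights for w in row)
    scale = absmax / qmax if absmax > 0 else 1.0
    codes = [[max(-qmax, min(qmax, round(w / scale))) for w in row] for row in weights]
    return codes, scale

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def forward(codes, scale, A, B, x):
    """y = dequant(codes) @ x + B @ (A @ x); the base path is frozen."""
    base = [scale * y for y in matvec(codes, x)]
    h = matvec(A, x)
    delta = matvec(B, h)
    return [b + d for b, d in zip(base, delta)], h

def mean_loss(codes, scale, A, B, data):
    total = 0.0
    for x, t in data:
        y, _ = forward(codes, scale, A, B, x)
        total += 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(data)

def train_adapter(codes, scale, A, B, data, lr=0.2, steps=300):
    """Full-batch gradient descent on A and B only; codes and scale are read-only."""
    n_out, rank, n_in = len(B), len(A), len(A[0])
    for _ in range(steps):
        gA = [[0.0] * n_in for _ in range(rank)]
        gB = [[0.0] * rank for _ in range(n_out)]
        for x, t in data:
            y, h = forward(codes, scale, A, B, x)
            e = [yi - ti for yi, ti in zip(y, t)]   # dLoss/dy
            for i in range(n_out):
                for k in range(rank):
                    gB[i][k] += e[i] * h[k]
            for k in range(rank):
                back = sum(B[i][k] * e[i] for i in range(n_out))
                for j in range(n_in):
                    gA[k][j] += back * x[j]
        for k in range(rank):
            for j in range(n_in):
                A[k][j] -= lr * gA[k][j] / len(data)
        for i in range(n_out):
            for k in range(rank):
                B[i][k] -= lr * gB[i][k] / len(data)

random.seed(0)
n_in, n_out, rank = 4, 3, 2                          # toy sizes (assumption)
W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
codes, scale = quantize_absmax(W, bits=4)
frozen_copy = [row[:] for row in codes]              # to check the base never changes

# LoRA-style init: A random, B zero, so the adapter starts as a no-op.
A = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(n_out)]

# Targets come from the full-precision weights, so the adapter learns to
# compensate for the quantization error of the frozen base.
data = []
for _ in range(32):
    x = [random.uniform(-1, 1) for _ in range(n_in)]
    data.append((x, matvec(W, x)))

initial_loss = mean_loss(codes, scale, A, B, data)
train_adapter(codes, scale, A, B, data)
final_loss = mean_loss(codes, scale, A, B, data)
print(f"loss before: {initial_loss:.6f}, after: {final_loss:.6f}")
```

In a real setup the same idea would typically be expressed with a tensor library and an optimizer; the point here is only that gradients flow to the adapter while the quantized codes stay fixed.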

vs others: Faster and simpler than knowledge distillation, which requires training a smaller model, but less accurate than distillation under extreme compression. Best suited to 2-4x size reduction rather than 10x or more.
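
As a rough check on the 2-4x figure, the arithmetic below compares weight storage at different bit widths. The 16-bit baseline and the 7-billion-parameter count are assumptions, not values from the listing, and the estimate ignores per-block scale factors and any layers kept in higher precision.

```python
def weight_size_gb(n_params, bits_per_param):
    """Approximate weight storage in gigabytes (10^9 bytes), weights only."""
    return n_params * bits_per_param / 8 / 1e9

n_params = 7_000_000_000          # hypothetical model size (assumption)
baseline_bits = 16                # assumed fp16/bf16 baseline

sizes = {bits: weight_size_gb(n_params, bits) for bits in (16, 8, 4)}
ratios = {bits: baseline_bits / bits for bits in (8, 4)}

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {sizes[bits]:.1f} GB")
for bits in (8, 4):
    print(f"16-bit -> {bits}-bit: {ratios[bits]:.0f}x smaller")
```

Under these assumptions, 8-bit and 4-bit weights correspond to the 2x and 4x ends of the stated range; a 10x reduction would need fewer than 2 bits per weight on average, which is where a smaller distilled model becomes the more practical route.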
