Efficient Inference With Mixed Precision Support

1

blip-image-captioning-large (Model, 51/100)

via “efficient inference via model quantization and mixed-precision execution”

image-to-text model. 869,610 downloads.

Unique: Integrates with bitsandbytes for int8 quantization without manual calibration. The model supports both PyTorch and TensorFlow backends (bitsandbytes itself targets PyTorch). Quantization is applied through the transformers API without modifying model code.

vs others: Easier to use than manual quantization with ONNX or TensorRT; scaling factors are derived from the tensors themselves, so no representative calibration dataset is needed.
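The idea of calibration-free int8 quantization can be illustrated with a minimal absmax sketch in plain Python. This is not the bitsandbytes implementation (its kernels are more elaborate and treat outlier values separately); the function names `quantize_absmax` and `dequantize` are hypothetical and exist only for this illustration. It shows why no representative dataset is needed: the scale comes from the values being quantized.

```python
def quantize_absmax(row):
    """Map floats to int8 range using the row's own largest magnitude as the scale."""
    absmax = max(abs(v) for v in row)
    if absmax == 0.0:
        return [0] * len(row), 1.0
    scale = absmax / 127.0
    # Round to the nearest integer step and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(v / scale))) for v in row]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]


weights = [0.02, -0.5, 0.31, 1.27]      # illustrative values, not real model weights
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The round-trip error of each value is bounded by half a quantization step (`scale / 2`), which is the trade-off accepted in exchange for 8-bit storage.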

2

imagen-pytorch (Framework, 51/100)

via “mixed precision training with automatic loss scaling”

Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch

Unique: Integrates Accelerate's mixed precision with automatic loss scaling, so precision casting and numerical stability are handled without manual configuration.

vs others: Less boilerplate than managing precision casts and gradient scaling by hand, while keeping the same numerical-stability safeguards.
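Why loss scaling matters can be shown without any GPU or framework. The sketch below simulates half-precision storage with the standard library's `struct` module; `to_fp16` and `scaled_step` are hypothetical helpers written for this illustration, not part of Accelerate or imagen-pytorch. The assumed policy (skip the step and halve the scale on overflow) is a simplified version of what dynamic loss scalers generally do.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (inf on overflow)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return math.copysign(math.inf, x)


def scaled_step(grad: float, scale: float):
    """Store grad * scale in fp16, then unscale in full precision.

    Returns (usable_grad_or_None, next_scale). On overflow the step is
    skipped and the scale is halved.
    """
    stored = to_fp16(grad * scale)
    if math.isinf(stored) or math.isnan(stored):
        return None, scale / 2.0
    return stored / scale, scale


tiny_grad = 1e-8
unscaled = to_fp16(tiny_grad)                        # too small for fp16: becomes 0.0
recovered, scale = scaled_step(tiny_grad, 2.0 ** 16)
skipped, backed_off = scaled_step(10.0, 2.0 ** 16)   # 655360 exceeds the fp16 maximum of 65504
```

Without scaling, the small gradient vanishes; with scaling, it survives the fp16 round trip and is recovered after unscaling. A gradient that is too large overflows instead, which is the signal to skip the update and lower the scale.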

3

oneformer_coco_swin_large (Model, 39/100)

via “efficient inference with mixed precision support”

image-segmentation model. 54,407 downloads.

Unique: Supports both FP16 and BF16 precision with automatic mixed precision (AMP) that selectively casts operations based on numerical stability requirements. The model architecture is designed to be numerically stable in lower precision, with careful attention to softmax and normalization operations.

vs others: Achieves 1.8-2.2× inference speedup with <1% accuracy loss using FP16 on NVIDIA GPUs, and avoids the post-training quantization and calibration steps that quantization-based approaches typically require.
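The attention to softmax mentioned above can be made concrete with a small simulation. The sketch below is not OneFormer code; it uses hypothetical helpers (`to_fp16`, `softmax_fp16_naive`, `softmax_fp16_stable`) and the standard library's half-precision packing to show how exponentials overflow in FP16 and how subtracting the maximum logit avoids it. In practice, mixed-precision runtimes often go further and run softmax in FP32.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (inf on overflow)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return math.copysign(math.inf, x)


def softmax_fp16_naive(logits):
    """Exponentiate directly; exp(12) already exceeds the fp16 maximum of 65504."""
    exps = [to_fp16(math.exp(v)) for v in logits]
    total = to_fp16(sum(exps))
    return [e / total for e in exps]


def softmax_fp16_stable(logits):
    """Subtract the max logit first so every exponential lies in (0, 1]."""
    m = max(logits)
    exps = [to_fp16(math.exp(v - m)) for v in logits]
    total = to_fp16(sum(exps))
    return [e / total for e in exps]


logits = [12.0, 1.0, 0.0]               # illustrative attention scores
naive = softmax_fp16_naive(logits)      # overflow turns into inf / inf
stable = softmax_fp16_stable(logits)    # finite probabilities
```

The naive version produces a NaN for the largest logit because both numerator and denominator overflow to infinity, while the shifted version stays finite and sums to approximately one.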
