Capability
Inference Optimization via Mixed-Precision Computation
4 artifacts provide this capability.
Matched via: “efficient inference via model quantization and mixed-precision execution”
Image-to-text model. 1,417,263 downloads.
Unique: Integrates with bitsandbytes for seamless int8 quantization without manual calibration; supports both PyTorch and TensorFlow backends. Quantization is applied transparently via the transformers API without modifying model code.
vs others: Easier to use than manual quantization with ONNX or TensorRT; automatic calibration eliminates the need for representative datasets.
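The calibration-free int8 quantization described above can be illustrated with a minimal numpy sketch of per-row absmax quantization, the vector-wise scheme that bitsandbytes' LLM.int8() builds on: each row's scale is derived from the weights themselves, so no representative calibration dataset is needed. The function names here are illustrative, not part of the bitsandbytes API.

```python
import numpy as np

def absmax_quantize(w: np.ndarray):
    """Quantize a float32 matrix to int8 with one absmax scale per row.

    Calibration-free: the scale comes from the weights alone, not from
    activation statistics collected on a calibration dataset.
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0  # per-row scale
    scale = np.where(scale == 0.0, 1.0, scale)            # guard all-zero rows
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximate float32 matrix from int8 values and scales."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)

q, scale = absmax_quantize(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())
print(q.dtype, f"max abs reconstruction error {max_err:.4f}")
```

Rounding error per element is bounded by half the row's quantization step (`scale / 2`), which is why absmax schemes work well on weight matrices whose rows have no extreme outliers.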