ONNX and TorchScript Export for Cross-Platform Deployment

1

ONNX Runtime Mobile (Framework) 60/100

via “model conversion and import from multiple frameworks”

Cross-platform ONNX inference for mobile devices.

Unique: Provides a unified ONNX target format across multiple training frameworks, so a single deployment pipeline works regardless of where the model was trained. This is more flexible than framework-specific deployment (e.g., TensorFlow Lite for TensorFlow, PyTorch Mobile for PyTorch) because ONNX is framework-agnostic.

vs others: More flexible than TensorFlow Lite because it supports PyTorch, scikit-learn, and other frameworks; more portable than PyTorch Mobile because ONNX models run on iOS, Android, and server platforms without modification.

2

Kokoro TTS (Repository) 59/100

via “model export and optimization for production deployment”

Lightweight 82M parameter open-source TTS with high-quality output.

Unique: Provides explicit export utilities rather than automatic ONNX export, giving developers control over export parameters and optimization settings; separates export from inference, enabling offline optimization workflows

vs others: More flexible than automatic export because developers can customize export parameters; avoids runtime overhead of on-demand export compared to systems that export during first inference

3

Detectron2 (Repository) 58/100

via “multi-format model export for deployment (torchscript, onnx, caffe2)”

Meta's modular object detection platform on PyTorch.

Unique: Supports three deployment formats (TorchScript, ONNX, Caffe2) with automatic input/output shape inference and format-specific optimizations, enabling deployment across heterogeneous inference platforms — unlike frameworks that support only a single export format

vs others: More flexible than TensorFlow's SavedModel because it supports multiple export targets; more production-ready than raw PyTorch models because exported models have no Detectron2 dependencies and can be optimized for specific inference hardware

4

Piper TTS (Repository) 58/100

via “onnx model export and optimization for edge deployment”

Fast local neural TTS optimized for Raspberry Pi and edge devices.

Unique: Implements ONNX export with built-in quantization and operator fusion specifically tuned for VITS architecture, enabling 50-70% model size reduction with minimal quality loss vs. generic ONNX converters

vs others: More optimized for TTS than generic ONNX export tools; supports quantization strategies specific to VITS; produces models 2-3x smaller than unoptimized exports while maintaining quality

5

xlm-roberta-base (Model) 55/100

via “onnx model export and optimized inference”

fill-mask model. 18,165,674 downloads.

Unique: Provides native ONNX export support via HuggingFace Transformers, enabling single-command conversion to hardware-agnostic format with built-in optimization profiles for CPU, GPU, and mobile inference — unlike manual ONNX conversion which requires deep knowledge of ONNX IR and operator semantics

vs others: Reduces deployment complexity and inference latency compared to PyTorch/TensorFlow serving by eliminating framework dependencies and enabling aggressive quantization/pruning, while maintaining model accuracy through ONNX Runtime's operator fusion and memory optimization

6

mobilenetv3_small_100.lamb_in1k (Model) 54/100

via “model-export-and-format-conversion”

image-classification model. 22,810,638 downloads.

Unique: timm models can be created in an export-friendly mode (timm.create_model(..., exportable=True)) and exported with the ONNX export script in the timm repository, which builds on the standard PyTorch export path. Graph simplification with onnx-simplifier (cited here as reducing model size by 10-20% and improving inference speed) and int8 quantization are separate post-export steps.

vs others: Automated export pipeline eliminates manual operator mapping and shape inference errors; supports more target formats (ONNX, TFLite, CoreML, NCNN, TorchScript) than single-framework converters, reducing conversion complexity.

7

bge-base-en-v1.5 (Model) 54/100

via “onnx-export-and-cpu-inference”

feature-extraction model. 8,155,394 downloads.

Unique: BGE-base-en-v1.5 provides official ONNX exports with a graph structure optimized for inference runtimes, enabling sub-100ms CPU inference on modern processors and deployment on edge devices without PyTorch or a GPU

vs others: Faster CPU inference than PyTorch eager execution and more portable than TorchScript for cross-platform deployment; enables embedding generation on edge devices where PyTorch is too heavy

8

all-MiniLM-L12-v2 (Model) 54/100

via “multi-format-model-export-and-deployment”

sentence-similarity model. 2,825,304 downloads.

Unique: Provides native export to four distinct inference formats with automatic tokenizer serialization and config preservation, enabling single-command deployment across CPU, GPU, mobile, and edge hardware without manual format conversion or architecture reimplementation; SafeTensors format ensures secure deserialization preventing arbitrary code execution

vs others: More deployment-flexible than OpenAI embeddings (API-only); simpler than custom ONNX conversion pipelines; safer than pickle-based PyTorch exports due to SafeTensors format
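
The SafeTensors safety claim rests on how pickle behaves: a pickle file can instruct the loader to call an importable function. A minimal standard-library sketch of that behavior follows; the `Payload` class is hypothetical and the evaluated expression is deliberately harmless.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle form asks the loader to call eval()."""

    def __reduce__(self):
        # pickle stores "call this callable with these args" and replays it on load.
        return (eval, ("1 + 1",))


blob = pickle.dumps(Payload())

# Loading the bytes runs eval("1 + 1"); an attacker could name any callable instead.
result = pickle.loads(blob)
print(result)  # 2
```

A SafeTensors file, by contrast, holds a JSON header followed by raw tensor bytes, so loading it parses data rather than invoking callables.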

9

ChatTTS (Agent) 53/100

via “onnx export for cross-platform deployment”

A generative speech model for daily dialogue.

Unique: Provides ONNX export capability for all major pipeline components (GPT, DVAE, Vocos), enabling end-to-end deployment without PyTorch. The export process includes optimization and quantization options, enabling deployment on resource-constrained devices.

vs others: More flexible than PyTorch-only deployment because ONNX enables use of alternative inference runtimes (ONNX Runtime, TensorRT, CoreML). More portable than TorchScript because ONNX is a standard format with broad ecosystem support.

10

table-transformer-detection (Model) 53/100

via “onnx model export for edge deployment and inference optimization”

object-detection model. 3,394,499 downloads.

Unique: Provides transformer-aware ONNX export that preserves attention mechanism semantics while enabling quantization-friendly operator fusion. The export pipeline includes automatic calibration for INT8 quantization using representative document images, reducing manual tuning overhead.

vs others: More portable than TensorFlow Lite or CoreML because ONNX Runtime runs on Windows, Linux, macOS, iOS, and Android with identical inference results; achieves better accuracy-latency tradeoffs than naive INT8 quantization due to transformer-specific calibration strategies.

11

multi-qa-mpnet-base-dot-v1 (Model) 53/100

via “onnx-and-openvino-export-for-edge-deployment”

sentence-similarity model. 2,530,482 downloads.

Unique: Provides native ONNX and OpenVINO export support with quantization-friendly architecture (no custom ops). Enables deployment on edge devices and CPU-only infrastructure with minimal code changes, supporting both float32 and int8 quantized inference.

vs others: Faster edge deployment than PyTorch models because ONNX Runtime and OpenVINO use optimized inference engines with hardware-specific optimizations, and quantization support reduces model size by 4x and latency by 2-3x compared to full-precision models.
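
The 4x size figure follows from bytes per weight: float32 stores each weight in 4 bytes and int8 in 1. A rough sketch of that arithmetic, ignoring file metadata, quantization scales, and any layers left unquantized; the parameter count is an illustrative assumption, not a measurement of this model.

```python
BYTES_PER_WEIGHT = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_mb(num_params: int, dtype: str) -> float:
    """Approximate weight storage in megabytes (10**6 bytes), ignoring metadata."""
    return num_params * BYTES_PER_WEIGHT[dtype] / 1e6


num_params = 100_000_000  # illustrative parameter count, not a measured value
fp32_mb = weight_size_mb(num_params, "float32")
int8_mb = weight_size_mb(num_params, "int8")
print(fp32_mb, int8_mb, fp32_mb / int8_mb)  # 400.0 100.0 4.0
```

Latency gains do not follow from this arithmetic alone; they depend on whether the target hardware has fast int8 kernels.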

12

distil-large-v3 (Model) 51/100

via “onnx-export-and-cross-platform-inference”

automatic-speech-recognition model. 1,305,832 downloads.

Unique: Leverages ONNX's standardized opset to enable deployment across Windows, Linux, macOS, iOS, Android, web browsers, and embedded systems from a single model export. ONNX Runtime's execution providers (for example CPU, CUDA, CoreML, NNAPI) let the same model use the hardware acceleration available on each target; the provider list is set when the session is created, and the model file itself does not change

vs others: Enables true cross-platform deployment with a single model file, unlike PyTorch Mobile (iOS/Android only) or TensorFlow Lite (mobile-focused); ONNX Runtime's graph optimizations often match or exceed framework-native inference speed while providing broader platform coverage
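
ONNX Runtime takes a priority-ordered list of execution providers when a session is created. The sketch below shows one way to build that list from what the installed build offers; `pick_providers`, `open_session`, and the preference order are assumptions for illustration, while the provider strings are ONNX Runtime's own identifiers.

```python
# Preference order is an assumption for illustration; tune it per target device.
PREFERRED = [
    "CUDAExecutionProvider",
    "CoreMLExecutionProvider",
    "NnapiExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available: list[str]) -> list[str]:
    """Keep preferred providers that this build offers, always ending on CPU."""
    chosen = [p for p in PREFERRED if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def open_session(model_path: str):
    """Create an ONNX Runtime session using the best providers on this machine."""
    import onnxruntime as ort  # imported lazily so the helper above needs no install

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Calling `open_session` with the path to an exported model would then run the same file on a GPU server or a CPU-only laptop, with only the provider list differing.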

13

multilingual-e5-base (Model) 51/100

via “onnx and openvino model export for edge deployment”

sentence-similarity model. 3,660,082 downloads.

Unique: Supports three inference backends (PyTorch, ONNX Runtime, OpenVINO) from a single model artifact, with automatic optimization for each target platform — ONNX for cross-platform compatibility, OpenVINO for Intel hardware, PyTorch for development

vs others: More portable than PyTorch-only deployment and faster than unoptimized ONNX due to OpenVINO's graph-level optimizations; enables 2-4x latency reduction on CPU compared to PyTorch inference

14

bert-base-NER (Model) 50/100

via “onnx export for edge deployment and inference optimization”

token-classification model. 1,811,113 downloads.

Unique: Supports ONNX export via transformers' built-in export utilities, enabling deployment on ONNX Runtime which provides hardware-specific optimizations (graph fusion, operator fusion, quantization) without retraining. ONNX models are framework-agnostic and can run on CPU, GPU, or specialized accelerators (NPU, TPU) via different ONNX Runtime backends.

vs others: Faster and smaller than PyTorch checkpoints due to graph optimization, and more portable than TensorFlow SavedModel, but requires additional conversion step and validation compared to native PyTorch deployment.
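
The validation step mentioned above usually means running the same input through the original and the exported model and comparing outputs within a tolerance. A pure-Python sketch of that comparison; the logits are made-up stand-ins for real model outputs, and the 1e-4 tolerance is an assumption that should be tuned per model.

```python
def max_abs_diff(reference: list[float], candidate: list[float]) -> float:
    """Largest element-wise absolute difference between two flat output vectors."""
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, atol: float = 1e-4) -> bool:
    """True when the exported model reproduces the reference within atol."""
    return max_abs_diff(reference, candidate) <= atol


# Made-up logits standing in for PyTorch and ONNX Runtime outputs.
pytorch_logits = [0.12, -1.30, 2.45]
onnx_logits = [0.12001, -1.29999, 2.45002]
print(outputs_match(pytorch_logits, onnx_logits))  # True
```

With real tensors the same check is typically written with NumPy's `numpy.testing.assert_allclose`; quantized exports generally need a looser tolerance or a task-level metric instead.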

15

e5-base-v2 (Model) 50/100

via “onnx and openvino model export for edge and on-premise deployment”

sentence-similarity model. 1,778,169 downloads.

Unique: Provides native ONNX and OpenVINO export through sentence-transformers' built-in conversion utilities, supporting both full-precision and quantized models without custom export code. The export process preserves the tokenizer and preprocessing logic, enabling end-to-end inference without reimplementing text preprocessing.

vs others: One-command export to multiple formats (ONNX, OpenVINO) with quantization support, whereas most models require separate conversion pipelines and manual tokenizer integration for edge deployment.

16

ModernBERT-base (Model) 49/100

via “onnx and safetensors export for cross-platform deployment”

fill-mask model. 1,380,835 downloads.

Unique: Provides first-class ONNX and SafeTensors support in the HuggingFace model card with pre-converted weights, eliminating the need for custom export scripts and enabling one-click deployment to ONNX Runtime, TensorRT, or CoreML without PyTorch dependency

vs others: Faster and more secure than pickle-based PyTorch exports (SafeTensors), and more portable than PyTorch-only models while maintaining compatibility with standard BERT fine-tuning workflows

17

mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 (Model) 48/100

via “onnx-model-export-and-inference”

zero-shot-classification model. 303,704 downloads.

Unique: Enables ONNX export of the DeBERTa-v3-base architecture with full transformer semantics preserved, supporting dynamic batch sizes and sequence lengths without reexport. Unlike simple PyTorch-to-ONNX conversion, this approach maintains cross-lingual capabilities and NLI reasoning patterns across different runtime environments.

vs others: Provides hardware-agnostic inference without PyTorch dependency, enabling 2-5x faster startup and lower memory overhead than PyTorch on CPU, and supports quantization for 4x model size reduction with minimal accuracy loss vs full-precision models.
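
Dynamic batch sizes and sequence lengths are declared at export time through the `dynamic_axes` argument of `torch.onnx.export`. A minimal sketch, assuming a model that takes two tensors and returns one; the names `input_ids`, `attention_mask`, and `logits`, the helper functions, and the opset number are assumptions rather than details taken from this model's card.

```python
def build_dynamic_axes(input_names: list[str], output_names: list[str]) -> dict:
    """Mark dim 0 as batch everywhere and dim 1 as sequence on the inputs."""
    axes = {name: {0: "batch", 1: "sequence"} for name in input_names}
    axes.update({name: {0: "batch"} for name in output_names})
    return axes


def export_onnx(model, example_inputs: tuple, path: str) -> None:
    """Export with symbolic batch/sequence dims so one file serves any shape."""
    import torch  # lazy import: only needed when an export actually runs

    input_names = ["input_ids", "attention_mask"]  # assumed input names
    output_names = ["logits"]  # assumed output name
    model.eval()
    torch.onnx.export(
        model,
        example_inputs,
        path,
        input_names=input_names,
        output_names=output_names,
        dynamic_axes=build_dynamic_axes(input_names, output_names),
        opset_version=17,  # assumed; pick one your runtime supports
    )
```

Without `dynamic_axes`, the exported graph is fixed to the shapes of the example inputs, which is why a re-export would otherwise be needed for each new batch size or sequence length.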

18

RMBG-1.4 (Model) 48/100

via “onnx-based cross-platform inference without pytorch dependency”

image-segmentation model. 1,016,325 downloads.

Unique: Pre-exported ONNX model with inference-specific optimizations (operator fusion, memory layout optimization) reduces model size and latency compared to PyTorch eager execution; eliminates PyTorch dependency entirely, enabling deployment to platforms where PyTorch is unavailable or impractical

vs others: Smaller model size and faster inference than PyTorch on CPU; broader platform support than PyTorch Mobile (which is iOS/Android only); ONNX Runtime is more mature and widely supported than alternative inference engines like TensorFlow Lite for this use case

19

BiRefNet (Model) 48/100

via “onnx export for cross-platform deployment”

image-segmentation model. 921,132 downloads.

Unique: Enables ONNX export of the bidirectional refinement architecture, preserving the multi-scale feature fusion and iterative refinement semantics in ONNX IR format, allowing deployment on non-PyTorch platforms while maintaining segmentation quality

vs others: Broader deployment flexibility than PyTorch-only models; ONNX Runtime provides faster CPU inference and better mobile/edge device support than PyTorch Mobile, though with some accuracy trade-off in quantized versions

20

mask2former-swin-large-cityscapes-semantic (Model) 46/100

via “model export to onnx and torchscript formats”

image-segmentation model. 155,904 downloads.

Unique: Supports export to both ONNX and TorchScript, enabling deployment across diverse inference engines (ONNX Runtime, TensorRT, CoreML) — though deformable attention may require custom ONNX operators not available in standard opset

vs others: Enables multi-platform deployment vs PyTorch-only inference, though export complexity and potential operator compatibility issues add deployment friction
