Real-Time Object Detection with YOLO Models

1

MediaPipe · Framework · 58/100

via “object detection with bounding box localization”

Google's cross-platform on-device ML framework with pre-built solutions.

Unique: Provides unified object detection API across Android, iOS, Web, and Python with built-in support for multiple pre-trained models (COCO, Open Images) and custom model fine-tuning via Model Maker; uses hardware acceleration (GPU/NPU) on mobile platforms for real-time inference.

vs others: More mobile-optimized and faster than TensorFlow Object Detection API on edge devices, includes built-in model customization via Model Maker unlike many pre-trained-only alternatives, but less feature-rich than specialized object detection frameworks like YOLOv8 or Faster R-CNN.

2

YOLOv8 · Repository · 55/100

via “real-time object detection model”

Real-time object detection, segmentation, and pose.

Unique: Combines speed and accuracy with a simple Python API and export to multiple deployment formats.

vs others: As a single-stage detector, YOLOv8 generally runs faster than two-stage frameworks such as Faster R-CNN, which makes it better suited to real-time applications.

3

Ultralytics · Repository · 55/100

via “unified api for yolo-based computer vision tasks”

Unified YOLO framework for detection and segmentation.

Unique: Covers multiple computer vision tasks, including detection and segmentation, through a single API.

vs others: Offers a more streamlined way to train and deploy YOLO models than wiring up task-specific tooling separately.

4

rtdetr_r18vd_coco_o365 · Model · 42/100

via “real-time object detection with transformer-based architecture”

Object-detection model. 521,638 downloads.

Unique: Uses transformer-based detection with anchor-free, NMS-free design (RT-DETR architecture) instead of traditional Faster R-CNN/YOLO CNN pipelines; eliminates hand-crafted anchor definitions and post-processing NMS, enabling end-to-end optimization and faster convergence during training

vs others: Faster inference than DETR variants and comparable to YOLOv8 while maintaining transformer interpretability; outperforms ResNet-50 Faster R-CNN on COCO at similar latency due to efficient attention mechanisms
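For context on what an "NMS-free" design removes, here is a minimal pure-Python sketch of greedy non-maximum suppression, the post-processing step that anchor-based and many single-stage detectors rely on. The function names, box format (x1, y1, x2, y2), and threshold are illustrative assumptions, not code from any model listed here.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining boxes that overlap the kept box too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Two near-duplicate boxes and one separate box (illustrative values).
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # indices of the boxes that survive suppression
```

The IoU threshold is a hand-tuned hyperparameter, which is part of why end-to-end detectors that skip this step are described as simpler to deploy.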

5

yolov10s · Model · 41/100

via “real-time multi-scale object detection with anchor-free architecture”

Object-detection model. 223,706 downloads.

Unique: YOLOv10 pairs an anchor-free detection head with NMS-free training, eliminating the need for hand-crafted anchor boxes and post-processing NMS operations. This reduces the hyperparameter tuning surface and improves inference speed by ~20% vs YOLOv8 while maintaining competitive accuracy on COCO.

vs others: Faster than Faster R-CNN (two-stage) for real-time use cases and simpler to deploy than EfficientDet due to anchor-free design requiring no anchor configuration; trades some precision on tiny objects vs Mask R-CNN for speed-critical applications.

6

yolos-tiny · Model · 40/100

via “vision transformer-based object detection with attention-weighted region proposals”

Object-detection model. 83,525 downloads.

Unique: Applies a pure transformer architecture (a ViT encoder with learnable detection tokens, trained with a DETR-style set-prediction loss) to object detection instead of a CNN backbone, enabling attention-based spatial reasoning without region proposal networks; the tiny variant has roughly 6.5M parameters thanks to its small backbone while maintaining COCO detection capability.

vs others: Simpler architecture than Faster R-CNN (no RPN) and more parameter-efficient than standard ViT detectors, but slower inference than optimized YOLOv5/YOLOv8 on edge devices due to transformer computational overhead.

7

paper2gui · Web App · 39/100

via “real-time object detection with yolo models”

Convert AI papers to GUIs, making it easy and convenient for everyone to use cutting-edge artificial intelligence technology.

Unique: Implements multiple YOLO model variants (v5, v6, YOLOX) through NCNN with Vulkan GPU acceleration, allowing model selection based on accuracy/speed tradeoff; includes configurable confidence thresholds and NMS parameters for detection filtering; supports JSON output for programmatic integration

vs others: Faster inference than PyTorch-based YOLO implementations (NCNN optimization); standalone executable vs Python-based tools; supports multiple model variants vs single-model tools; local processing vs cloud APIs (no network latency, and images never leave the machine).
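The confidence filtering and JSON output mentioned above can be sketched as follows. The record layout (label, confidence, box) and the function name are hypothetical and chosen only for illustration; they are not paper2gui's actual output schema.

```python
import json


def filter_detections(detections, conf_threshold=0.25):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d["confidence"] >= conf_threshold]


# Hypothetical raw detector output: label, confidence, and (x1, y1, x2, y2) box.
raw = [
    {"label": "person", "confidence": 0.91, "box": [34, 50, 210, 400]},
    {"label": "dog", "confidence": 0.12, "box": [300, 220, 380, 310]},
]

kept = filter_detections(raw, conf_threshold=0.25)

# Serialize for programmatic consumers (assumed wrapper key "detections").
payload = json.dumps({"detections": kept}, indent=2)
```

Raising the threshold trades recall for fewer false positives, which is why tools typically expose it as a user setting.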

8

Anzhcs_YOLOs · Model · 39/100

via “real-time multi-class object detection with bounding box localization”

Object-detection model. 86,897 downloads.

Unique: Fine-tuned variant of Ultralytics YOLO11 base model specialized for art-domain object detection, inheriting YOLO11's architectural improvements (anchor-free detection, decoupled head design) while maintaining single-stage detection efficiency. Uses Ultralytics' native PyTorch implementation with built-in export support for ONNX, TensorRT, and CoreML for cross-platform deployment.

vs others: Faster inference than Faster R-CNN or Mask R-CNN (single-stage vs two-stage detection) with better art-domain accuracy than generic COCO-trained YOLOv8 due to fine-tuning on specialized data; lighter than Vision Transformers while maintaining competitive accuracy.

9

yolov5m-license-plate · Model · 39/100

via “real-time license plate detection in images”

Object-detection model. 46,896 downloads.

Unique: YOLOv5m architecture with medium-weight backbone (vs YOLOv5s for speed or YOLOv5l for accuracy) trained specifically on keremberke's license-plate dataset, balancing inference latency (~30-50ms on GPU) with detection precision for automotive use cases. Uses CSPDarknet backbone with PANet neck for multi-scale feature fusion, enabling detection of plates across varying distances and image resolutions.

vs others: Faster inference than Faster R-CNN or Mask R-CNN variants (single-stage vs two-stage detection) while maintaining competitive mAP on license plate datasets; more specialized than generic COCO-trained YOLOv5 models due to domain-specific fine-tuning on automotive plate imagery.

10

rtdetr_r101vd_coco_o365 · Model · 39/100

via “real-time object detection with transformer-based architecture”

Object-detection model. 121,720 downloads.

Unique: Uses transformer encoder-decoder architecture with direct set prediction (eliminating anchor boxes and NMS) combined with ResNet-101-VD backbone, achieving real-time performance through efficient attention mechanisms and hybrid CNN-transformer design that balances speed and accuracy across 365 object categories from Objects365 dataset

vs others: Faster than traditional Faster R-CNN/Mask R-CNN detectors (50-100ms vs 200-400ms) while maintaining higher accuracy than lightweight YOLO variants through transformer attention, and more practical for production than ViT-based detectors due to optimized backbone selection
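To illustrate what direct set prediction means at inference time, the following simplified sketch decodes per-query outputs into detections with a score threshold and no NMS pass. The input layout (normalized center-format boxes plus per-class scores per query) and the function name are assumptions for illustration; real implementations differ in details such as score activation and top-k selection.

```python
def decode_queries(pred_boxes, pred_scores, image_w, image_h, score_threshold=0.5):
    """Turn per-query outputs into pixel-space detections without NMS.

    pred_boxes:  list of (cx, cy, w, h), each normalized to [0, 1]
    pred_scores: list of per-class score lists, one list per query
    """
    detections = []
    for (cx, cy, w, h), class_scores in zip(pred_boxes, pred_scores):
        score = max(class_scores)
        if score < score_threshold:
            continue  # this query did not latch onto an object
        class_id = class_scores.index(score)
        # Convert normalized center format to pixel corner format.
        x1 = (cx - w / 2) * image_w
        y1 = (cy - h / 2) * image_h
        x2 = (cx + w / 2) * image_w
        y2 = (cy + h / 2) * image_h
        detections.append((class_id, score, (x1, y1, x2, y2)))
    return detections


# Two queries, two classes (illustrative values only).
pred_boxes = [(0.5, 0.5, 0.2, 0.4), (0.1, 0.1, 0.1, 0.1)]
pred_scores = [[0.05, 0.90], [0.20, 0.10]]
detections = decode_queries(pred_boxes, pred_scores, image_w=640, image_h=480)
```

Because each query is trained to predict at most one object, duplicate suppression is learned rather than applied as a separate step.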

11

rtdetr_r50vd_coco_o365 · Model · 38/100

via “real-time object detection with transformer-based architecture”

Object-detection model. 80,830 downloads.

Unique: Uses transformer encoder-decoder architecture with deformable attention mechanisms instead of traditional CNN-based region proposal networks; eliminates anchor boxes and NMS post-processing, reducing inference pipeline complexity while maintaining real-time performance through efficient attention computation

vs others: Faster inference than Faster R-CNN (no RPN overhead) and simpler than YOLO (no anchor engineering), while maintaining transformer-based reasoning for improved generalization across diverse object scales and aspect ratios

12

yolov11-license-plate-detection · Model · 38/100

via “real-time license plate localization in images”

Object-detection model. 26,512 downloads.

Unique: YOLOv11 architecture uses decoupled detection heads and anchor-free design with dynamic label assignment, enabling faster convergence on specialized license plate domain compared to anchor-based detectors; fine-tuned specifically on Roboflow's license plate dataset rather than generic COCO weights

vs others: Faster inference than Faster R-CNN or SSD variants while maintaining comparable accuracy; more specialized than generic YOLOv8 due to domain-specific fine-tuning on license plate data

13

YOLO Labeling · Extension · 34/100

via “multi-format yolo annotation format support (detection, segmentation, pose, obb)”

A VS Code extension for YOLO dataset labeling

Unique: A single extension handles 6+ YOLO annotation formats (detection, segmentation, pose, OBB) with format-specific rendering logic, whereas most tools specialize in one task type; this enables a unified workflow across YOLO model variants.

vs others: More versatile than single-task tools like LabelImg (detection-only), but less specialized than task-specific tools like OpenLabeling (detection) or CVAT (multi-task with more features)
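As a concrete reference for the detection format, a YOLO detection label file stores one object per line as a class index followed by a normalized center-format box: class x_center y_center width height. The sketch below converts one such line to pixel corners; the function name is hypothetical, and segmentation, pose, and OBB labels use different per-line layouts that this sketch does not cover.

```python
def parse_yolo_label(line, image_w, image_h):
    """Convert 'class cx cy w h' (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    x1 = (cx - w / 2) * image_w
    y1 = (cy - h / 2) * image_h
    x2 = (cx + w / 2) * image_w
    y2 = (cy + h / 2) * image_h
    return class_id, x1, y1, x2, y2


# One annotation line for a 640x480 image (illustrative values).
parsed = parse_yolo_label("0 0.5 0.5 0.25 0.5", image_w=640, image_h=480)
```

Storing coordinates normalized to the image size keeps labels valid when images are resized during training.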

14

ultralytics · Framework · 32/100

via “real-time-object-tracking-with-multi-algorithm-support”

Ultralytics YOLO 🚀 for SOTA object detection, multi-object tracking, instance segmentation, pose estimation and image classification.

Unique: Integrates tracking as a post-processing step on detection results rather than as a separate model, allowing any YOLO detection variant to be paired with a supported tracking algorithm (BoT-SORT or ByteTrack), with tracker state managed internally by the YOLO model instance.

vs others: Simpler than wiring up standalone trackers (DeepSORT, Kalman filter implementations) because tracking is exposed through the model's track() method on top of the prediction pipeline, and more flexible than detection-only models because users can choose the tracking algorithm without retraining.
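The idea of tracking as a post-processing step on detections can be illustrated with a toy greedy IoU tracker. This is a deliberately simplified stand-in, not ByteTrack, BoT-SORT, or the Ultralytics implementation: it has no motion model and drops a track as soon as it goes unmatched for one frame. All class and method names are hypothetical.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Toy tracker: match each detection to the best-overlapping previous track."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}  # track_id -> box from the previous frame
        self._ids = count(1)

    def update(self, boxes):
        assigned = {}
        unmatched = dict(self.tracks)
        for box in boxes:
            best_id, best_iou = None, self.iou_threshold
            for track_id, prev_box in unmatched.items():
                overlap = iou(box, prev_box)
                if overlap > best_iou:
                    best_id, best_iou = track_id, overlap
            if best_id is None:
                best_id = next(self._ids)  # no match: start a new track
            else:
                del unmatched[best_id]  # each track is matched at most once
            assigned[best_id] = box
        self.tracks = assigned  # unmatched old tracks are dropped
        return assigned


tracker = GreedyIouTracker()
frame1 = tracker.update([(10, 10, 50, 50), (100, 100, 140, 140)])
frame2 = tracker.update([(102, 101, 142, 141), (12, 11, 52, 51)])
```

The tracker only consumes boxes, so it works with any detector that produces them, which is the decoupling the entry describes.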

15

ImageSorcery MCP · MCP Server · 28/100

via “yolo-based object detection with bounding box extraction”

Computer vision-based 🪄 sorcery of image recognition and editing tools for AI assistants.

Unique: Runs YOLO inference locally within the MCP server process rather than calling cloud vision APIs, with automatic model provisioning via post_install.py that downloads and caches weights, enabling AI assistants to perform object detection without external API calls or data transmission

vs others: Faster than cloud-based vision APIs (no network latency) and more private than Google Vision or AWS Rekognition, but requires local GPU/CPU resources and manual model management vs fully managed cloud services

16

You Only Look Once: Unified, Real-Time Object Detection (YOLO) · Product · 22/100

via “single-pass unified object detection with spatial grid regression”

Unique: Pioneered the single-stage detection paradigm by formulating object detection as a direct spatial regression problem on a grid, eliminating the region proposal stage used by two-stage detectors. Uses a unified sum-squared-error loss that jointly optimizes bounding box coordinates, box confidence, and class probabilities across all grid cells in a single forward pass through one convolutional network.

vs others: 45 FPS for the base model and 155 FPS for Fast YOLO (vs 7 FPS for Faster R-CNN with VGG-16) at lower mAP on PASCAL VOC, enabling real-time video processing on a single GPU; the single-network design is trained end-to-end directly on detection performance.
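The grid formulation can be made concrete with a small sketch. In the original paper's PASCAL VOC setup the image is divided into an S x S grid with S = 7, each cell predicts B = 2 boxes (5 values each) plus C = 20 class probabilities, and the cell containing an object's center is responsible for predicting it. The helper name below is hypothetical.

```python
S, B, C = 7, 2, 20  # grid size, boxes per cell, classes (PASCAL VOC setup)


def responsible_cell(cx, cy, image_w, image_h, s=S):
    """Return (row, col) of the grid cell that contains the object's center."""
    col = min(int(cx / image_w * s), s - 1)  # clamp centers on the right edge
    row = min(int(cy / image_h * s), s - 1)  # clamp centers on the bottom edge
    return row, col


# Each cell predicts B boxes (x, y, w, h, confidence) plus C class probabilities.
output_shape = (S, S, B * 5 + C)

# Object centered at pixel (300, 200) in a 448x448 input image.
cell = responsible_cell(cx=300, cy=200, image_w=448, image_h=448)
```

Since every cell's predictions come out of one forward pass, the whole image is processed at once rather than region by region.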

17

Frigate NVR · Product

via “real-time object detection and classification”

18

Voxel51 · Product

via “real-time video object detection and tracking”
