oneformer_coco_swin_large (unified image segmentation with task conditioning)
Image-segmentation model. 79,337 downloads.
Unique: Uses a task-conditioned unified architecture: a Swin Transformer backbone with learnable task tokens that route through a shared decoder, enabling dynamic switching between semantic, instance, and panoptic segmentation without reloading the model. Unlike Mask2Former (trained per task) or DeepLab (single-task), OneFormer learns a shared representation space in which the task identity modulates the decoding pathway through cross-attention.
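A minimal Python sketch of the idea (schematic only; every name, dimension, and the gating operation below are invented for illustration, not OneFormer's real code): one shared decoder whose behavior is switched by swapping in a different task token, so changing tasks is a change of input rather than a change of weights.

```python
import random

random.seed(0)

EMBED_DIM = 8  # illustrative; the real embedding dimension is much larger

# One learnable token per task; in OneFormer these are trained jointly
# with the rest of the model. Here they are just random vectors.
TASK_TOKENS = {
    task: [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
    for task in ("semantic", "instance", "panoptic")
}

def shared_decoder(features, task_token):
    """Toy stand-in for the shared decoder: the task token modulates
    (here, simply gates element-wise) the same pathway for every task."""
    return [f * t for f, t in zip(features, task_token)]

def segment(features, task):
    # Dynamic task switching: look up a token, reuse the same decoder weights.
    return shared_decoder(features, TASK_TOKENS[task])

features = [1.0] * EMBED_DIM
sem = segment(features, "semantic")
pan = segment(features, "panoptic")

# Same decoder weights, different task tokens -> different decoding behavior.
assert sem != pan
```

In the actual model the task token conditions the decoder's object queries via cross-attention rather than simple gating, but the deployment property this sketch shows is the same: no second model ever needs to be loaded to serve a different segmentation task.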
vs others: Reduces deployment footprint by 66% compared to maintaining separate semantic, instance, and panoptic models while achieving comparable accuracy, making it well suited to resource-constrained environments where model-switching overhead is unacceptable.