ADE20K 150-Class Semantic Prediction

1

segformer-b0-finetuned-ade-512-512 (Fine-tune, 46/100)

via “ade20k-scene-category-prediction-with-class-mapping”

image-segmentation model. 313,332 downloads.

Unique: Provides direct mapping to 150 ADE20K scene categories with official color palette and hierarchical groupings, enabling interpretable scene understanding without post-hoc label engineering — most generic segmentation models require manual class mapping and visualization setup

vs others: Pre-trained on diverse indoor/outdoor scenes (ADE20K) with a comprehensive 150-class taxonomy covering furniture, building parts, and natural elements, providing richer scene understanding than COCO (80 object categories) or Cityscapes (19 classes), which target narrower domains
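The class mapping and palette lookup described above can be sketched in a few lines. This is a minimal illustration, not the model's shipped configuration: the `ID2LABEL` excerpt covers only four of the 150 classes, the `PALETTE` colors are placeholders rather than the official ADE20K palette, and `class_coverage` and `colorize` are hypothetical helper names.

```python
from collections import Counter

# Illustrative excerpt only: a real ADE20K checkpoint defines all 150 entries.
ID2LABEL = {0: "wall", 2: "sky", 3: "floor", 4: "tree"}

# Placeholder RGB values chosen for this sketch, not the official palette.
PALETTE = {0: (120, 120, 120), 2: (0, 200, 255), 3: (90, 60, 40), 4: (0, 160, 0)}


def class_coverage(label_map):
    """Return the fraction of pixels assigned to each class name."""
    counts = Counter(class_id for row in label_map for class_id in row)
    total = sum(counts.values())
    return {ID2LABEL.get(cid, f"class_{cid}"): n / total for cid, n in counts.items()}


def colorize(label_map, unknown=(0, 0, 0)):
    """Replace every class id with its RGB color; unmapped ids become `unknown`."""
    return [[PALETTE.get(cid, unknown) for cid in row] for row in label_map]


# A toy 3x4 prediction: sky on top, wall and tree in the middle, floor below.
label_map = [
    [2, 2, 2, 2],
    [0, 0, 4, 4],
    [3, 3, 3, 3],
]
coverage = class_coverage(label_map)
rgb = colorize(label_map)
print(coverage)
```

In practice the label map would come from the model's per-pixel argmax, and the lookup tables from the checkpoint's own configuration.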

2

oneformer_ade20k_swin_tiny (Model, 45/100)

via “ade20k-scene-parsing-with-150-class-taxonomy”

image-segmentation model. 248,429 downloads.

Unique: Trained specifically on ADE20K's 150-class taxonomy with dense pixel-level annotations for indoor scenes, providing fine-grained scene understanding (room types, furniture, architectural elements) that general-purpose segmentation models (e.g., COCO-trained models with 80 classes) cannot match. Achieves 48.5% mIoU on ADE20K validation set through task-conditioned learning.

vs others: Achieves higher accuracy on ADE20K benchmarks than task-specific models (e.g., Mask2Former, DeepLabV3+) due to unified task learning; provides 150 semantic classes vs 80 for COCO-trained models, enabling richer scene understanding for indoor applications.

3

oneformer_ade20k_swin_large (Model, 44/100)

via “ade20k-150-class-semantic-prediction”

image-segmentation model. 90,906 downloads.

Unique: Trained on ADE20K's diverse 150-class taxonomy covering both stuff (wall, sky, floor) and things (person, car, furniture) with class-balanced sampling during training. Uses learned class embeddings (150×256) that are matched against pixel features via dot-product attention, enabling efficient per-pixel classification.

vs others: Achieves 48.9 mIoU on ADE20K validation set, outperforming DeepLabV3+ (46.2 mIoU) and comparable to Mask2Former (48.7 mIoU) while using a unified architecture. However, task-specific semantic segmentation models (e.g., SegFormer) can achieve 50+ mIoU if not constrained to multi-task design.
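The dot-product matching between class embeddings and pixel features mentioned above can be illustrated with a toy example. This is a simplified sketch of the idea only, not the model's actual implementation: it uses 3 classes with 4-dimensional embeddings instead of 150×256, and `classify_pixels` is a hypothetical helper name.

```python
def classify_pixels(pixel_features, class_embeddings):
    """Label each pixel with the class whose embedding gives the largest dot product."""
    labels = []
    for feat in pixel_features:
        # One score per class: dot product of the pixel feature and the class embedding.
        scores = [sum(f * e for f, e in zip(feat, emb)) for emb in class_embeddings]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


# Toy stand-in for the 150x256 embedding table: 3 classes, 4 dimensions.
class_embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

# One feature vector per pixel (flattened image).
pixel_features = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.0, 0.1, 0.5, 0.6],
]

labels = classify_pixels(pixel_features, class_embeddings)
print(labels)  # [0, 1, 2]
```

A real model would compute the same scores as a single batched matrix product over all pixels.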

4

segformer-b0-finetuned-ade-512-512 (Fine-tune, 44/100)

via “ade20k-scene-class-prediction-with-150-categories”

image-segmentation model. 508,692 downloads.

Unique: Integrates ADE20K's 150-class ontology with hierarchical scene understanding — classes are organized by spatial context (indoor vs outdoor, furniture vs architecture) enabling downstream filtering and reasoning without custom label mapping

vs others: More granular than COCO segmentation (80 classes) for indoor scene understanding, and includes scene-context labels (wall, floor, ceiling) that generic object detectors omit

5

mask2former-swin-large-ade-semantic (Model, 44/100)

via “ade20k 150-class semantic taxonomy mapping”

image-segmentation model. 119,949 downloads.

Unique: Leverages ADE20K's diverse 150-class taxonomy that balances thing and stuff classes, enabling both instance-level and semantic-level understanding in a single model. Unlike COCO (80 classes, mostly things) or Cityscapes (19 classes, driving-focused), ADE20K covers diverse indoor/outdoor scenes with fine-grained distinctions.

vs others: ADE20K taxonomy provides 2-3x more semantic granularity than Cityscapes for indoor scenes and 1.5-2x more than COCO for stuff classes, enabling richer scene understanding at the cost of lower per-class accuracy on common categories like 'person' or 'car'.

6

segformer-b5-finetuned-ade-640-640 (Fine-tune, 43/100)

via “ade20k-scene-class-prediction-with-150-categories”

image-segmentation model. 61,096 downloads.

Unique: Trained on ADE20K's 150 semantic classes with class-balanced loss weighting to handle imbalanced category distributions, enabling reasonable performance even on rare scene elements. Decoder architecture uses lightweight MLP layers (vs dense convolutions) to map transformer features to 150 logits efficiently, achieving state-of-the-art mIoU on ADE20K benchmark.

vs others: More comprehensive scene understanding than Cityscapes (19 classes, urban-only) or Pascal VOC (21 classes) due to ADE20K's diverse indoor/outdoor vocabulary; more accurate than generic semantic segmentation models (FCN, U-Net) because fine-tuned specifically for scene parsing task; less specialized than domain-specific models (medical segmentation, satellite imagery) but more generalizable.
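Class-balanced loss weighting, as mentioned above, can be sketched as follows. The listing does not say which weighting scheme was used, so inverse pixel frequency (normalized to a mean weight of 1) is an assumption here, and `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical helper names.

```python
import math


def inverse_frequency_weights(pixel_counts):
    """Weight each class by total/count, then rescale so present classes average 1."""
    total = sum(pixel_counts)
    raw = [total / c if c > 0 else 0.0 for c in pixel_counts]
    present = [w for w in raw if w > 0]
    mean = sum(present) / len(present)
    return [w / mean for w in raw]


def weighted_cross_entropy(probs, target, weights):
    """Weighted negative log-likelihood for a single pixel."""
    return -weights[target] * math.log(probs[target])


# Toy pixel counts for three classes: one dominant, one rare.
pixel_counts = [800, 150, 50]
weights = inverse_frequency_weights(pixel_counts)

# The same predicted probability costs more when the true class is rare.
loss_common = weighted_cross_entropy([0.5, 0.3, 0.2], 0, weights)
loss_rare = weighted_cross_entropy([0.2, 0.3, 0.5], 2, weights)
print(weights, loss_common, loss_rare)
```

The effect is that mistakes on rare classes contribute more to the loss than mistakes on dominant classes such as wall or sky.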

7

segformer-b1-finetuned-ade-512-512 (Fine-tune, 43/100)

via “ade20k-150-class-semantic-taxonomy-prediction”

image-segmentation model. 177,465 downloads.

Unique: Trained on ADE20K's hierarchical scene taxonomy (150 fine-grained classes) rather than generic COCO or Cityscapes, capturing scene-specific semantics like 'wall', 'ceiling', 'floor', and furniture types. Optimized for indoor/outdoor scene understanding rather than autonomous driving or panoptic segmentation.

vs others: Richer semantic granularity than Cityscapes (19 classes) for scene understanding; more scene-focused than COCO panoptic segmentation; better suited for interior robotics and spatial understanding than generic object detectors.

8

segformer-b4-finetuned-ade-512-512 (Fine-tune, 42/100)

via “ade20k-scene-parsing-with-150-semantic-classes”

image-segmentation model. 104,510 downloads.

Unique: Fine-tuned specifically on ADE20K's 150-class taxonomy covering both common and rare scene elements, achieving 50.3% mIoU through domain-specific optimization. Unlike models trained on COCO or Cityscapes, this model prioritizes scene understanding over object detection, with many classes representing spatial regions and architectural elements rather than discrete objects.

vs others: Achieves 8-12% higher mIoU on ADE20K than Cityscapes-trained models and 15-20% higher than COCO-trained models due to domain-specific fine-tuning, making it the standard choice for scene parsing benchmarks.
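The mIoU figures quoted above follow the standard definition: per-class intersection over union, averaged over the classes that occur. Below is a minimal sketch on flattened label lists. The `ignore_index=255` default is an assumption borrowed from common segmentation tooling, and `mean_iou` is a hypothetical helper name.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union across classes that appear in pred or target."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # false negative for the true class
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for the predicted class
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with 3 classes and 6 pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(score)  # per-class IoU is 1/3, 2/3, 1/2, so the mean is 0.5
```

Benchmark numbers are normally computed by accumulating intersections and unions over the whole validation set before dividing, rather than averaging per-image scores.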

9

segformer-b2-finetuned-ade-512-512 (Fine-tune, 41/100)

via “ade20k-scene-category-classification-with-150-classes”

image-segmentation model. 63,104 downloads.

Unique: Trained on ADE20K's 150-class taxonomy which includes fine-grained scene elements (architectural details, furniture types, vegetation species) rather than generic object categories — enables detailed scene understanding beyond basic object detection. Hierarchical class structure allows both coarse (e.g., 'furniture') and fine-grained (e.g., 'chair', 'table') predictions.

vs others: More comprehensive scene understanding than COCO (80 object categories) or Cityscapes (19 classes) for indoor/outdoor scenes, but less specialized than domain-specific models (medical, satellite); best for general-purpose scene parsing.
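Coarse and fine-grained predictions, as described above, can be related through a simple lookup from fine class names to groups. The grouping below is a hypothetical one invented for this sketch (the listing does not publish a group table), and `coarsen` and `group_mask` are hypothetical helper names.

```python
# Hypothetical fine-to-coarse grouping; the class names themselves are ADE20K labels.
FINE_TO_COARSE = {
    "chair": "furniture",
    "table": "furniture",
    "bed": "furniture",
    "wall": "structure",
    "floor": "structure",
    "ceiling": "structure",
    "tree": "vegetation",
    "grass": "vegetation",
}


def coarsen(fine_label, default="other"):
    """Map a fine class name to its coarse group, or `default` if ungrouped."""
    return FINE_TO_COARSE.get(fine_label, default)


def group_mask(fine_map, group):
    """Binary mask marking pixels whose fine label belongs to `group`."""
    return [[coarsen(name) == group for name in row] for row in fine_map]


# Toy 2x3 prediction expressed as class names.
fine_map = [
    ["wall", "wall", "tree"],
    ["chair", "table", "floor"],
]
furniture = group_mask(fine_map, "furniture")
print(furniture)  # [[False, False, False], [True, True, False]]
```

A mask like this lets downstream code filter by group (for example, all furniture pixels) without handling each fine class separately.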
