Capability
Lightweight Mask Decoder With Iterative Refinement Loops
5 artifacts provide this capability.
Top Matches
Meta's foundation model for visual segmentation.
Unique: Uses a lightweight transformer decoder with iterative refinement. Each iteration re-attends to the image features and to the previous mask prediction, so the mask can converge toward an accurate result without a larger model. The design trades additional forward passes for fewer parameters.
vs others: More efficient than heavier mask pipelines (e.g., FPN + RPN in Mask R-CNN) because it skips region proposal generation and refines the mask with attention instead, reducing inference latency by 5-10x while maintaining comparable accuracy.
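The refinement loop described in this entry can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the listed model's actual architecture: the class `TinyMaskDecoder`, the function `refine`, the single-query attention, and the way the previous mask is mixed into the features are all hypothetical choices made for the example. What it shows is the trade-off itself: one small set of weights is reused on every pass, so more iterations cost more compute but add no parameters.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyMaskDecoder:
    """Hypothetical single-query decoder; its weights are shared by all iterations."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(dim)
        self.dim = dim
        self.w_q = rng.normal(0.0, scale, (dim, dim))
        self.w_k = rng.normal(0.0, scale, (dim, dim))
        self.w_v = rng.normal(0.0, scale, (dim, dim))
        # Assumption: the previous mask is injected as a learned direction in feature space.
        self.w_mask = rng.normal(0.0, scale, (dim,))

    def num_params(self):
        return self.w_q.size + self.w_k.size + self.w_v.size + self.w_mask.size

    def step(self, query, feats, prev_logits):
        """One refinement pass. feats: (N, dim), prev_logits: (N,), query: (dim,)."""
        # Condition the image features on the previous mask prediction.
        conditioned = feats + np.tanh(prev_logits)[:, None] * self.w_mask[None, :]
        q = query @ self.w_q
        k = conditioned @ self.w_k
        v = conditioned @ self.w_v
        # The query re-attends to every (mask-conditioned) image location.
        attn = softmax(k @ q / np.sqrt(self.dim))
        query = query + attn @ v  # residual update of the query
        # New mask logits: similarity between each location and the updated query.
        logits = feats @ query / np.sqrt(self.dim)
        return query, logits


def refine(decoder, query, feats, num_iters=3):
    """Run the same decoder num_iters times; return the mask probabilities per pass."""
    logits = np.zeros(feats.shape[0])  # start from an uninformative mask
    history = []
    for _ in range(num_iters):
        query, logits = decoder.step(query, feats, logits)
        history.append(sigmoid(logits))
    return history


# Usage: 64 flattened image locations with 16-dimensional features.
rng = np.random.default_rng(1)
feats = rng.normal(size=(64, 16))
query = rng.normal(size=16)
decoder = TinyMaskDecoder(dim=16)
masks = refine(decoder, query, feats, num_iters=4)
print(len(masks), masks[-1].shape, decoder.num_params())
```

Raising `num_iters` lengthens the loop but leaves `decoder.num_params()` unchanged, which is the parameters-for-passes trade-off the entry describes. A real decoder would use learned weights, multiple attention heads, and an upsampling head for the mask.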