nsfw-image-detection-384

Model · Free

image-classification model by Marqo. 6,560,925 downloads.

Open Source

UnfragileRank: 49 / 100

Capabilities (5 decomposed)

nsfw content classification via vision transformer embeddings

Medium confidence

Classifies images as safe or unsafe for work using a timm-based vision transformer backbone (384-dimensional embedding space) fine-tuned on NSFW/SFW datasets. The model encodes images into a learned embedding space where unsafe content clusters distinctly from safe content, enabling binary or multi-class classification through a trained classification head. Uses safetensors format for efficient model serialization and loading.
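The final step described above, turning the classification head's raw scores into class probabilities and a label, can be sketched without any framework. This is a minimal illustration only: the two-class label order in `LABELS` and the example logits are assumptions made for the sketch, not values taken from the model.

```python
import math

# Assumed label order for illustration; check the model's own config for the real mapping.
LABELS = ["NSFW", "SFW"]

def softmax(logits):
    """Convert raw head outputs (logits) into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for a single image.
label, confidence = classify([2.0, -1.0])
```

In a real pipeline the logits would come from the model's forward pass; everything after that point is the same arithmetic.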

Solves for

Filter user-generated content in moderation pipelines before publishing

Automatically tag or quarantine potentially unsafe images in bulk uploads

Build content safety checks into image hosting or social platforms

Screen images in e-commerce or community platforms for policy violations

Best for

Content moderation teams building automated safety systems

Platform engineers implementing real-time image filtering

Developers building community-driven applications with UGC

Requires

Python 3.7+

PyTorch 1.9+ or compatible deep learning framework

timm library (PyTorch Image Models) for vision transformer backbone

Limitations

Binary or limited-class classification only — does not distinguish between types of unsafe content (violence, explicit, etc.)

384-dimensional embedding space may not capture nuanced edge cases or cultural context variations

Inference latency depends on hardware; GPU acceleration recommended for production throughput

What makes it unique

Uses timm vision transformer backbone with 384-dimensional embedding space (vs. ResNet-50 or EfficientNet baselines), enabling efficient batch inference and downstream embedding-space operations like clustering or similarity search. Serialized in safetensors format for faster, safer model loading compared to pickle-based PyTorch checkpoints.

vs alternatives

Runs locally, so it avoids the network round-trip of hosted moderation APIs such as AWS Rekognition, and its open weights make it more transparent than black-box commercial models. It may still require fine-tuning for domain-specific content policies.

batch image safety screening with embedding extraction

Medium confidence

Processes multiple images in parallel, extracting both classification predictions and 384-dimensional embeddings for each image in a single forward pass. Supports batching via PyTorch DataLoader or manual batch stacking, enabling efficient throughput for large-scale content moderation workflows. Embeddings can be persisted to vector databases for downstream similarity-based filtering or clustering of unsafe content patterns.
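Manual batch stacking, as mentioned above, amounts to walking the input list in fixed-size chunks and handing each chunk to the model. A rough sketch in plain Python follows; `screen_batch` is a hypothetical stand-in for the real forward pass and returns dummy scores so the control flow can be run without the model.

```python
from typing import Iterator, List, Sequence, Tuple

def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def screen_batch(paths: Sequence[str]) -> List[Tuple[str, float]]:
    """Hypothetical placeholder for one forward pass over a batch.

    A real implementation would decode the images, stack them into one
    tensor and return one unsafe-probability per image.
    """
    return [(path, 0.0) for path in paths]

def screen_all(paths: Sequence[str], batch_size: int = 32) -> List[Tuple[str, float]]:
    """Screen every path, one chunk at a time."""
    results: List[Tuple[str, float]] = []
    for chunk in batched(paths, batch_size):
        results.extend(screen_batch(chunk))
    return results

uploads = [f"upload_{i}.jpg" for i in range(70)]
batch_sizes = [len(chunk) for chunk in batched(uploads, 32)]
results = screen_all(uploads, batch_size=32)
```

Because only one chunk is handed to the model at a time, the batch size can be lowered on constrained hardware without changing the rest of the job.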

Solves for

Screen thousands of user uploads in a single batch job for moderation

Extract embeddings for all images in a dataset to build a safety-aware vector index

Identify clusters of similar unsafe content for pattern analysis

Build a content similarity search system to find related unsafe images

Best for

Data engineers building batch moderation pipelines

ML teams analyzing content safety patterns at scale

Developers integrating safety checks into ETL workflows

Requires

Python 3.7+

PyTorch with CUDA support (for GPU acceleration) or CPU fallback

Sufficient RAM for batch size × image resolution (e.g., 32 images × 384×384 ≈ 2GB)
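As a sanity check on the sizing above, the input tensor itself is easy to estimate. The sketch assumes float32 RGB input of shape [batch_size, 3, 384, 384], the shape given in this page's Input / Output section. The input batch alone comes to roughly 57 MB, so the rest of the ≈ 2GB figure is presumably model weights, intermediate activations and decoded source images, which this arithmetic does not cover.

```python
def input_tensor_bytes(batch_size, channels=3, height=384, width=384, bytes_per_value=4):
    """Size in bytes of one input batch of shape [batch, channels, height, width].

    bytes_per_value=4 assumes float32 values.
    """
    return batch_size * channels * height * width * bytes_per_value

batch_bytes = input_tensor_bytes(32)
batch_megabytes = batch_bytes / 1_000_000
```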

Limitations

Batch processing requires loading all images into memory — memory constraints limit batch size on consumer hardware

No streaming or incremental inference — entire batch must complete before results are available

Embeddings are fixed 384-dimensional vectors — cannot be dynamically resized for different downstream tasks

What makes it unique

Extracts both classification predictions and embeddings in a single forward pass, allowing downstream vector-space operations (clustering, similarity search) without re-running inference. Supports arbitrary batch sizes via PyTorch's flexible tensor operations, enabling memory-efficient processing on constrained hardware.

vs alternatives

More efficient than calling per-image classification APIs (e.g., AWS Rekognition) for large batches, and provides embeddings for free, enabling downstream similarity-based filtering that proprietary APIs charge separately for.

real-time image safety inference with low-latency prediction

Medium confidence

Performs single-image NSFW classification with minimal latency suitable for synchronous request-response workflows (e.g., API endpoints, chat applications). Uses optimized inference paths via ONNX export or TorchScript compilation to reduce overhead. Can be deployed as a microservice or embedded in application servers for immediate safety feedback on user uploads.
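Whether a deployment actually meets a latency target is best checked by measurement on the target hardware. The harness below is a minimal sketch for that purpose; `fake_predict` is a hypothetical stand-in for the real model call, and the run counts are arbitrary defaults.

```python
import statistics
import time

def measure_latency_ms(predict, sample, runs=50, warmup=5):
    """Time repeated calls to predict(sample) and summarise them in milliseconds."""
    for _ in range(warmup):  # warm-up calls are executed but not timed
        predict(sample)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return {
        "p50_ms": statistics.median(timings),
        "p95_ms": timings[p95_index],
        "max_ms": timings[-1],
    }

def fake_predict(image_bytes):
    """Hypothetical stand-in for the model call; does trivial work."""
    return sum(image_bytes) % 2

report = measure_latency_ms(fake_predict, b"\x00" * 1024)
```

Swapping `fake_predict` for the real preprocessing-plus-inference call gives percentile figures that can be compared against a service-level target.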

Solves for

Reject unsafe images immediately during upload in a web application

Provide real-time safety feedback to users before they post content

Block unsafe images in chat or messaging applications before delivery

Implement safety gates in API endpoints with <100ms latency requirements

Best for

Full-stack developers building user-facing content platforms

Backend engineers implementing API safety gates

DevOps teams deploying inference microservices

Requires

Python 3.7+ or compiled inference runtime (ONNX Runtime, TensorRT)

GPU with CUDA 11.0+ (recommended for <200ms latency) or CPU fallback

Web framework (FastAPI, Flask, Django) for API wrapping

Limitations

Single-image inference only — no batch optimization benefits

Latency varies with hardware (CPU: 500ms–2s, GPU: 50–200ms) — requires GPU for production SLAs

No caching of results — identical images processed redundantly
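The missing result cache noted above can be added outside the model. One possible approach, sketched here under the assumption that exact byte-for-byte duplicates are the common case, keys each verdict on a SHA-256 digest of the image bytes. `VerdictCache` and `fake_classify` are hypothetical names introduced for this illustration.

```python
import hashlib

class VerdictCache:
    """In-memory cache of classification results keyed by image content."""

    def __init__(self, classify):
        self._classify = classify  # callable: bytes -> verdict
        self._verdicts = {}
        self.misses = 0            # number of times the classifier was actually called

    def check(self, image_bytes):
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._verdicts:
            self.misses += 1
            self._verdicts[key] = self._classify(image_bytes)
        return self._verdicts[key]

def fake_classify(image_bytes):
    """Hypothetical stand-in for the model; flags payloads that start with b'X'."""
    return "unsafe" if image_bytes.startswith(b"X") else "safe"

cache = VerdictCache(fake_classify)
verdicts = [cache.check(b"Xabc"), cache.check(b"Xabc"), cache.check(b"hello")]
```

A content hash only matches identical files; a re-encoded or resized copy of the same picture produces a different digest and is classified again.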

What makes it unique

Optimized for single-image inference with minimal preprocessing overhead. Can be exported to ONNX or TorchScript for deployment without a Python runtime, including on CPU-only or edge devices. The sub-100ms latency figure applies to modern GPUs.

vs alternatives

Avoids the network round-trip of cloud-based moderation APIs such as AWS Rekognition because inference runs locally, and is more cost-effective for high-volume inference since there are no per-request charges.

transfer learning fine-tuning for domain-specific nsfw detection

Medium confidence

Leverages the pre-trained vision transformer backbone and 384-dimensional embedding space as a feature extractor for custom NSFW classification tasks. Enables fine-tuning on domain-specific datasets (e.g., medical imagery, artwork, anime) by replacing or retraining the classification head while freezing or partially unfreezing the backbone. Uses standard PyTorch training loops with cross-entropy loss and gradient descent optimization.
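The frozen-backbone variant of this idea (a linear probe) can be illustrated without PyTorch: the feature vectors stay fixed and only a small head is trained with cross-entropy loss and gradient descent. The sketch below is framework-free and uses made-up 2-D vectors in place of real embeddings, so it shows the training loop rather than the model's actual behaviour.

```python
import math

def train_linear_probe(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and a bias on fixed feature vectors with full-batch gradient descent.

    The features are never modified, which mirrors a frozen backbone;
    only the head parameters (weights, bias) are updated.
    """
    dim = len(features[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            error = p - y                   # gradient of binary cross-entropy w.r.t. z
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / n
            grad_b += error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (unsafe) when the head's score is positive, else 0 (safe)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0

# Toy 2-D stand-ins for embeddings: label 0 = safe, 1 = unsafe (made-up data).
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.8, 1.1), (0.9, 0.7)]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = train_linear_probe(features, labels)
predictions = [predict(weights, bias, x) for x in features]
```

With real data the same loop would normally be written with a PyTorch optimizer and evaluated on a held-out validation split rather than on the training points.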

Solves for

Adapt the model to detect unsafe content specific to your platform or industry

Reduce false positives on artwork, medical images, or other edge cases

Train on proprietary datasets without sharing data with external APIs

Build multiple specialized classifiers (e.g., violence vs. explicit) from shared embeddings

Best for

ML engineers with labeled domain-specific datasets (100+ examples)

Teams needing custom safety policies beyond generic NSFW/SFW

Organizations with privacy constraints preventing cloud-based APIs

Requires

Python 3.7+

PyTorch 1.9+

Labeled dataset with 100+ images per class (minimum)

Limitations

Requires labeled training data — no zero-shot or few-shot learning without additional techniques

Fine-tuning on small datasets (<500 examples) risks overfitting — requires regularization and validation splits

Backbone weights are frozen by default — full model fine-tuning requires significant compute and data

What makes it unique

Provides a pre-trained 384-dimensional embedding space that captures generic NSFW patterns, enabling efficient transfer learning with smaller labeled datasets. Supports both linear probe (frozen backbone) and full fine-tuning strategies, allowing trade-offs between data efficiency and model capacity.

vs alternatives

More data-efficient than training from scratch due to pre-trained backbone, and more flexible than proprietary APIs which cannot be customized for domain-specific policies or edge cases.

embedding-space similarity search for unsafe content clustering

Medium confidence

Extracts 384-dimensional embeddings for images and enables vector similarity search to find visually similar unsafe content. Embeddings can be indexed in vector databases (Pinecone, Weaviate, Milvus) or used with approximate nearest neighbor (ANN) algorithms (FAISS, Annoy) for fast retrieval. Enables clustering of unsafe content patterns without re-running classification on every image.
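Before reaching for a vector database, the retrieval step can be prototyped as an exact brute-force search. The sketch below ranks stored vectors by cosine similarity to a query; it is a stand-in for what FAISS or Annoy do approximately at scale, and the 3-D vectors and image IDs are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query, index, top_k=2):
    """Return the top_k (image_id, similarity) pairs, most similar first."""
    scored = [(image_id, cosine_similarity(query, vector)) for image_id, vector in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical index: image ID -> embedding (3-D here instead of 384-D).
index = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
    "img_d": [0.0, 0.0, 1.0],
}
flagged = [1.0, 0.05, 0.0]
neighbours = most_similar(flagged, index, top_k=2)
```

Brute force scans every stored vector per query, which is fine for small collections; the ANN libraries named above trade a little accuracy for much faster lookups on large ones.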

Solves for

Find all images similar to a flagged unsafe image in a large dataset

Cluster unsafe content to identify coordinated abuse or spam campaigns

Build a content deduplication system to avoid processing duplicate unsafe images

Analyze patterns in unsafe content to improve moderation policies

Best for

Content moderation teams analyzing abuse patterns

Platform engineers building deduplication systems

Researchers studying content distribution and clustering

Requires

Python 3.7+

Vector database (Pinecone, Weaviate, Milvus) or local ANN library (FAISS, Annoy)

Sufficient storage for embeddings (1.5KB per image × dataset size)
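The 1.5KB-per-image figure follows from 384 values at 4 bytes each, assuming the embeddings are stored as uncompressed float32. A short calculation, with index overhead and metadata deliberately left out, makes the scaling explicit:

```python
def embedding_storage_bytes(num_images, dimensions=384, bytes_per_value=4):
    """Raw storage for float32 embeddings, excluding any index or metadata overhead."""
    return num_images * dimensions * bytes_per_value

per_image = embedding_storage_bytes(1)            # bytes for one embedding
one_million = embedding_storage_bytes(1_000_000)  # bytes for one million embeddings
one_million_gb = one_million / 1_000_000_000
```

One million images therefore need about 1.5 GB for the raw vectors alone.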

Limitations

Similarity search is approximate — may miss visually similar images if embedding space is not well-calibrated

Requires pre-computing and storing embeddings for all images — significant storage overhead (384 floats × num_images ≈ 1.5KB per image)

Vector database setup and maintenance adds operational complexity

What makes it unique

Leverages the 384-dimensional embedding space to enable efficient similarity search without re-running classification. Supports both local ANN algorithms (FAISS) and managed vector databases, enabling scalability from small datasets to billions of images.

vs alternatives

More efficient than image hashing (perceptual hashing) for semantic similarity, and more scalable than pairwise image comparison for large datasets. Enables downstream clustering and pattern analysis that simple classification cannot provide.

Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.

Related Artifacts sharing capabilities

Artifacts that share capabilities with nsfw-image-detection-384, ranked by overlap. Discovered automatically through the match graph.

nsfw_image_detector

image-classification model. 943,400 downloads.

nsfw content classification via vision transformer

vision transformer-based feature extraction for nsfw embeddings

batch image inference with safetensors format

3 shared capabilities

nsfw_image_detection

image-classification model. 34,024,086 downloads.

binary-nsfw-image-classification

vision-transformer-feature-extraction

2 shared capabilities

vit-base-nsfw-detector

image-classification model. 1,133,319 downloads.

vision transformer-based nsfw image classification

cross-platform model inference with transformers.js browser support

2 shared capabilities

rorshark-vit-base

image-classification model. 620,550 downloads.

vision transformer-based image classification with imagenet-21k pretraining

attention-based feature extraction for downstream tasks

2 shared capabilities

Meta: Llama Guard 4 12B

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

image safety classification with visual understanding

1 shared capability

Qwen: Qwen3 VL 235B A22B Thinking

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STEM and math....

visual content moderation and safety classification

1 shared capability

Best For

✓ Content moderation teams building automated safety systems
✓ Platform engineers implementing real-time image filtering
✓ Developers building community-driven applications with UGC
✓ Teams needing open-source alternatives to proprietary moderation APIs
✓ Data engineers building batch moderation pipelines
✓ ML teams analyzing content safety patterns at scale
✓ Developers integrating safety checks into ETL workflows
✓ Researchers studying NSFW content distribution and clustering

Known Limitations

⚠ Binary or limited-class classification only; does not distinguish between types of unsafe content (violence, explicit, etc.)
⚠ 384-dimensional embedding space may not capture nuanced edge cases or cultural context variations
⚠ Inference latency depends on hardware; GPU acceleration recommended for production throughput
⚠ Model trained on specific NSFW/SFW datasets; performance may degrade on out-of-distribution image styles (e.g., artwork, anime, medical imagery)
⚠ No built-in confidence thresholding or uncertainty quantification; requires external calibration for production deployment
⚠ Batch processing requires loading all images into memory; memory constraints limit batch size on consumer hardware
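Since thresholding is left to the caller, one common pattern is to map the unsafe-class probability onto three actions, with a middle band sent to human review. The sketch below shows that pattern; the function name `route` and both cut-off values are illustrative assumptions, not calibrated settings for this model.

```python
def route(unsafe_probability, block_at=0.90, review_at=0.50):
    """Map an unsafe-class probability to a moderation action.

    The two cut-offs are illustrative defaults, not calibrated values.
    """
    if not 0.0 <= unsafe_probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    if review_at > block_at:
        raise ValueError("review_at must not exceed block_at")
    if unsafe_probability >= block_at:
        return "block"
    if unsafe_probability >= review_at:
        return "review"
    return "allow"

decisions = [route(p) for p in (0.97, 0.62, 0.08)]
```

Suitable cut-offs depend on the platform's tolerance for false positives versus false negatives and would normally be chosen from a labeled validation set.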

Requirements

Python 3.7+
PyTorch 1.9+ or compatible deep learning framework
timm library (PyTorch Image Models) for vision transformer backbone
transformers library for model loading and inference
Hugging Face Hub access (for model download via huggingface_hub)
Sufficient VRAM for 384-dim embedding inference (~2-4GB for batch processing)
PyTorch with CUDA support (for GPU acceleration) or CPU fallback
Sufficient RAM for batch size × image resolution (e.g., 32 images × 384×384 ≈ 2GB)

Input / Output

Accepts: JPEG images, PNG images, WebP images, Tensor arrays (pre-loaded in memory), Batch of JPEG/PNG/WebP images, Image file paths (loaded via PIL or OpenCV), Pre-loaded image tensors (shape: [batch_size, 3, 384, 384]), Single JPEG/PNG/WebP image, Image URL (fetched and decoded on-the-fly), Base64-encoded image string, Pre-loaded PIL Image or torch.Tensor, Labeled image dataset (directory structure or CSV manifest), Pre-computed embeddings (for linear probe fine-tuning), Augmented images (via torchvision.transforms or albumentations), Query image (converted to 384-dim embedding), Pre-computed embedding vectors (batch or single), Embedding indices in vector database

Produces: Binary classification label (safe/unsafe), Classification probabilities (softmax scores per class), 384-dimensional embedding vector (for downstream similarity search or clustering), Classification labels per image (batch_size,), Classification probabilities (batch_size, num_classes), Embedding matrix (batch_size, 384), Classification probability (confidence score 0–1), Optional: 384-dimensional embedding for downstream analysis, Fine-tuned model checkpoint (PyTorch .pt or safetensors format), Training metrics (loss, accuracy, F1 score), Validation predictions and confusion matrices, List of similar image IDs with similarity scores, Clustering assignments (via k-means or DBSCAN on embeddings), Similarity matrix (pairwise distances between embeddings)

UnfragileRank

Adoption: 82% (40% weight)

Quality: 21% (20% weight)

Ecosystem: 45% (15% weight)

Match Graph: 10% (20% weight)

Freshness: 75% (5% weight)

UnfragileRank is computed from adoption signals, documentation quality, ecosystem connectivity, match graph feedback, and freshness. No artifact can pay for a higher rank.
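The page does not state the exact formula, but if the rank is assumed to be a plain weighted sum of the five component scores, the listed figures can be checked directly. Under that assumption the total comes to 49.5, which is consistent with the 49 / 100 shown at the top of the page if the displayed value is truncated or the component percentages are themselves rounded.

```python
# (score in percent, weight in percent) as displayed on the page
COMPONENTS = {
    "Adoption": (82, 40),
    "Quality": (21, 20),
    "Ecosystem": (45, 15),
    "Match Graph": (10, 20),
    "Freshness": (75, 5),
}

def weighted_rank(components):
    """Weighted sum of component scores; assumes the weights are percentages totalling 100."""
    total_weight = sum(weight for _, weight in components.values())
    if total_weight != 100:
        raise ValueError("weights should sum to 100%")
    return sum(score * weight for score, weight in components.values()) / 100

rank = weighted_rank(COMPONENTS)
```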

Type: Model

5 capabilities

Model Details

Provider: huggingface

Architecture: timm

Downloads: 6,560,925

Tasks: image-classification

About

Marqo/nsfw-image-detection-384: an image-classification model on Hugging Face with 6,560,925 downloads

Categories

image-generation, timm, safetensors, image-classification, license:apache-2.0, region:us

Alternatives to nsfw-image-detection-384

Dreambooth-Stable-Diffusion (score 45, Repository)

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

sdnext (score 51, Repository)

SD.Next: All-in-one WebUI for AI generative image and video creation, captioning and processing

fast-stable-diffusion (score 48, Repository)

fast-stable-diffusion + DreamBooth

ai-notes (score 37, Prompt)

notes for software engineers getting up to speed on new AI developments. Serves as datastore for https://latent.space writing, and product brainstorming, but has cleaned up canonical references under the /Resources folder.

Data Sources

huggingface
