Capability
Audio Feature Extraction With Configurable Representations
11 artifacts provide this capability.
Top Matches
via “pretrained feature extraction for downstream speech tasks”
automatic-speech-recognition model. 2,346,228 downloads.
Unique: Exposes the encoder representations learned during multi-domain VAD (voice activity detection) training as reusable features for downstream tasks. The features are optimized for speech detection but, through domain-invariant learning, transfer well to related speech understanding tasks.
vs others: Eliminates the need to train feature extractors from scratch; multi-domain pretraining generalizes better than task-specific feature extraction.
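Example: the reuse pattern described above can be sketched roughly as follows. This is a minimal illustration, not the listed model's API: `FrozenEncoder` is a hypothetical stand-in whose fixed random weights play the role of pretrained parameters, and the 160-sample frame length (10 ms at 16 kHz), the feature size of 8, and the `representation` / `pooling` option names are all assumptions chosen for the sketch. It shows how one frozen encoder might expose either frame-level or pooled utterance-level features to a downstream task.

```python
import math
import random
from typing import List, Sequence


class FrozenEncoder:
    """Hypothetical stand-in for a pretrained speech encoder (not a real model API)."""

    def __init__(self, frame_len: int = 160, dim: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.frame_len = frame_len
        self.dim = dim
        # Fixed weights stand in for pretrained parameters; they are never updated.
        scale = 1.0 / math.sqrt(frame_len)
        self.weights = [
            [rng.gauss(0.0, scale) for _ in range(frame_len)] for _ in range(dim)
        ]

    def encode(self, waveform: Sequence[float]) -> List[List[float]]:
        """Return one `dim`-sized feature vector per non-overlapping frame."""
        last_start = len(waveform) - self.frame_len
        frames = [
            waveform[i:i + self.frame_len]
            for i in range(0, last_start + 1, self.frame_len)
        ]
        return [
            [math.tanh(sum(w * x for w, x in zip(row, frame))) for row in self.weights]
            for frame in frames
        ]


def pool(frames: List[List[float]], mode: str = "mean") -> List[float]:
    """Collapse frame-level features into a single utterance-level vector."""
    if not frames:
        raise ValueError("waveform is shorter than one frame")
    columns = list(zip(*frames))
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


def extract_features(encoder: FrozenEncoder, waveform: Sequence[float],
                     representation: str = "pooled", pooling: str = "mean"):
    """Select the representation handed to the downstream task."""
    frames = encoder.encode(waveform)
    if representation == "frames":
        return frames
    if representation == "pooled":
        return pool(frames, pooling)
    raise ValueError(f"unknown representation: {representation}")


# Usage: 0.1 s of a 440 Hz tone sampled at 16 kHz (synthetic input).
encoder = FrozenEncoder()
waveform = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
frame_feats = extract_features(encoder, waveform, representation="frames")
utt_feat = extract_features(encoder, waveform, representation="pooled")
```

Frame-level output suits sequence tasks such as segmentation, while the pooled vector is a convenient input for a small utterance-level classifier; in both cases only the downstream model would be trained.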