{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"hn-47234566","slug":"demucs-music-stem-separator-rewritten-in-rust-runs","name":"Demucs music stem separator rewritten in Rust – runs in the browser","type":"repo","url":"https://github.com/nikhilunni/demucs-rs","page_url":"https://unfragile.ai/demucs-music-stem-separator-rewritten-in-rust-runs","categories":["automation"],"tags":["hackernews","show-hn"],"pricing":{"model":"open_source","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"hn-47234566__cap_0","uri":"capability://data.processing.analysis.browser.native.audio.stem.separation.with.onnx.inference","name":"browser-native audio stem separation with onnx inference","description":"Executes the Demucs neural network model (vocals, drums, bass, other) directly in the browser using ONNX Runtime WebAssembly, eliminating server-side processing. The Rust codebase compiles to WebAssembly via wasm-bindgen, exposing a JavaScript API that loads pre-trained model weights and runs inference on client-side audio buffers without network latency or privacy concerns.","intents":["separate music into individual stems (vocals, drums, bass, other) without uploading to a server","build a web app that processes audio locally without backend infrastructure","integrate stem separation into a browser-based music production workflow","reduce latency and privacy risks by keeping audio processing on-device"],"best_for":["web developers building music production tools","teams building privacy-first audio applications","indie music producers wanting client-side stem separation","developers prototyping audio ML features without server costs"],"limitations":["ONNX Runtime WebAssembly has higher memory overhead than native inference; typical 3-5 minute songs require 2-4GB RAM during processing","inference speed depends on client CPU/GPU; no GPU acceleration in most browsers (WebGPU still experimental)","model weights must be downloaded to client (typically 200-500MB for full Demucs model); no streaming model loading","browser tab may become unresponsive during long inference; no built-in progress reporting or cancellation","limited to browsers with WebAssembly support (IE11 not supported)"],"requires":["modern browser with WebAssembly support (Chrome 57+, Firefox 52+, Safari 11+)","sufficient client RAM (minimum 2GB for typical songs)","ONNX model files pre-converted and hosted (Demucs v3 or v4 models)","JavaScript runtime to load wasm module and call inference API"],"input_types":["audio/wav","audio/mp3","audio/webm","raw PCM audio buffers (Float32Array)"],"output_types":["audio/wav (separate stem files for vocals, drums, bass, other)","raw PCM audio buffers (Float32Array per stem)"],"categories":["data-processing-analysis","audio-processing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hn-47234566__cap_1","uri":"capability://data.processing.analysis.real.time.audio.buffer.streaming.and.windowing","name":"real-time audio buffer streaming and windowing","description":"Handles chunked audio input processing by managing sliding windows of audio frames, buffering partial chunks, and coordinating inference timing to avoid gaps or overlaps in stem output. The Rust implementation uses ring buffers or deque structures to queue incoming audio data and emit inference-ready chunks at the model's required sample rate and frame size, with overlap-add reconstruction for seamless stem reconstruction.","intents":["process audio in real-time chunks rather than requiring the entire file upfront","stream audio from a file or microphone input into the stem separator","avoid memory spikes by processing audio in fixed-size windows","reconstruct seamless output stems from overlapping inference windows"],"best_for":["developers building real-time audio processing pipelines","applications that need to process long audio files without loading entirely into memory","live music production tools that process microphone input"],"limitations":["overlap-add reconstruction introduces latency proportional to window size; typical 2-5 second delay before first stem output","requires careful synchronization between input buffering and inference scheduling; race conditions can cause audio artifacts","no built-in handling for variable sample rates; input must be resampled to model's expected rate (typically 44.1kHz or 48kHz) before buffering","window size is fixed by model architecture; cannot dynamically adjust for low-latency use cases"],"requires":["audio source providing samples at consistent sample rate","buffer size matching model's expected input dimensions (typically 16384 or 32768 samples)","resampler library if input sample rate differs from model's expected rate"],"input_types":["streaming audio buffers (Float32Array chunks)","audio file paths or Blob objects"],"output_types":["streaming stem buffers (Float32Array chunks per stem)","complete stem audio files after processing finishes"],"categories":["data-processing-analysis","audio-processing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hn-47234566__cap_2","uri":"capability://memory.knowledge.onnx.model.weight.loading.and.caching","name":"onnx model weight loading and caching","description":"Loads pre-trained Demucs model weights from ONNX format files and caches them in browser memory or IndexedDB to avoid re-downloading on subsequent uses. The implementation handles model initialization, weight tensor mapping to the inference graph, and optional persistent storage using browser APIs, with fallback to re-download if cache is unavailable.","intents":["avoid re-downloading large model files (200-500MB) on every page load","persist model weights across browser sessions using IndexedDB","initialize the inference engine with pre-trained weights on first load","support multiple model variants (Demucs v3, v4, different architectures) with separate caches"],"best_for":["web apps where users return multiple times and want fast startup","applications with bandwidth constraints or slow network connections","teams building music production tools with offline-first requirements"],"limitations":["IndexedDB storage quota varies by browser (typically 50MB-1GB); large models may exceed quota on mobile devices","no built-in versioning; updating model weights requires cache invalidation logic in application code","model loading blocks inference; no lazy loading of model layers","ONNX model files must be pre-converted and hosted; no on-the-fly conversion from PyTorch checkpoints"],"requires":["ONNX model files hosted on CDN or same-origin server","browser support for IndexedDB (all modern browsers)","sufficient disk quota in browser storage (varies by device)"],"input_types":["ONNX model files (.onnx format)","model metadata (input/output shapes, expected sample rate)"],"output_types":["loaded model state in WASM memory","cached model weights in IndexedDB"],"categories":["memory-knowledge","data-processing-analysis"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hn-47234566__cap_3","uri":"capability://data.processing.analysis.multi.stem.parallel.inference.orchestration","name":"multi-stem parallel inference orchestration","description":"Coordinates inference across multiple output stems (vocals, drums, bass, other) by running the Demucs model once per stem or using a multi-output model variant that produces all stems in a single forward pass. The Rust implementation manages tensor allocation, inference scheduling, and output collection to ensure all stems are computed and synchronized before returning results to the caller.","intents":["extract all four stems (vocals, drums, bass, other) from a single audio input","run inference efficiently by batching stem computation or using multi-output models","ensure all stems are synchronized and have matching sample counts","return stems in a structured format (e.g., object with vocal, drum, bass, other keys)"],"best_for":["music production applications that need all stems simultaneously","batch processing pipelines that separate multiple songs","applications where stem synchronization is critical (e.g., mixing tools)"],"limitations":["inference time scales linearly with number of stems if using single-output model; no parallelization across stems in browser","memory usage multiplies with number of stems; storing all four stems in memory simultaneously requires 4x the audio buffer size","no built-in stem selection; must compute all stems even if only one is needed","output stem quality depends on model training; Demucs may produce artifacts in stems with overlapping frequency content (e.g., vocals and drums)"],"requires":["ONNX model supporting multi-stem output or separate models per stem","sufficient RAM to hold input audio + all output stems simultaneously","inference engine capable of running model multiple times or supporting batched inference"],"input_types":["single audio buffer (Float32Array) containing mixed audio"],"output_types":["structured object with separate Float32Array buffers for each stem (vocals, drums, bass, other)","or four separate audio files in WAV/MP3 format"],"categories":["data-processing-analysis","audio-processing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hn-47234566__cap_4","uri":"capability://data.processing.analysis.audio.format.conversion.and.resampling","name":"audio format conversion and resampling","description":"Converts input audio from various formats (MP3, WAV, WebM, OGG) to raw PCM buffers at the model's expected sample rate, handling codec decoding and sample rate conversion transparently. The implementation uses browser Web Audio API for decoding and Rust-based resampling (e.g., sinc interpolation or linear interpolation) to match the model's input requirements without requiring external libraries.","intents":["accept audio files in multiple formats without requiring users to pre-convert","automatically resample audio to match the model's expected sample rate (e.g., 44.1kHz)","handle mono and stereo audio, converting to the model's expected channel count","normalize audio levels to prevent clipping or silent output"],"best_for":["web apps that accept user-uploaded audio files in arbitrary formats","music production tools that need to handle diverse audio sources","applications where audio preprocessing should be transparent to users"],"limitations":["resampling quality depends on algorithm; linear interpolation introduces aliasing artifacts; sinc interpolation is slower but higher quality","Web Audio API decoding is asynchronous and blocks the main thread; large files may cause UI freezing","no support for lossless formats like FLAC in all browsers; MP3 decoding may introduce artifacts","mono-to-stereo or stereo-to-mono conversion is lossy; stereo downmix may lose spatial information"],"requires":["Web Audio API support in browser (all modern browsers)","audio codec support for input format (browser-dependent; MP3 and WAV widely supported)","resampling library or algorithm (can be implemented in Rust or delegated to Web Audio API)"],"input_types":["audio/mpeg (MP3)","audio/wav (WAV)","audio/webm","audio/ogg","audio/flac (limited browser support)","Blob or File objects from file input"],"output_types":["raw PCM audio buffers (Float32Array) at model's expected sample rate","mono or stereo depending on model requirements"],"categories":["data-processing-analysis","audio-processing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hn-47234566__cap_5","uri":"capability://data.processing.analysis.stem.output.export.to.audio.files","name":"stem output export to audio files","description":"Encodes separated stems from raw PCM buffers into downloadable audio files (WAV, MP3, or other formats) with metadata (sample rate, bit depth, channel count). The implementation uses browser APIs or Rust-based encoders to convert Float32Array buffers to file formats, handling byte ordering, header generation, and optional compression.","intents":["export separated stems as downloadable audio files for use in DAWs or other tools","save stems in multiple formats (WAV for lossless, MP3 for smaller file size)","preserve audio quality and metadata during export","batch export multiple stems with consistent naming and formatting"],"best_for":["music production applications where users need to download stems","batch processing pipelines that generate stem files","applications integrating with DAWs or other audio software"],"limitations":["WAV encoding is lossless but produces large files (typically 50-100MB per stem for 3-5 minute songs)","MP3 encoding requires external library or browser API; quality depends on bitrate (typically 128-320kbps)","no built-in metadata tagging (ID3 for MP3, RIFF INFO for WAV); requires additional library for full metadata support","file download is limited by browser security; cannot directly write to user's filesystem without download dialog"],"requires":["browser File API and Blob support for file generation","audio encoder library (can be Rust-based or JavaScript-based)","sufficient disk space for generated files"],"input_types":["raw PCM audio buffers (Float32Array) for each stem","metadata (sample rate, bit depth, channel count)"],"output_types":["audio/wav files (WAV format with PCM encoding)","audio/mpeg files (MP3 format with lossy compression)","Blob objects ready for download or further processing"],"categories":["data-processing-analysis","audio-processing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hn-47234566__cap_6","uri":"capability://automation.workflow.progress.reporting.and.cancellation.for.long.running.inference","name":"progress reporting and cancellation for long-running inference","description":"Exposes callbacks or event emitters that report inference progress (e.g., percentage complete, current stem being processed) and allow users to cancel ongoing inference. The implementation divides inference into checkpoints, emits progress events after each checkpoint, and checks for cancellation signals before proceeding to the next step.","intents":["show users progress during long inference operations (3-10 minutes for typical songs)","allow users to cancel inference if they change their mind or want to process a different file","provide feedback that the application is still responsive during inference","estimate time remaining based on progress rate"],"best_for":["web applications with long-running inference that need user feedback","music production tools where users may want to cancel and retry","applications with strict UX requirements for responsiveness"],"limitations":["progress granularity depends on inference checkpoints; may report progress in large jumps rather than smoothly","cancellation is not instantaneous; must wait for current inference step to complete before stopping","no built-in time estimation; requires tracking historical inference times to estimate remaining time","progress callbacks add overhead; frequent callbacks may slow inference slightly"],"requires":["inference engine that supports checkpoint-based execution or can be interrupted between steps","callback or event emitter mechanism to communicate progress to JavaScript caller","cancellation token or flag that inference loop checks regularly"],"input_types":["callback function or event listener for progress updates","cancellation token or abort signal"],"output_types":["progress events with percentage complete and current step","cancellation confirmation or error if cancelled"],"categories":["automation-workflow","user-experience"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hn-47234566__cap_7","uri":"capability://safety.moderation.error.handling.and.graceful.degradation.for.inference.failures","name":"error handling and graceful degradation for inference failures","description":"Catches inference errors (e.g., out-of-memory, invalid model, corrupted audio) and returns meaningful error messages to the caller, with optional fallback strategies (e.g., reduce audio quality, use smaller model variant). The implementation includes validation of input audio, model state checks, and error propagation through the JavaScript API.","intents":["provide clear error messages when inference fails (e.g., 'Out of memory, try a shorter audio file')","gracefully handle edge cases like very long audio files or unusual sample rates","suggest recovery strategies to users (e.g., reduce audio quality or use a smaller model)","prevent silent failures or cryptic WASM errors from reaching users"],"best_for":["production web applications where error handling is critical","applications with diverse user inputs and device capabilities","teams that need to support users with limited device resources"],"limitations":["error messages are only as good as the validation logic; some errors may not be caught until inference time","fallback strategies (e.g., smaller models) must be pre-implemented; no automatic model selection","out-of-memory errors may crash the browser tab; no recovery from OOM without reloading","error context is limited by WASM debugging capabilities; stack traces may be unhelpful"],"requires":["input validation logic for audio buffers and model state","error types and messages defined in Rust and exposed to JavaScript","optional fallback models or strategies pre-loaded"],"input_types":["audio buffers and model state","error handling configuration (e.g., fallback strategy)"],"output_types":["error objects with message, code, and optional recovery suggestions","fallback results if recovery strategy is applied"],"categories":["safety-moderation","automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":33,"verified":false,"data_access_risk":"high","permissions":["modern browser with WebAssembly support (Chrome 57+, Firefox 52+, Safari 11+)","sufficient client RAM (minimum 2GB for typical songs)","ONNX model files pre-converted and hosted (Demucs v3 or v4 models)","JavaScript runtime to load wasm module and call inference API","audio source providing samples at consistent sample rate","buffer size matching model's expected input dimensions (typically 16384 or 32768 samples)","resampler library if input sample rate differs from model's expected rate","ONNX model files hosted on CDN or same-origin server","browser support for IndexedDB (all modern browsers)","sufficient disk quota in browser storage (varies by device)"],"failure_modes":["ONNX Runtime WebAssembly has higher memory overhead than native inference; typical 3-5 minute songs require 2-4GB RAM during processing","inference speed depends on client CPU/GPU; no GPU acceleration in most browsers (WebGPU still experimental)","model weights must be downloaded to client (typically 200-500MB for full Demucs model); no streaming model loading","browser tab may become unresponsive during long inference; no built-in progress reporting or cancellation","limited to browsers with WebAssembly support (IE11 not supported)","overlap-add reconstruction introduces latency proportional to window size; typical 2-5 second delay before first stem output","requires careful synchronization between input buffering and inference scheduling; race conditions can cause audio artifacts","no built-in handling for variable sample rates; input must be resampled to model's expected rate (typically 44.1kHz or 48kHz) before buffering","window size is fixed by model architecture; cannot dynamically adjust for low-latency use cases","IndexedDB storage quota varies by browser (typically 50MB-1GB); large models may exceed quota on mobile devices","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.36,"quality":0.26,"ecosystem":0.46,"match_graph":0.25,"freshness":0.6,"weights":{"adoption":0.3,"quality":0.2,"ecosystem":0.15,"match_graph":0.3,"freshness":0.05}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-06-17T09:51:04.692Z","last_scraped_at":"2026-05-04T08:10:08.734Z","last_commit":null},"community":{"stars":null,"forks":null,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=demucs-music-stem-separator-rewritten-in-rust-runs","compare_url":"https://unfragile.ai/compare?artifact=demucs-music-stem-separator-rewritten-in-rust-runs"}},"signature":"AUR876GMLvpASy16HSjoGLPMebMZQIQrunXRbQ3G7B6Gw8XMq7z44uoiSCRphyhVm+pcH0B7/2NnQpxeF4CMAw==","signedAt":"2026-06-21T07:59:23.631Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/demucs-music-stem-separator-rewritten-in-rust-runs","artifact":"https://unfragile.ai/demucs-music-stem-separator-rewritten-in-rust-runs","verify":"https://unfragile.ai/api/v1/verify?slug=demucs-music-stem-separator-rewritten-in-rust-runs","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}