browser-based live speech-to-text dictation
Captures real-time audio input from the user's microphone via the Web Audio API, streams it to a cloud-based transcription backend (engine provider unknown), and renders transcribed text into an in-browser notepad editor with minimal latency. The system handles automatic capitalization and supports voice commands for punctuation insertion, enabling hands-free note composition without installation or authentication.
Unique: Eliminates installation friction by running entirely in-browser with no registration required; users can begin dictating as soon as the landing page loads. Combines the Web Audio API for client-side capture with a cloud transcription backend, avoiding the complexity of local speech models while keeping the tool instantly accessible.
vs alternatives: Faster time-to-first-value than Dragon NaturallySpeaking or Otter.ai (no download/signup), but trades accuracy and formatting intelligence for simplicity and zero-friction access.
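The automatic capitalization mentioned above can be illustrated with a small post-processing pass over transcribed text. This is a minimal sketch under stated assumptions, not the service's actual logic (which is not documented): the function name `auto_capitalize` is hypothetical, and the rule used here is simply "uppercase the first letter of the text and of each sentence".

```python
import re


def auto_capitalize(text: str) -> str:
    """Uppercase the first letter of the text and of every sentence.

    Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
    """
    return re.sub(
        r"(^\s*|[.!?]\s+)([a-z])",
        # Keep the separator untouched and uppercase only the letter after it.
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


print(auto_capitalize("hello world. this is a test! ok?  yes"))
```

A rule this simple will also capitalize after abbreviations such as "e.g. ", which is one reason a production system would likely rely on the transcription engine's own casing instead.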
audio and video file transcription with optional speaker diarization
Accepts uploaded audio files (MP3, WAV, etc.) and video files (MP4, etc.) via a web form, sends them to a cloud transcription service for processing, and returns timestamped transcriptions with optional automatic speaker diarization (tagging who spoke when). The system generates plain-text output with timing markers, enabling users to correlate spoken content with specific moments in the recording. The pricing model for file transcription is not documented; the feature appears to sit behind a paywall separate from the free dictation notepad.
Unique: Integrates file transcription with live dictation in a single web interface, allowing users to mix real-time voice notes with post-hoc file transcription without switching tools. Offers optional speaker diarization as a built-in feature rather than a separate paid add-on, though implementation details are opaque.
vs alternatives: More accessible than Otter.ai for casual users (no subscription required for dictation), but lacks Otter's advanced features (speaker identification, keyword search, integration with calendar/email) and likely has lower accuracy on complex audio.
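To make "plain-text output with timing markers" and optional speaker tags concrete, here is a minimal sketch of how such a transcript could be rendered. The `Segment` type, the `[HH:MM:SS]` marker format, and the "Speaker 1" labels are all assumptions for illustration; the service's real output format is not documented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """One transcribed span (hypothetical structure)."""
    start_seconds: float
    text: str
    speaker: Optional[str] = None  # None when diarization is turned off


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS, truncating fractions."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(segments: list[Segment]) -> str:
    """Produce one line per segment, with a speaker tag only if present."""
    lines = []
    for seg in segments:
        prefix = f"[{format_timestamp(seg.start_seconds)}]"
        if seg.speaker is not None:
            prefix += f" {seg.speaker}:"
        lines.append(f"{prefix} {seg.text}")
    return "\n".join(lines)


print(render_transcript([
    Segment(0.0, "Hello.", "Speaker 1"),
    Segment(65.2, "Hi there.", "Speaker 2"),
]))
```

Keeping the speaker field optional mirrors the description above, where diarization is an opt-in addition to the timestamped text.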
voice command syntax for punctuation and formatting
Interprets voice commands (e.g., 'period', 'comma', 'new line', 'capitalize next word') spoken during dictation and converts them into the corresponding punctuation marks or formatting actions in the transcribed text. The system maintains a command vocabulary and applies formatting rules either in real time or in post-processing. The specific command syntax, the full list of supported commands, and whether commands are language-specific are not documented.
Unique: Enables hands-free punctuation and formatting during dictation by interpreting voice commands, reducing the need for manual post-editing. Treats punctuation as a first-class concern in the dictation workflow rather than a post-processing step.
vs alternatives: More integrated into the dictation experience than manual editing, but less sophisticated than Dragon NaturallySpeaking's command system (which includes system-wide voice control) or Otter.ai's intelligent punctuation (which adds punctuation automatically without explicit commands).
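The command interpretation described above can be sketched as a pass over the recognized words that swaps known command phrases for symbols. The vocabulary below contains only the example commands quoted in the description plus 'question mark'; the actual command list and matching rules are not documented, so `COMMANDS`, `CAPITALIZE_NEXT`, and `apply_voice_commands` are hypothetical names.

```python
# Assumed command vocabulary: phrase (as a tuple of words) -> text to insert.
COMMANDS = {
    ("period",): ".",
    ("comma",): ",",
    ("question", "mark"): "?",
    ("new", "line"): "\n",
}
CAPITALIZE_NEXT = ("capitalize", "next", "word")


def apply_voice_commands(raw: str) -> str:
    """Replace spoken command phrases in a raw transcript with formatting."""
    words = raw.split()
    out = ""
    capitalize_next = False
    i = 0
    while i < len(words):
        if tuple(w.lower() for w in words[i:i + 3]) == CAPITALIZE_NEXT:
            capitalize_next = True
            i += 3
            continue
        # Try the longest command phrase first (two words, then one).
        key = next(
            (k for n in (2, 1)
             if (k := tuple(w.lower() for w in words[i:i + n])) in COMMANDS),
            None,
        )
        if key is not None:
            # Attach the symbol directly to the preceding word.
            out = out.rstrip(" ") + COMMANDS[key]
            i += len(key)
            continue
        word = words[i]
        if capitalize_next:
            word = word[:1].upper() + word[1:]
            capitalize_next = False
        if out and not out.endswith("\n"):
            out += " "
        out += word
        i += 1
    return out


print(apply_voice_commands(
    "hello comma world period new line capitalize next word thanks"
))
```

One limitation of this approach is that a speaker can no longer dictate the literal word "period" or "comma"; a real system would need an escape mechanism or context-aware matching.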
ios accessibility app (texthear) for hearing-impaired users
A separate iOS application (TextHear) designed specifically for hearing-impaired users, converting other people's speech into real-time text on the user's iPhone. The app captures audio from the environment or a conversation partner's microphone, transcribes it in real time, and displays the text on screen, enabling deaf or hard-of-hearing users to participate in conversations. Pricing and feature parity with the main Speechnotes app are not documented.
Unique: Purpose-built for accessibility use cases (hearing-impaired users) rather than general dictation, with a dedicated app and UI optimized for real-time conversation transcription. Demonstrates Speechnotes' commitment to accessibility beyond the core dictation use case.
vs alternatives: Specialized for accessibility use cases, but likely less feature-rich than general-purpose transcription apps and with unclear real-time performance compared to specialized accessibility solutions.
human transcription service partnership with discount coupon
Offers a partnership with a human transcription service providing professional transcription at $0.80/minute, with a 10% discount coupon available to Speechnotes users. The system enables users to request human transcription for content where AI accuracy is insufficient, with results delivered through the Speechnotes interface or directly from the partner. Turnaround time, quality guarantees, and integration with the AI transcription workflow are not documented.
Unique: Bridges AI and human transcription in a single platform, allowing users to start with fast AI transcription and escalate to human transcription for accuracy-critical content. Provides a fallback path for users whose audio is poorly handled by AI, reducing the need to switch to specialized services.
vs alternatives: More convenient than separately contracting human transcription services, but more expensive than pure AI transcription and with unclear integration into the main workflow.
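As a worked example of the pricing above, the sketch below computes the cost of human transcription at $0.80/minute with the 10% coupon. It assumes billing on the exact number of minutes and a coupon that applies to the whole order; neither rule is documented, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_MINUTE = Decimal("0.80")  # from the listing above
COUPON_DISCOUNT = Decimal("0.10")  # 10% coupon from the listing above


def human_transcription_cost(minutes: Decimal, use_coupon: bool = True) -> Decimal:
    """Return the cost in dollars, rounded to the nearest cent."""
    cost = minutes * RATE_PER_MINUTE
    if use_coupon:
        cost *= Decimal("1") - COUPON_DISCOUNT
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A 60-minute recording: 60 * 0.80 = 48.00, less 10% = 43.20
print(human_transcription_cost(Decimal("60")))
```

`Decimal` is used instead of floats so that currency amounts round predictably.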
youtube and web-based audio link transcription
Accepts URLs pointing to YouTube videos, podcasts, or other web-hosted audio content, extracts the audio stream server-side, and returns a transcription. The system handles URL parsing and audio extraction without requiring the user to download files locally, enabling quick transcription of public web content. Implementation details (whether using YouTube API, direct stream capture, or third-party extraction service) are not documented.
Unique: Eliminates the download step for web-hosted content by accepting URLs directly and handling extraction server-side, reducing friction compared to tools requiring local file downloads. Integrates seamlessly with the same notepad interface as live dictation and file uploads.
vs alternatives: More convenient than Otter.ai for one-off YouTube transcription (no account creation), but lacks Otter's native YouTube integration with automatic transcript syncing and speaker identification.
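The URL-parsing step can be illustrated with a helper that recognizes the two common YouTube link shapes and extracts the video ID, returning `None` for anything else (for example, a direct podcast audio link that would be handled by a different path). This is only a sketch using Python's standard `urllib.parse`; how the service actually parses URLs or extracts audio is not documented, and `youtube_video_id` is a hypothetical name.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID for youtube.com/watch and youtu.be links, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None

    return None


print(youtube_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&t=10s"))
```

Validating the host before reading the ID avoids treating look-alike domains as YouTube links.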
ai-powered transcription summarization
Automatically generates concise summaries of transcribed content (from live dictation, file uploads, or URL extraction) using an unspecified AI model. The system analyzes the full transcription and produces a condensed version highlighting key points, enabling users to quickly grasp the essence of longer recordings without reading the entire transcript. Implementation approach (extractive vs. abstractive summarization, model architecture) is not documented.
Unique: Integrates summarization as a post-processing step on transcriptions rather than as a separate tool, allowing users to request summaries on-demand after transcription completes. Treats summarization as a value-add feature alongside transcription rather than a standalone service.
vs alternatives: More convenient than manually copying transcripts into ChatGPT or Claude for summarization, but likely less customizable and with no visibility into model quality or hallucination risk.
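Since the summarization approach is not documented, the following is purely an illustration of the extractive option mentioned above, not a description of what the service does. It scores each sentence by the average frequency of its non-stopword terms and keeps the top-scoring sentences in their original order. The stopword list and the `summarize` function are assumptions made for this sketch.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not exhaustive).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "this", "for", "on", "with", "as", "was", "we"}


def _terms(sentence: str) -> list[str]:
    """Lowercased words of a sentence with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text: str, max_sentences: int = 2) -> str:
    """Frequency-based extractive summary: keep the highest-scoring sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(t for s in sentences for t in _terms(s))

    def score(sentence: str) -> float:
        terms = _terms(sentence)
        return sum(freq[t] for t in terms) / len(terms) if terms else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


transcript = ("The budget meeting covered the budget. Lunch was pizza. "
              "The budget review and budget approval are due Friday.")
print(summarize(transcript, max_sentences=2))
```

An extractive method like this can only reuse sentences from the transcript, so it cannot hallucinate new claims; an abstractive model could produce smoother summaries at the cost of that guarantee.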
multi-language transcription and translation
Transcribes audio in non-English languages and optionally translates the resulting text into English or other target languages. The system claims to support 'all languages' but specific language coverage is not documented. Translation approach (whether using a separate translation model or integrated speech-to-text-to-translation pipeline) is not specified. Output includes both original-language transcription and translated text.
Unique: Combines transcription and translation in a single workflow, avoiding the need to transcribe first and then translate separately. Positions multilingual support as a core feature rather than an add-on, though with no implementation details published it may simply be a thin wrapper around standard translation APIs.
vs alternatives: More integrated than using separate transcription and translation tools, but translation quality is likely lower than that of specialized services like Google Translate or DeepL.
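One way to picture the transcribe-then-translate workflow is a two-stage pipeline that returns both the original-language text and the translation. The sketch below assumes the separate-translation-model variant, which is only one of the possibilities the description leaves open. The `transcribe` and `translate` callables are placeholders for whatever engines are used, and all names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TranscriptionResult:
    """Holds the original transcript and, if requested, its translation."""
    source_language: str
    original_text: str
    target_language: Optional[str] = None
    translated_text: Optional[str] = None


def transcribe_and_translate(
    audio: bytes,
    source_language: str,
    transcribe: Callable[[bytes, str], str],
    translate: Optional[Callable[[str, str, str], str]] = None,
    target_language: Optional[str] = None,
) -> TranscriptionResult:
    """Run speech-to-text, then optionally translate the resulting text."""
    original = transcribe(audio, source_language)
    result = TranscriptionResult(source_language, original)
    # Translate only when a translator and a different target are supplied.
    if translate and target_language and target_language != source_language:
        result.target_language = target_language
        result.translated_text = translate(original, source_language, target_language)
    return result


# Usage with stand-in engines (real ones would call external services).
fake_transcribe = lambda audio, lang: "hola mundo"
fake_translate = lambda text, src, dst: "hello world"
print(transcribe_and_translate(b"...", "es", fake_transcribe, fake_translate, "en"))
```

Passing the engines in as callables keeps the pipeline independent of any particular provider, which suits a case like this where the providers are unknown.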