voice-to-code prompt submission with stt/tts pipeline
Accepts voice messages via Telegram, transcribes them to text using configurable STT providers (Whisper, Google Cloud Speech-to-Text, or local alternatives), sends the transcribed prompt to OpenCode as a coding task, and streams back responses with optional TTS synthesis for voice playback. The pipeline integrates grammy's voice message handling with the @opencode-ai/sdk's event stream, buffering audio chunks and managing provider-specific authentication and format conversion.
Unique: Implements a bidirectional voice pipeline that bridges Telegram's voice message API with OpenCode's SSE event stream, supporting multiple STT/TTS providers via environment-based configuration and managing audio format conversion (Telegram OGG → provider-specific format) without intermediate file storage.
vs alternatives: Unlike OpenClaw's web-only interface, this bot enables voice-first mobile interaction with local OpenCode execution, reducing context switching for developers on the go.
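The environment-based provider selection described above could look roughly like the sketch below. The variable names `STT_PROVIDER` and `STT_API_KEY`, the provider identifiers, and the `resolveSttConfig` helper are all assumptions made for illustration; the bot's real configuration keys are not given here.

```typescript
// Hypothetical provider identifiers and environment variable names.
type SttProvider = "whisper" | "google" | "local";

interface SttConfig {
  provider: SttProvider;
  apiKey?: string;
}

const KNOWN_PROVIDERS: readonly SttProvider[] = ["whisper", "google", "local"];

// Resolve the STT provider from an environment-like object and fail early
// when a hosted provider is selected without credentials.
function resolveSttConfig(env: Record<string, string | undefined>): SttConfig {
  const raw = (env.STT_PROVIDER ?? "whisper").toLowerCase();
  const provider = KNOWN_PROVIDERS.find((p) => p === raw);
  if (!provider) {
    throw new Error(
      `Unknown STT provider "${raw}"; expected one of ${KNOWN_PROVIDERS.join(", ")}`
    );
  }
  const apiKey = env.STT_API_KEY;
  if (provider !== "local" && !apiKey) {
    throw new Error(`STT provider "${provider}" requires STT_API_KEY`);
  }
  return { provider, apiKey };
}
```

Validating at startup rather than on the first voice message means a missing key surfaces before any audio is downloaded or converted.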
real-time sse event aggregation and pinned status message
Consumes Server-Sent Events (SSE) from the OpenCode SDK's event stream, aggregates multi-event sequences (task start, model selection, context consumption, file changes, task completion) into a single coherent state, and maintains a persistent pinned Telegram message that updates in-place with live metrics: token usage, context window consumption, list of modified files, and agent status. Uses a SummaryAggregator class to deduplicate events, calculate deltas, and format structured data into Telegram's MarkdownV2 syntax.
Unique: Implements a SummaryAggregator pattern that deduplicates and coalesces SSE events into a single mutable pinned message, avoiding Telegram chat spam while maintaining real-time visibility. Uses MarkdownV2 formatting with careful escaping to render structured metrics (token counts, file diffs) in a mobile-friendly compact layout.
vs alternatives: Provides better observability than OpenClaw's web dashboard for mobile users by consolidating multi-event sequences into a single pinned status, reducing API calls and chat clutter while maintaining real-time updates.
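A minimal sketch of the aggregation idea follows. The event shapes and the `flush` method are hypothetical, not the SDK's actual event types or the bot's real SummaryAggregator API. The escaped character set follows Telegram's MarkdownV2 rules.

```typescript
// Hypothetical event shapes; the real SDK events carry more fields.
type AgentEvent =
  | { type: "status"; status: string }
  | { type: "tokens"; used: number }
  | { type: "file"; path: string };

// Escape every character that MarkdownV2 treats as special.
function escapeMarkdownV2(text: string): string {
  return text.replace(/[_*[\]()~`>#+\-=|{}.!\\]/g, "\\$&");
}

class SummaryAggregator {
  private status = "idle";
  private tokens = 0;
  private files = new Set<string>(); // a Set deduplicates repeated file events
  private lastRendered = "";

  apply(event: AgentEvent): void {
    if (event.type === "status") this.status = event.status;
    else if (event.type === "tokens") this.tokens = event.used;
    else this.files.add(event.path);
  }

  render(): string {
    const fileLines = Array.from(this.files).map((f) => `• ${escapeMarkdownV2(f)}`);
    return [
      `*Status:* ${escapeMarkdownV2(this.status)}`,
      `*Tokens:* ${escapeMarkdownV2(String(this.tokens))}`,
      "*Files:*",
      ...(fileLines.length > 0 ? fileLines : ["none"]),
    ].join("\n");
  }

  // Returns the new message text, or null when nothing changed since the
  // previous flush, so the caller can skip the edit call entirely.
  flush(): string | null {
    const text = this.render();
    if (text === this.lastRendered) return null;
    this.lastRendered = text;
    return text;
  }
}
```

Skipping unchanged renders matters because Telegram rejects an edit whose content is identical to the current message, and it also keeps the edit rate low during bursts of events.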
daemon mode execution with systemd integration
Supports running the bot as a background daemon process using systemd on Linux, or a comparable process manager on macOS, where systemd is not available. Provides configuration templates and setup guides for systemd service files, environment variable management, and log rotation. Enables the bot to start automatically on system boot and restart on failure, making it suitable for always-on local execution.
Unique: Provides systemd service templates and setup guides that enable the bot to run as a background daemon with automatic restart on failure, suitable for always-on local execution without manual intervention.
vs alternatives: Enables production-grade deployment of the bot as a local service, unlike OpenClaw's web-only model which requires manual server management.
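As an illustration of what such a service template might contain, the sketch below renders a systemd unit from a few options. The `ServiceOptions` fields and the paths in the usage example are hypothetical, not the project's shipped template. The directives themselves (`EnvironmentFile`, `Restart=on-failure`, `WantedBy=multi-user.target`) are standard systemd settings.

```typescript
// Hypothetical options; adjust to the actual install location and user.
interface ServiceOptions {
  description: string;
  execStart: string; // absolute path to the executable plus its arguments
  envFile: string; // file holding the bot token and other settings
  user: string;
  restartSec: number;
}

// Render a unit file that starts after the network is up and restarts
// the process whenever it exits with a failure.
function renderSystemdUnit(o: ServiceOptions): string {
  return [
    "[Unit]",
    `Description=${o.description}`,
    "After=network-online.target",
    "Wants=network-online.target",
    "",
    "[Service]",
    "Type=simple",
    `User=${o.user}`,
    `EnvironmentFile=${o.envFile}`,
    `ExecStart=${o.execStart}`,
    "Restart=on-failure",
    `RestartSec=${o.restartSec}`,
    "",
    "[Install]",
    "WantedBy=multi-user.target",
    "",
  ].join("\n");
}

// Example with placeholder values.
const unitText = renderSystemdUnit({
  description: "OpenCode Telegram bot",
  execStart: "/usr/local/bin/example-bot",
  envFile: "/etc/example-bot/env",
  user: "dev",
  restartSec: 5,
});
```

Keeping secrets in an `EnvironmentFile` rather than inline `Environment=` lines lets the unit file be shared without exposing the token. By default systemd captures the service's stdout and stderr in the journal.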
error handling and recovery with user-friendly error messages
Implements comprehensive error handling for common failure scenarios: OpenCode server unavailable, invalid session/project, task submission errors, SSE connection drops, and API rate limits. Translates technical errors into user-friendly Telegram messages with suggested remediation steps (e.g., 'Server is offline, please check localhost:8000'). Includes retry logic for transient failures and graceful degradation when features are unavailable.
Unique: Translates technical errors into user-friendly Telegram messages with remediation suggestions, implementing retry logic for transient failures and graceful degradation for unavailable features.
vs alternatives: Provides better error visibility and recovery than OpenClaw's web interface, with mobile-friendly error messages and automatic retry logic for common failures.
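The translation and retry behaviour described above can be sketched as follows. The `BotError` shape, the set of codes treated as transient, and the message wording (apart from the offline example quoted earlier) are assumptions for illustration.

```typescript
// Hypothetical error shape: a Node-style code or an HTTP status.
interface BotError {
  code?: string;
  status?: number;
  message: string;
}

// Only failures that may succeed on a later attempt are retried.
function isTransient(err: BotError): boolean {
  return (
    err.code === "ECONNRESET" ||
    err.code === "ETIMEDOUT" ||
    err.status === 429 ||
    err.status === 503
  );
}

// Map a technical error to a short message with a suggested next step.
function toUserMessage(err: BotError, serverUrl: string): string {
  if (err.code === "ECONNREFUSED") return `Server is offline, please check ${serverUrl}`;
  if (err.status === 404) return "Session or project not found. Select another one and try again.";
  if (err.status === 429) return "Rate limit reached. The request will be retried shortly.";
  return `Unexpected error: ${err.message}`;
}

// Exponential backoff: base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseMs: number): number {
  return baseMs * Math.pow(2, attempt - 1);
}

async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 500
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (e) {
      const err = e as BotError;
      // Give up on permanent errors or once the attempt budget is spent.
      if (attempt >= maxAttempts || !isTransient(err)) throw err;
      const delay = backoffDelayMs(attempt, baseMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A refused connection is deliberately not treated as transient here: if the server is not running, retrying only delays the message that tells the user to start it.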
cli argument parsing and environment configuration
Provides a command-line interface (CLI) for starting the bot with configurable options: Telegram token, OpenCode server URL, STT/TTS provider selection, locale, and logging level. Parses arguments using a custom args parser, validates configuration, and loads environment variables from .env files. Supports both on-demand execution via npx and global npm installation, with clear error messages for missing or invalid configuration.
Unique: Implements a custom CLI argument parser that validates configuration and loads environment variables, supporting both npx and global npm installation with clear error messages for missing or invalid options.
vs alternatives: Provides flexible configuration management that OpenClaw's web interface doesn't support, allowing developers to customize bot behavior via CLI arguments and environment variables.
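A hand-rolled parser of this kind might look like the sketch below. The flag names (`--token`, `--server-url`, `--log-level`), the environment variable names, and the default URL are hypothetical; the bot's real options may differ.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";
const LOG_LEVELS: readonly LogLevel[] = ["debug", "info", "warn", "error"];

interface BotConfig {
  token: string;
  serverUrl: string;
  logLevel: LogLevel;
}

// Accepts "--name value" and "--name=value"; flags override environment values.
function parseArgs(argv: string[], env: Record<string, string | undefined>): BotConfig {
  const flags = new Map<string, string>();
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (!arg.startsWith("--")) throw new Error(`Unexpected argument: ${arg}`);
    const eq = arg.indexOf("=");
    if (eq !== -1) {
      flags.set(arg.slice(2, eq), arg.slice(eq + 1));
    } else {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`Missing value for ${arg}`);
      }
      flags.set(arg.slice(2), value);
      i++; // skip the consumed value
    }
  }

  const token = flags.get("token") ?? env.TELEGRAM_BOT_TOKEN;
  if (!token) throw new Error("Missing Telegram token: pass --token or set TELEGRAM_BOT_TOKEN");

  // Assumed default, for illustration only.
  const serverUrl = flags.get("server-url") ?? env.OPENCODE_SERVER_URL ?? "http://localhost:8000";
  if (!/^https?:\/\//.test(serverUrl)) throw new Error(`Invalid server URL: ${serverUrl}`);

  const rawLevel = flags.get("log-level") ?? env.LOG_LEVEL ?? "info";
  const logLevel = LOG_LEVELS.find((l) => l === rawLevel);
  if (!logLevel) throw new Error(`Invalid log level "${rawLevel}"; use ${LOG_LEVELS.join(", ")}`);

  return { token, serverUrl, logLevel };
}
```

In a real entry point this would be called as `parseArgs(process.argv.slice(2), process.env)` after the .env file has been loaded, so that command-line flags take precedence over file-based values.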
interactive agent question handling with inline button state machine
Implements a state machine that intercepts OpenCode agent questions and permission requests (e.g., 'Should I modify this file?', 'Which model should I use?') via SSE events, renders them as Telegram inline keyboard buttons, captures user responses, and sends them back to OpenCode via the SDK's interaction API. The Interaction Guard class manages state transitions, prevents concurrent interactions, and ensures responses are routed to the correct agent context (session, project, task).
Unique: Uses a dedicated Interaction Guard state machine that maps Telegram callback_query events to OpenCode SDK interaction responses, preventing concurrent interactions and ensuring responses are routed to the correct task context. Integrates grammy's callback_query handler with the SDK's interaction API, managing the full round-trip from question to response.
vs alternatives: Enables mobile-first approval workflows that OpenClaw's web interface doesn't support, allowing developers to respond to agent questions from anywhere without returning to their desktop.
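The guard's state handling can be sketched without any Telegram or SDK dependency. The types and method names below are hypothetical. The button shape matches the Telegram Bot API's inline keyboard format, where `callback_data` is limited to 64 bytes, which is why only a question id and an option index are encoded.

```typescript
interface AgentQuestion {
  id: string;
  sessionId: string;
  text: string;
  options: string[];
}

interface InlineButton {
  text: string;
  callback_data: string;
}

interface Answer {
  questionId: string;
  sessionId: string;
  choice: string;
}

class InteractionGuard {
  private pending: AgentQuestion | null = null;

  // idle -> awaiting. Returns one button per row, or null when another
  // question is still open (concurrent interactions are refused).
  begin(question: AgentQuestion): InlineButton[][] | null {
    if (this.pending) return null;
    this.pending = question;
    return question.options.map((text, index) => [
      { text, callback_data: `${question.id}:${index}` },
    ]);
  }

  // awaiting -> idle. Returns null for stale or malformed button presses.
  resolve(callbackData: string): Answer | null {
    const pending = this.pending;
    if (!pending) return null;
    const sep = callbackData.lastIndexOf(":");
    if (sep === -1) return null;
    const questionId = callbackData.slice(0, sep);
    const indexText = callbackData.slice(sep + 1);
    if (questionId !== pending.id || !/^\d+$/.test(indexText)) return null;
    const index = Number(indexText);
    if (index >= pending.options.length) return null;
    this.pending = null;
    return { questionId, sessionId: pending.sessionId, choice: pending.options[index] };
  }
}
```

The returned rows can be passed as the `inline_keyboard` field of a reply markup, and the `Answer` carries the session id so the response can be routed back to the context that asked.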
session and project context switching with git worktree management
Provides commands to list, create, and switch between OpenCode sessions and projects, mirroring the TUI's session management. Internally uses the OpenCode SDK to query available projects, manage git worktrees (creating isolated working directories for parallel work), and maintain session state (current project, branch, uncommitted changes). Stores session context in memory and persists it across bot restarts via environment variables or a local state file.
Unique: Mirrors OpenCode TUI's session management by wrapping the SDK's project and session APIs, providing Telegram commands that abstract away git worktree creation and branch switching. Maintains session state in memory with optional persistence, allowing users to manage multiple projects without manual git operations.
vs alternatives: Provides mobile-friendly project switching that OpenClaw doesn't expose, allowing developers to manage multiple concurrent feature branches directly from Telegram without returning to the CLI.
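One way to picture the in-memory state with optional persistence is the sketch below. `SessionStore`, its fields, and the JSON layout are hypothetical. `worktreeAddArgs` only shows the argument list for the standard `git worktree add -b <branch> <path>` command and is not a claim about how the SDK creates worktrees internally.

```typescript
interface SessionContext {
  sessionId: string;
  projectPath: string;
  branch: string;
}

class SessionStore {
  private sessions = new Map<string, SessionContext>();
  private currentId: string | null = null;

  add(ctx: SessionContext): void {
    this.sessions.set(ctx.sessionId, ctx);
  }

  // Make a known session the active one and return its context.
  switchTo(sessionId: string): SessionContext {
    const ctx = this.sessions.get(sessionId);
    if (!ctx) throw new Error(`Unknown session: ${sessionId}`);
    this.currentId = sessionId;
    return ctx;
  }

  current(): SessionContext | null {
    return this.currentId ? this.sessions.get(this.currentId) ?? null : null;
  }

  // Serialize to JSON so the caller can write it to a local state file.
  serialize(): string {
    return JSON.stringify({
      currentId: this.currentId,
      sessions: Array.from(this.sessions.values()),
    });
  }

  static deserialize(json: string): SessionStore {
    const data = JSON.parse(json) as {
      currentId: string | null;
      sessions: SessionContext[];
    };
    const store = new SessionStore();
    for (const s of data.sessions) store.add(s);
    if (data.currentId !== null) store.switchTo(data.currentId);
    return store;
  }
}

// Arguments for creating an isolated working directory on a new branch.
function worktreeAddArgs(branch: string, directory: string): string[] {
  return ["worktree", "add", "-b", branch, directory];
}
```

Keeping serialization separate from file I/O leaves the choice of storage (state file, environment, or nothing) to the caller.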
natural language task scheduling with cron expression generation
Accepts natural language scheduling descriptions (e.g., 'every Monday at 9am', 'daily at 3pm', 'once tomorrow at 2pm') via Telegram message, converts them into cron expressions, and registers recurring or one-time tasks with the OpenCode server. The scheduling library is not specified; node-cron or a similar package is a likely candidate for running the schedules. The bot stores scheduled task definitions and executes them on a schedule, submitting the associated coding prompt to OpenCode at the specified time.
Unique: Implements natural language scheduling that converts user-friendly descriptions into cron expressions, storing task definitions and executing them on a schedule. Integrates with OpenCode's task submission API to run coding tasks at specified times without requiring manual CLI invocation.
vs alternatives: Provides lightweight task scheduling without a full CI/CD pipeline, allowing developers to automate routine coding tasks directly from Telegram with natural language syntax instead of cron syntax.
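To make the conversion concrete, here is a deliberately small sketch that handles only two phrasings, "daily at <time>" and "every <weekday> at <time>", and emits a standard five-field cron expression (minute, hour, day of month, month, day of week). The `toCron` function is hypothetical and far narrower than a real natural language parser.

```typescript
const DAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"];

// Parse "9am", "3pm", or "12:30am" into 24-hour components.
function parseTime(text: string): { hour: number; minute: number } | null {
  const m = /^(\d{1,2})(?::(\d{2}))?\s*(am|pm)$/.exec(text.trim());
  if (!m) return null;
  let hour = Number(m[1]);
  const minute = m[2] ? Number(m[2]) : 0;
  if (hour < 1 || hour > 12 || minute > 59) return null;
  if (m[3] === "pm" && hour !== 12) hour += 12;
  if (m[3] === "am" && hour === 12) hour = 0;
  return { hour, minute };
}

// Returns a cron expression, or null when the phrase is not recognized.
function toCron(description: string): string | null {
  const text = description.trim().toLowerCase();
  const m = /^(daily|every day|every (\w+)) at (.+)$/.exec(text);
  if (!m) return null;
  const time = parseTime(m[3]);
  if (!time) return null;
  let dayOfWeek = "*";
  if (m[2] !== undefined) {
    const index = DAYS.indexOf(m[2]);
    if (index === -1) return null;
    dayOfWeek = String(index); // cron uses 0 for Sunday
  }
  return `${time.minute} ${time.hour} * * ${dayOfWeek}`;
}

const mondayCron = toCron("every Monday at 9am"); // "0 9 * * 1"
const dailyCron = toCron("daily at 3pm"); // "0 15 * * *"
```

A one-time request such as 'once tomorrow at 2pm' returns null here, because a single future run is not naturally a recurring cron expression and would need a separate one-shot timer or a stored timestamp.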