pilot-shell
MCP Server | Free
Make Claude Code production-ready: spec-driven plans, enforced quality gates, persistent knowledge
Capabilities: 13 decomposed
spec-driven task planning with feature/bugfix auto-detection
Medium confidence
Analyzes user intent via the /spec command, automatically classifies tasks as features or bugfixes, and generates structured implementation plans using a state machine dispatcher that routes to feature or bugfix workflows. The planning phase uses Claude to decompose requirements into atomic steps with estimated complexity, then presents a human-reviewable plan before implementation begins. This enforces upfront design thinking and prevents Claude Code from diverging into ad-hoc implementations.
Uses a dispatcher-based state machine that routes feature and bugfix tasks through separate workflows (feature: plan → implement → verify; bugfix: plan → implement → regression test), with mandatory human approval gates between planning and implementation phases. This architectural pattern prevents Claude from skipping the planning phase entirely.
Unlike Claude Code alone (which implements immediately) or generic AI agents (which lack project context), Pilot Shell enforces structured planning with automatic task classification and blocks implementation until a human approves the plan.
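The dispatcher and approval gate described above can be pictured with a small state machine. This is a minimal sketch, not Pilot Shell's implementation: the `Phase` names, the `SpecTask` class, and the keyword-based `classify` function are all hypothetical stand-ins (the listing says classification is done by Claude, not by keywords).

```python
from enum import Enum


class Phase(Enum):
    PLAN = "plan"
    AWAITING_APPROVAL = "awaiting_approval"
    IMPLEMENT = "implement"
    VERIFY = "verify"
    REGRESSION_TEST = "regression_test"
    DONE = "done"


# Hypothetical workflow tables; the phase order follows the listing's description.
WORKFLOWS = {
    "feature": [Phase.PLAN, Phase.AWAITING_APPROVAL, Phase.IMPLEMENT,
                Phase.VERIFY, Phase.DONE],
    "bugfix": [Phase.PLAN, Phase.AWAITING_APPROVAL, Phase.IMPLEMENT,
               Phase.REGRESSION_TEST, Phase.DONE],
}

BUG_WORDS = ("fix", "bug", "crash", "error", "regression")


def classify(request: str) -> str:
    """Toy keyword classifier standing in for model-based classification."""
    text = request.lower()
    return "bugfix" if any(word in text for word in BUG_WORDS) else "feature"


class SpecTask:
    def __init__(self, request: str) -> None:
        self.kind = classify(request)
        self.steps = WORKFLOWS[self.kind]
        self.index = 0
        self.approved = False

    @property
    def phase(self) -> Phase:
        return self.steps[self.index]

    def approve(self) -> None:
        """Record the human approval of the plan."""
        self.approved = True

    def advance(self) -> Phase:
        """Move to the next phase; refuse to pass the gate without approval."""
        if self.phase is Phase.DONE:
            return self.phase
        if self.phase is Phase.AWAITING_APPROVAL and not self.approved:
            raise PermissionError("plan must be approved before implementation")
        self.index += 1
        return self.phase
```

Keeping the gate as an explicit phase means the only way to reach `IMPLEMENT` is through `approve()`, which is the property the listing attributes to the dispatcher.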
test-driven development enforcement with pre-implementation test generation
Medium confidence
During the implementation phase of /spec workflows, Pilot Shell generates test cases before code is written, then validates that all generated code passes those tests before marking tasks complete. A verification agent runs the test suites and blocks code merges if coverage or assertions are insufficient. Hooks enforce this by intercepting code changes and validating test presence before allowing commits.
Integrates test generation into the implementation phase via a hooks pipeline that intercepts code changes and validates test presence before allowing progression. Uses a verification agent that runs test suites and blocks code merges if tests fail or coverage is insufficient, making TDD mandatory.
Standard Claude Code has no built-in test enforcement; Pilot Shell's hooks pipeline and verification agent make test-first development automatic and mandatory, so tests cannot be skipped.
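A test-presence check of the kind described above might look like the following. This is a sketch under stated assumptions: the `src/` to `tests/` layout, the `test_` file prefix, and both function names are hypothetical and are not taken from Pilot Shell.

```python
from pathlib import PurePosixPath


def expected_test_path(source: str) -> str:
    """Map src/pkg/mod.py to tests/pkg/test_mod.py (an assumed layout)."""
    path = PurePosixPath(source)
    parts = path.parts[1:-1]  # drop the leading "src" and the file name
    return str(PurePosixPath("tests", *parts, f"test_{path.name}"))


def missing_tests(changed: list[str], existing: set[str]) -> list[str]:
    """Return changed source files that have no matching test file."""
    missing = []
    for source in changed:
        if not source.startswith("src/") or not source.endswith(".py"):
            continue  # only gate Python sources under src/
        if expected_test_path(source) not in existing:
            missing.append(source)
    return missing
```

A hook could block the commit whenever `missing_tests` returns a non-empty list. Checking presence says nothing about test quality, which is why the listing pairs it with a verification agent that actually runs the suite.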
codebase-aware context injection with selective token budgeting
Medium confidence
Pilot Shell injects project-specific context into Claude's system prompt at session start, including extracted conventions, relevant code patterns, and project rules from the semantic index. The context injection is selective and respects Claude's token budget: only the most relevant patterns are injected based on the current task, preventing context window overflow. The system uses a context monitor to track which files are most relevant to the current task and prioritizes injection of related patterns.
Uses a context monitor to selectively inject the most relevant project patterns into Claude's system prompt based on task scope, respecting token budgets by prioritizing high-impact patterns. This enables codebase awareness without exceeding context window limits, making large-codebase support practical.
Unlike RAG systems that inject all matching documents (risking token overflow) or manual context setup (which is tedious), Pilot Shell's selective context injection uses task-aware heuristics to inject only the most relevant patterns, balancing context richness with token efficiency.
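Selective injection under a token budget can be sketched as a greedy selection. The listing does not describe Pilot Shell's actual heuristics, so everything here is an assumption: the `Pattern` type, the relevance scores (taken as given), and the rough four-characters-per-token estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    text: str
    relevance: float  # assumed to come from a task-aware scorer, 0.0 to 1.0


def estimate_tokens(text: str) -> int:
    """Rough estimate: about four characters per token (an approximation)."""
    return max(1, len(text) // 4)


def select_patterns(patterns: list[Pattern], budget: int) -> list[Pattern]:
    """Greedily keep the most relevant patterns that still fit the budget."""
    chosen = []
    remaining = budget
    for pattern in sorted(patterns, key=lambda p: p.relevance, reverse=True):
        cost = estimate_tokens(pattern.text)
        if cost <= remaining:
            chosen.append(pattern)
            remaining -= cost
    return chosen
```

Greedy selection is simple and predictable but not optimal: a single large, highly relevant pattern can crowd out several smaller ones. Ranking by relevance per token would be a reasonable variation.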
automated code review and style enforcement
Medium confidence
The verification phase includes an automated code review agent that checks for style violations, architectural inconsistencies, and deviations from project conventions. The agent uses the extracted project rules and conventions to validate that generated code follows established patterns. Code that violates style or architectural rules is flagged and can block merges, providing automated enforcement of code quality standards without requiring manual review.
Implements an automated code review agent that validates generated code against extracted project rules and conventions, providing architectural and style enforcement without manual review. The agent uses the same rules extracted by /sync and /learn, making reviews consistent with project standards.
Unlike manual code review (which is slow and subjective) or linting tools alone (which check generic rules without knowledge of the project's architecture), Pilot Shell's code review agent understands project conventions and architectural patterns, providing semantic-level code quality assurance.
session state persistence and recovery
Medium confidence
Pilot Shell persists session state (current task, implementation progress, test results, verification status) to disk, enabling recovery if a session crashes or is interrupted. The worker service maintains a session state file that tracks the current /spec task, implementation phase, and verification results. If a session is interrupted, the next session can resume from the last checkpoint, so no work is lost.
Persists session state to disk via the worker service, enabling recovery from crashes and interruptions. Session state includes current task, implementation progress, test results, and verification status, allowing seamless resumption from the last checkpoint.
Unlike Claude Code alone (which has no session persistence) or manual checkpointing (which is error-prone), Pilot Shell's automatic session persistence enables recovery from crashes without user intervention, making long-running tasks more reliable.
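Checkpointing session state to disk could be sketched as below. The file format (JSON), the state keys, and the function names are assumptions made for illustration; the listing does not say how the worker service stores its state file.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_state(path: Path, state: dict) -> None:
    """Write the checkpoint atomically so a crash never leaves a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # replaces the old checkpoint in one step
    except BaseException:
        os.unlink(tmp_name)
        raise


def load_state(path: Path) -> Optional[dict]:
    """Return the last checkpoint, or None when there is nothing to resume."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```

Writing to a temporary file and renaming it over the old checkpoint matters for crash recovery: a reader sees either the previous complete state or the new one, never a half-written file.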
persistent session memory with semantic codebase indexing
Medium confidence
The /sync command builds a semantic search index of the entire codebase using embeddings, then stores project-specific context (architecture patterns, naming conventions, dependencies, test patterns) in a persistent memory store that survives across sessions. This context is automatically injected into Claude's context window at the start of each session, enabling Claude to understand project conventions without requiring manual context setup. The context monitor continuously tracks changes to key files and updates the index incrementally.
Uses a context monitor hook that tracks file changes and incrementally updates the semantic index, combined with a memory & console system that persists extracted conventions across sessions. The index is injected into Claude's context at session start, eliminating the need for manual context setup while staying within token budgets via selective injection of relevant patterns.
Unlike Claude Code alone (which has no persistent memory between sessions) or generic RAG systems (which require manual indexing), Pilot Shell's /sync command automatically indexes the codebase and injects relevant context at session start, making project knowledge persistent without manual effort.
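Incremental index updates are commonly driven by content hashes, so only changed files are embedded again. The sketch below shows that general technique, assuming a stored mapping from path to SHA-256 digest; it is not a description of how Pilot Shell's context monitor actually detects changes, and `plan_reindex` is a hypothetical name.

```python
import hashlib


def digest(content: str) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_reindex(
    files: dict[str, str], indexed: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Compare current file contents with the digests stored in the index.

    Returns (paths to embed again, paths to drop from the index).
    """
    stale = sorted(
        path for path, text in files.items() if indexed.get(path) != digest(text)
    )
    removed = sorted(path for path in indexed if path not in files)
    return stale, removed
```

New files show up as stale because they have no stored digest, so one code path handles both additions and edits.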
project-specific rules and conventions extraction via /learn
Medium confidence
The /learn command captures non-obvious discoveries from the current session (e.g., 'this project uses a custom logger instead of console.log', 'all async functions must have timeout handling') and converts them into reusable skill files stored in ~/.pilot/skills/. These skills are automatically loaded into Claude's context for future sessions on the same project, and can be shared across teams via the /vault command. The system uses Claude to extract generalizable patterns from session interactions and format them as structured rules.
Converts session discoveries into structured skill files that are automatically loaded into Claude's context for future sessions, with a /vault integration for team-wide sharing. Unlike generic documentation, skills are machine-readable and directly injected into Claude's reasoning, making them immediately actionable.
Standard Claude Code has no mechanism to capture and reuse project-specific patterns; Pilot Shell's /learn command converts ephemeral session insights into persistent, shareable skills that improve Claude's performance on future tasks in the same project.
team knowledge sharing via /vault with git-backed persistence
Medium confidence
The /vault command shares rules, commands, skills, hooks, and agents across a team by syncing them to a private Git repository. Each team member's local ~/.pilot/ and ~/.claude/ directories can be configured to pull from a shared vault repository, enabling centralized management of project conventions, custom hooks, and reusable agents. The system uses Git as the backing store and provides conflict resolution via simple merge strategies (last-write-wins or manual resolution).
Uses Git as the backing store for team knowledge, enabling decentralized sync with version history and audit trails. Rules, skills, hooks, and agents are stored as files in the vault repository and pulled into each team member's local ~/.pilot/ directory, making team knowledge portable and version-controlled.
Unlike centralized knowledge bases (which require a server) or manual documentation (which gets out of sync), Pilot Shell's /vault uses Git for decentralized, version-controlled sharing of project-specific rules and agents, making team knowledge portable and auditable.
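The last-write-wins strategy mentioned above can be illustrated in a few lines. This is a generic sketch of the strategy, not Pilot Shell's merge code: the `Entry` type, the use of modification timestamps, and the tie-breaking rule (keep the local copy) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    content: str
    modified: float  # Unix timestamp of the last edit (assumed available)


def merge_last_write_wins(
    local: dict[str, Entry], remote: dict[str, Entry]
) -> dict[str, Entry]:
    """Merge two sets of vault files, keeping the newer copy of each name."""
    merged = dict(local)
    for name, entry in remote.items():
        current = merged.get(name)
        # Ties keep the local copy, an arbitrary choice made for this sketch.
        if current is None or entry.modified > current.modified:
            merged[name] = entry
    return merged
```

Last-write-wins silently discards the older edit, which is why a manual resolution path is the safer choice when two people change the same rule.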
hooks-based quality enforcement pipeline
Medium confidence
A pre-commit and post-change hooks pipeline that intercepts code modifications and enforces quality standards before code can be committed or merged. The pipeline includes a file checker hook (validates syntax, linting, formatting), a context monitor hook (tracks changes to key files), and a tool redirect hook (intercepts Claude's tool calls and validates them against project rules). Hooks are defined in project-specific or team-wide configuration and are automatically applied to all code changes, making quality enforcement non-optional.
Implements a multi-stage hooks pipeline that runs at different points in the development workflow (file checker on every change, context monitor on key files, tool redirect on Claude's tool calls). Hooks are composable and can be extended with custom scripts, making the quality enforcement system flexible and project-specific.
Unlike pre-commit hooks alone (which only run at commit time) or linting tools (which are passive), Pilot Shell's hooks pipeline actively intercepts code changes and tool calls, enforcing quality standards at multiple points in the workflow and preventing non-compliant code from progressing.
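A composable hooks pipeline can be modelled as a list of functions that each accept or reject a change. The sketch below assumes a hypothetical hook signature (path and content in, error message or `None` out) and two made-up example rules; Pilot Shell's real hook interface is not described in this listing.

```python
from typing import Callable, Optional

# A hook inspects a change and returns an error message, or None when it passes.
Hook = Callable[[str, str], Optional[str]]


def no_trailing_whitespace(path: str, content: str) -> Optional[str]:
    """Formatting rule: flag the first line that ends in whitespace."""
    for number, line in enumerate(content.splitlines(), start=1):
        if line != line.rstrip():
            return f"{path}:{number}: trailing whitespace"
    return None


def no_print_calls(path: str, content: str) -> Optional[str]:
    """Example project rule: use the project's logger instead of print()."""
    if path.endswith(".py") and "print(" in content:
        return f"{path}: print() is not allowed, use the logger"
    return None


def run_hooks(hooks: list[Hook], path: str, content: str) -> list[str]:
    """Run every hook and collect failures; an empty list lets the change proceed."""
    failures = []
    for hook in hooks:
        message = hook(path, content)
        if message is not None:
            failures.append(message)
    return failures
```

Collecting all failures instead of stopping at the first one gives a full report in a single pass, at the cost of running hooks that may be redundant once one has failed.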
worktree-based isolated task execution
Medium confidence
Each /spec task executes in an isolated Git worktree (a separate working directory linked to the same repository), preventing concurrent tasks from interfering with each other and enabling safe rollback if a task fails. The worktree is created at task start, code changes are made in isolation, and the worktree is merged back to the main branch only after verification passes. This architectural pattern enables safe parallel task execution and provides a natural rollback mechanism if verification fails.
Uses Git worktrees as the isolation mechanism for /spec tasks, enabling safe parallel execution and automatic rollback on verification failure. Each task gets its own working directory linked to the same repository, preventing concurrent tasks from interfering and providing a natural merge point for verification.
Unlike branching (which requires manual branch management and merging) or stashing (which is error-prone), Pilot Shell's worktree-based approach provides automatic isolation and rollback with minimal user intervention, making parallel task execution safe and predictable.
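The worktree lifecycle described above maps onto a handful of standard Git commands. The sketch below only builds the command lists and does not run them. The `spec/<task>` branch naming, the `.worktrees/` directory, and the function name are hypothetical; only the `git worktree`, `git merge`, and `git branch` subcommands themselves are real.

```python
def worktree_commands(task_id: str, base: str = "main") -> dict[str, list[list[str]]]:
    """Build the Git commands for one isolated task (naming scheme is assumed)."""
    branch = f"spec/{task_id}"
    path = f".worktrees/{task_id}"
    return {
        # Create a new branch from base and check it out in its own directory.
        "create": [["git", "worktree", "add", "-b", branch, path, base]],
        # After verification passes: run from the main checkout with base checked out.
        "merge": [
            ["git", "merge", "--no-ff", branch],
            ["git", "worktree", "remove", path],
            ["git", "branch", "-d", branch],
        ],
        # After verification fails: discard the directory and the branch.
        "rollback": [
            ["git", "worktree", "remove", "--force", path],
            ["git", "branch", "-D", branch],
        ],
    }
```

Each list could be passed to `subprocess.run(cmd, check=True)` in order. Rollback never touches the base branch, which is what makes discarding a failed task cheap.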
verification and regression testing agent
Medium confidence
After implementation completes, a verification agent runs the full test suite, checks for regressions, and validates that the implementation meets the original specification. For bugfixes, the agent specifically checks that the bug is fixed and no new bugs are introduced. For features, the agent validates that all acceptance criteria are met. The agent can block code merges if verification fails, providing a quality gate before code reaches the main branch.
Implements a dedicated verification agent that runs after implementation and validates against the original specification and acceptance criteria. For bugfixes, it specifically checks that the bug is fixed and no regressions are introduced; for features, it validates that all acceptance criteria are met. This provides a structured quality gate before code merges.
Unlike manual testing (which is slow and error-prone) or generic CI/CD pipelines (which lack context about the original specification), Pilot Shell's verification agent understands the original task and validates that the implementation actually solves the problem, providing context-aware quality assurance.
mcp server integration for claude code tool calling
Medium confidence
Pilot Shell exposes a Model Context Protocol (MCP) server that provides Claude Code with access to Pilot Shell commands (/spec, /sync, /learn, /vault) and project-specific tools via a standardized function-calling interface. The MCP server runs as a background service and handles tool schema registration, argument validation, and execution. This enables Claude Code to invoke Pilot Shell workflows programmatically rather than requiring manual slash command invocation.
Implements an MCP server that exposes Pilot Shell commands and project-specific tools through a standardized function-calling interface, enabling Claude Code to invoke workflows programmatically. The server handles schema registration, argument validation, and execution, making tool integration seamless and standardized.
Unlike manual slash command invocation (which requires user interaction) or custom integrations (which are project-specific), Pilot Shell's MCP server provides a standardized, programmatic interface for Claude to invoke workflows and tools, enabling autonomous execution and better integration with Claude Code's reasoning loop.
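The three responsibilities named above (schema registration, argument validation, execution) can be shown with a plain in-process registry. This sketch deliberately does not use an MCP SDK or the MCP wire format; `ToolRegistry`, its methods, and the simplified schema (argument name to Python type) are hypothetical and exist only to illustrate the flow.

```python
from typing import Any, Callable


class ToolRegistry:
    """Toy registry: name -> (required argument types, handler)."""

    def __init__(self) -> None:
        self._tools: dict[str, tuple[dict[str, type], Callable[..., Any]]] = {}

    def register(
        self, name: str, required: dict[str, type], handler: Callable[..., Any]
    ) -> None:
        self._tools[name] = (required, handler)

    def call(self, name: str, arguments: dict[str, Any]) -> Any:
        """Validate the arguments against the registered schema, then execute."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        required, handler = self._tools[name]
        for key, expected in required.items():
            if key not in arguments:
                raise ValueError(f"missing argument: {key}")
            if not isinstance(arguments[key], expected):
                raise TypeError(f"argument {key} must be {expected.__name__}")
        return handler(**arguments)
```

Usage, with a made-up handler standing in for the /spec workflow: `registry.register("spec", {"request": str}, lambda request: f"planned: {request}")` followed by `registry.call("spec", {"request": "add login"})`. Validating before dispatch keeps malformed tool calls from reaching the workflow code.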
quick mode for low-complexity tasks without planning gates
Medium confidence
For tasks classified as low-complexity, Pilot Shell automatically activates Quick Mode, which bypasses the planning phase and approval gate, allowing direct implementation with quality hooks and TDD enforcement still active. Quick Mode is triggered automatically based on task complexity heuristics (e.g., single-file changes, simple bug fixes) and can be manually invoked with /spec --quick. This provides a fast path for simple tasks while maintaining quality standards.
Automatically detects low-complexity tasks and bypasses the planning phase while maintaining quality hooks and TDD enforcement. This provides a fast path for simple tasks without sacrificing quality standards, balancing speed and safety based on task complexity.
Unlike Claude Code alone (which has no complexity-based routing) or strict planning-first approaches (which add overhead to all tasks), Pilot Shell's Quick Mode provides context-aware routing that speeds up simple tasks while maintaining quality gates for complex work.
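Complexity-based routing of this kind reduces to a small decision function. The listing gives only examples of the heuristics (single-file changes, simple bug fixes), so the `TaskEstimate` fields and the thresholds below are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskEstimate:
    files_touched: int
    lines_changed: int
    touches_public_api: bool


def choose_mode(estimate: TaskEstimate, force_quick: bool = False) -> str:
    """Return "quick" for small, contained changes and "full" otherwise.

    force_quick mirrors a manual override such as /spec --quick.
    The thresholds (1 file, 30 lines) are assumptions, not documented values.
    """
    if force_quick:
        return "quick"
    simple = (
        estimate.files_touched <= 1
        and estimate.lines_changed <= 30
        and not estimate.touches_public_api
    )
    return "quick" if simple else "full"
```

Only the planning and approval steps depend on the returned mode; per the listing, quality hooks and TDD enforcement stay active on both paths.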
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts: sharing capabilities
Artifacts that share capabilities with pilot-shell, ranked by overlap. Discovered automatically through the match graph.
Qwen2.5 Coder 32B Instruct
Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: significant improvements in code generation, code reasoning...
Ellipsis
(Previously BitBuilder) "Automated code reviews and bug fixes"
Factory
Coding Droids for building software end-to-end
Codiumate (Qodo Gen)
AI test generation and code integrity analysis.
Claude Opus 4.7, GPT-5.4, Gemini-3.1, Cursor AI, Copilot, Codex,Cline and ChatGPT, AI Copilot, AI Agents and Debugger, Code Assistants, Code Chat, Code Generator, Code Completion, Generative AI, Autoc
Claude Opus 4.7, GPT-5.4, Gemini-3.1, AI Coding Assistant is a lightweight for helping developers automate all the boring stuff like writing code, real-time code completion, debugging, auto generating doc string and many more. Trusted by 100K+ devs from Amazon, Apple, Google, & more. Offers all the
Pagetok
Your AI agent for any project. It plans, edits files, searches, and learns from the Internet. Free and effective.
Best For
- ✓ teams building production codebases with Claude Code
- ✓ developers who want structured planning gates before AI-driven implementation
- ✓ projects requiring audit trails of design decisions
- ✓ teams with strict TDD requirements or regulatory compliance needs
- ✓ projects where test coverage is a non-negotiable quality metric
- ✓ developers who want to prevent untested code from entering the codebase
- ✓ large codebases with strong architectural patterns
- ✓ projects with non-obvious conventions or custom tooling
Known Limitations
- ⚠ Requires explicit /spec invocation; does not auto-trigger on unstructured requests
- ⚠ Plan approval is synchronous and blocks implementation until human review
- ⚠ Feature vs bugfix classification relies on Claude's semantic understanding and may misclassify ambiguous tasks
- ⚠ Test generation quality depends on Claude's understanding of requirements; may generate incomplete or redundant tests
- ⚠ Requires test framework setup in the project (Jest, pytest, etc.); does not work with projects lacking test infrastructure
- ⚠ Test execution adds latency to the implementation phase (typically 30-60 seconds per test suite run)
UnfragileRank
UnfragileRank is computed from adoption signals, documentation quality, ecosystem connectivity, match graph feedback, and freshness. No artifact can pay for a higher rank.
Repository Details
Last commit: Apr 21, 2026