agent loop orchestration with llm perception-action cycles
Implements a minimal but complete agent loop pattern where an LLM (Claude) perceives environment state, reasons about next actions, and executes tool calls in a synchronous request-response cycle. The harness captures tool outputs as observations, feeds them back into the next loop iteration, and maintains conversation history across cycles. This is the foundational pattern taught in s01 and reused throughout all 12 sessions.
Unique: Explicitly separates the agent (the LLM model) from the harness (tools, state, permissions) as a pedagogical principle, making the loop pattern visible and modifiable without conflating model training with environment design. Most frameworks blur this distinction.
vs alternatives: Clearer mental model than frameworks like LangChain or AutoGPT because it isolates the loop pattern and teaches harness engineering as a distinct discipline, not just LLM API wrapping.
tool dispatch with schema-based function calling
Routes LLM-generated tool calls to concrete implementations (bash, read_file, write_file, edit_file, load_skill, task_* operations) via a schema registry that defines input/output contracts. The harness validates tool schemas against LLM requests, executes the tool in an isolated context, captures output, and returns it to the agent. This is taught in s02 and extended throughout the curriculum.
Unique: Implements a two-layer tool injection strategy (s05) where tools are defined as both schema (for LLM awareness) and implementation (for execution), allowing the harness to validate and sandbox tool calls before execution. This decoupling is rarely explicit in other frameworks.
vs alternatives: More transparent than OpenAI function calling because the schema and implementation are separately visible, making it easier to audit what tools the agent can actually invoke and how they're constrained.
autonomous task claiming and work distribution
Implements a task claiming mechanism (s11) where agents autonomously claim tasks from a shared task board based on their capabilities and current workload. Agents can evaluate task requirements, decide whether to claim a task, and update task status. This enables self-organizing agent teams without a central scheduler.
Unique: Gives agents agency in task selection rather than assigning tasks from above. Agents evaluate task requirements and decide autonomously, making the system more adaptive to agent capabilities and workload.
vs alternatives: More flexible than centralized task assignment because agents can adapt to changing conditions and new capabilities. Requires less coordination overhead but may be less optimal in terms of global load balancing.
worktree isolation and filesystem sandboxing
Implements WorktreeManager (s12) that creates isolated filesystem subtrees for each agent or task, preventing cross-contamination and enabling parallel execution. Each worktree is a separate directory with its own file state, and agents can only access files within their worktree. This is the final session and combines all previous concepts into a complete isolated execution environment.
Unique: Combines path validation (s01) with filesystem-level isolation, creating a complete sandbox where agents can safely modify files without affecting other agents or the host system. This is the culmination of all previous security and isolation patterns.
vs alternatives: More complete than simple path validation because it provides true isolation at the filesystem level. Agents can be run in parallel without coordination, unlike shared-filesystem approaches that require locks or careful ordering.
pedagogical progression through 12 learning sessions
Structures the entire framework as a 12-session curriculum (s01–s12) where each session introduces exactly one harness mechanism without modifying the core agent loop. Sessions build incrementally: s01 teaches the loop, s02 adds tools, s03 adds planning, s04 adds subagents, s05 adds skills, s06 adds compression, s07 adds tasks, s08 adds background execution, s09 adds teams, s10 adds protocols, s11 adds autonomous claiming, s12 adds worktree isolation. This design makes the framework explicitly educational and modular.
Unique: Explicitly designs the framework as a teaching tool with a structured progression, rather than a production system. Each session is a minimal, self-contained example that teaches one concept. This is rare — most frameworks prioritize features over pedagogy.
vs alternatives: More educational than production frameworks like LangChain because it isolates concepts and builds understanding incrementally. Trades off feature completeness for clarity and learnability.
safe path validation and dangerous command blocking
Implements a permission layer that validates file paths against a safe_path whitelist before executing read/write/edit operations, and blocks dangerous bash commands (rm -rf, sudo, etc.) via a blocklist. The harness intercepts tool calls at dispatch time, checks paths and commands against rules, and rejects unsafe operations before they reach the OS. This is a core security mechanism taught in the overview and applied throughout.
Unique: Combines filesystem-level path whitelisting with command-pattern blacklisting, creating a two-layer defense that is simple to understand and audit. Most frameworks either omit this entirely or use complex capability-based security models.
vs alternatives: Simpler and more transparent than capability-based security (like secomp or AppArmor) because rules are human-readable and can be inspected without kernel knowledge, making it suitable for educational and small-scale deployments.
planning and task decomposition via todomanager
Provides a persistent task board (TodoManager) where agents can write, read, and update tasks in a structured format. Tasks are stored as markdown with metadata (status, assignee, priority), and the agent can decompose complex goals into subtasks, track progress, and coordinate with other agents. This is introduced in s03 and extended in s07 (TaskManager) and s09 (multi-agent teams).
Unique: Uses markdown as the task storage format, making tasks human-readable and editable outside the agent system. This is unusual — most frameworks use databases or JSON. The design choice prioritizes transparency over performance.
vs alternatives: More transparent than database-backed task systems because tasks are plain text and can be inspected, edited, or version-controlled directly. Trades off concurrent write safety for simplicity and auditability.
subagent spawning with context isolation
Allows a parent agent to spawn child agents (subagents) with isolated context, separate tool access, and independent task boards. Each subagent runs its own agent loop with a subset of the parent's tools and knowledge, and communicates back via message passing. This is taught in s04 and forms the foundation for multi-agent teams in s09.
Unique: Implements context isolation as a first-class pattern by giving each subagent its own tool registry and knowledge base, rather than sharing the parent's full context. This makes permission boundaries explicit and teachable.
vs alternatives: More explicit about isolation than frameworks like LangChain's SubTask agents, which often share parent context by default. This design forces developers to think about what each agent should know and can do.
+5 more capabilities