Paper - ChatDev: Communicative Agents for Software Development
Product: [Local demo](https://github.com/OpenBMB/ChatDev/blob/main/wiki.md#local-demo)
Capabilities (10 decomposed)
multi-agent software development orchestration
Medium confidence: Coordinates multiple specialized AI agents (CEO, CTO, Programmer, Tester) through a role-based communication protocol where each agent has distinct responsibilities and communicates via structured message passing. Agents maintain conversation history and context across development phases (requirements analysis, architecture design, implementation, testing), with a central coordinator managing task delegation and phase transitions based on agent outputs.
Uses role-based agent specialization (CEO for planning, CTO for architecture, Programmer for implementation, Tester for validation) with explicit phase-based workflow rather than treating all agents as interchangeable — each agent has domain-specific prompting and output constraints that map to SDLC stages
Differs from single-model code generation (Copilot, Codex) by decomposing software development into sequential phases with specialized agents, enabling intermediate review points and architectural validation before implementation begins
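The delegation loop described above can be sketched in a few lines of Python. The class names, the phase-to-role table, and the `respond` stub are illustrative assumptions for this sketch, not ChatDev's actual API:

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    content: str

class Agent:
    """One role-constrained agent; respond() stands in for an LLM call."""
    def __init__(self, role):
        self.role = role

    def respond(self, transcript):
        # A real agent would condition an LLM on its role prompt plus history.
        return Message(self.role, f"{self.role} output after {len(transcript)} messages")

class Coordinator:
    """Delegates each phase to its responsible agent and records the output."""
    PHASE_OWNERS = [("requirements", "CEO"), ("architecture", "CTO"),
                    ("implementation", "Programmer"), ("testing", "Tester")]

    def __init__(self, agents):
        self.agents = {a.role: a for a in agents}
        self.transcript = []   # shared history visible to every agent

    def run(self):
        for phase, role in self.PHASE_OWNERS:
            msg = self.agents[role].respond(self.transcript)
            self.transcript.append(msg)
        return self.transcript

coord = Coordinator([Agent(r) for r in ("CEO", "CTO", "Programmer", "Tester")])
transcript = coord.run()
```

Each agent sees the full transcript accumulated by earlier phases, which is what makes the intermediate review points possible.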
agent-to-agent communication protocol with memory
Medium confidence: Implements a structured message-passing system where agents exchange information through a shared conversation history that persists across turns. Each agent reads prior messages, generates responses following role-specific templates, and appends to a growing transcript. The protocol includes semantic routing — agents can reference specific prior messages, and the system maintains context windows to prevent token overflow while preserving critical architectural decisions.
Uses a linear conversation transcript as the primary state mechanism rather than a structured knowledge graph or vector database — all agent decisions are grounded in the readable conversation history, making the system interpretable but less efficient for large projects
More transparent than black-box multi-agent systems (e.g., AutoGPT) because the entire reasoning chain is human-readable; less efficient than systems using vector embeddings for context retrieval because it requires processing the full transcript each turn
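A transcript that trims to a token budget while preserving critical decisions can be sketched as below. The class name, the "pinned" flag, and the 4-characters-per-token heuristic are assumptions of this sketch, not ChatDev's implementation:

```python
class Transcript:
    """Linear conversation history with a crude token budget."""
    def __init__(self, max_tokens=1000):
        self.max_tokens = max_tokens
        self.messages = []                 # (sender, text, pinned)

    def append(self, sender, text, pinned=False):
        self.messages.append((sender, text, pinned))

    def _tokens(self, text):
        return len(text) // 4              # rough stand-in for a real tokenizer

    def context(self):
        """Most recent messages that fit the budget; pinned entries
        (e.g. architectural decisions) are always retained."""
        pinned = [m for m in self.messages if m[2]]
        budget = self.max_tokens - sum(self._tokens(m[1]) for m in pinned)
        recent = []
        for m in reversed([m for m in self.messages if not m[2]]):
            cost = self._tokens(m[1])
            if cost > budget:
                break                      # oldest unpinned messages drop first
            budget -= cost
            recent.append(m)
        return pinned + recent[::-1]
```

The readable, append-only list is what makes the reasoning chain auditable; the cost is that the whole structure must be re-serialized into the prompt on every turn.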
phase-based software development workflow
Medium confidence: Decomposes software development into discrete phases (requirements analysis, architecture design, implementation, testing) where each phase has specific agent responsibilities and success criteria. The system enforces phase ordering — agents cannot proceed to implementation until architecture is approved, and testing only occurs after code generation. Phase transitions are triggered by agent outputs meeting implicit quality thresholds or explicit approval signals.
Explicitly models SDLC phases as first-class workflow constructs with agent-to-phase bindings, rather than treating development as a single continuous task — each phase has dedicated agents and outputs that feed into subsequent phases
More structured than prompt-chaining approaches (which treat all steps equally) but less flexible than iterative refinement systems that allow backtracking and phase reordering
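The phase-ordering constraint amounts to a small state machine. A minimal sketch, with an explicit `approved` gate standing in for the approval signals described above:

```python
PHASES = ["requirements", "architecture", "implementation", "testing"]

class Workflow:
    """Enforces strict SDLC phase ordering with explicit approval gates."""
    def __init__(self):
        self.completed = []

    def advance(self, phase, approved):
        expected = PHASES[len(self.completed)]
        if phase != expected:
            raise ValueError(f"cannot enter {phase!r}: {expected!r} not yet complete")
        if not approved:
            raise ValueError(f"{phase!r} output was not approved")
        self.completed.append(phase)
```

Calling `advance("implementation", ...)` on a fresh workflow raises, because architecture has not been approved — exactly the gate the description above requires.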
role-based agent specialization with domain prompting
Medium confidence: Assigns distinct roles to agents (CEO for strategic planning, CTO for technical architecture, Programmer for implementation, Tester for validation) and uses role-specific system prompts that constrain each agent's behavior and output format. The CEO agent synthesizes requirements and delegates tasks; the CTO designs architecture and validates feasibility; the Programmer implements based on specifications; the Tester generates test cases and validates correctness. Each role has implicit constraints on what outputs are acceptable.
Uses explicit role definitions tied to software development positions (CEO, CTO, Programmer, Tester) rather than generic agent archetypes — each role has domain-specific knowledge and constraints that map to real job functions
More interpretable than generic multi-agent systems because roles are familiar to developers; less flexible than systems with dynamic role assignment because roles are fixed at initialization
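Role-to-prompt binding can be sketched as a lookup plus a message builder. The prompt text here is illustrative, not ChatDev's actual prompts; the message shape follows the common chat-completion convention of system/user roles:

```python
# Illustrative role prompts — each constrains behavior and output format.
ROLE_PROMPTS = {
    "CEO": "You synthesize user requirements into tasks. Output: a task list.",
    "CTO": "You design the architecture. Output: modules and their APIs.",
    "Programmer": "You implement the given architecture. Output: source files.",
    "Tester": "You write and run tests against the code. Output: pass/fail report.",
}

def build_messages(role, task):
    """Compose the role-constrained prompt for one agent turn."""
    return [
        {"role": "system", "content": ROLE_PROMPTS[role]},
        {"role": "user", "content": task},
    ]
```

Because roles are fixed at initialization, swapping in dynamic role assignment would mean replacing this static table with one mutated at runtime.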
code generation from architectural specifications
Medium confidence: Translates high-level architecture designs (produced by the CTO agent) into executable source code through a Programmer agent that reads architectural constraints, module definitions, and API specifications. The Programmer generates code that adheres to the specified architecture, including file structure, module boundaries, and inter-module communication patterns. The system supports multiple programming languages and generates complete, runnable projects rather than code snippets.
Generates code as a downstream artifact of explicit architecture design rather than generating code directly from requirements — the architecture phase acts as an intermediate specification layer that constrains code generation
More architecturally consistent than direct requirement-to-code generation (Copilot) because it enforces design constraints; slower than single-step generation because it requires architecture design first
automated test generation and validation
Medium confidence: A Tester agent automatically generates test cases based on code specifications and implementation details, then validates the generated code against those tests. The Tester reads the implementation code, infers test scenarios from function signatures and documented behavior, generates test cases in the appropriate framework (pytest, Jest, etc.), and reports pass/fail results. The system can identify bugs in generated code and flag them for developer review.
Uses an LLM-based Tester agent to generate tests rather than using static analysis or symbolic execution — tests are inferred from code semantics and documented behavior, enabling detection of logical errors not just syntax errors
More comprehensive than syntax-level static checks (which cannot detect logical errors) but less rigorous than formal verification (which requires mathematical proofs); faster than manual test writing but may miss edge cases
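The validation step — execute inferred test cases against generated code and collect pass/fail results — can be sketched as follows. Here the hand-written cases and the `add` function stand in for LLM-generated tests and LLM-generated code:

```python
def run_tests(func, cases):
    """cases: list of (args, expected). Returns (args, passed, observed) per case."""
    report = []
    for args, expected in cases:
        try:
            got = func(*args)
            report.append((args, got == expected, got))
        except Exception as exc:           # a crash counts as a failure
            report.append((args, False, exc))
    return report

def add(a, b):                             # stand-in for generated code under test
    return a + b

# Stand-ins for test cases a Tester agent might infer from the signature.
report = run_tests(add, [((1, 2), 3), ((0, 0), 0), ((2, 2), 5)])
failures = [r for r in report if not r[1]]
```

The deliberately wrong expectation in the last case illustrates how a failing entry gets flagged for developer review rather than silently passing.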
requirements-to-specification translation
Medium confidence: A CEO agent reads natural language project requirements and translates them into structured specifications that guide downstream agents. The CEO analyzes requirements for completeness, identifies ambiguities, decomposes high-level goals into concrete tasks, and produces a specification document that includes functional requirements, non-functional constraints, and success criteria. This specification becomes the input for the CTO's architecture design phase.
Uses an LLM agent (CEO) to perform requirements analysis rather than using formal requirement elicitation techniques — the analysis is conversational and produces natural language specifications that other agents can understand
More flexible than template-based requirement capture (which requires predefined categories) but less rigorous than formal specification languages (which require mathematical precision)
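The specification object a CEO-style agent might emit can be sketched as a small dataclass; the field names here are illustrative assumptions that mirror the description above, not a schema from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class Specification:
    """Structured output of requirements analysis, consumed by the CTO phase."""
    functional: list        # concrete features, e.g. "add a task", "mark done"
    non_functional: list    # constraints, e.g. "runs offline"
    success_criteria: list  # what "done" means for this project
    open_questions: list = field(default_factory=list)  # ambiguities to resolve

    def is_complete(self):
        # A spec with unresolved ambiguities should not advance to architecture.
        return bool(self.functional) and not self.open_questions
```

Keeping `open_questions` as a first-class field is what lets the conversation-based refinement loop (described below among the capabilities) know when to stop asking.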
architecture design with feasibility validation
Medium confidence: A CTO agent designs software architecture based on specifications, proposing module structure, component interactions, technology choices, and design patterns. The CTO validates architectural feasibility by checking for circular dependencies, ensuring modules are cohesive, and confirming that the design can be implemented with available technologies. The architecture is documented in a format that the Programmer agent can use to generate code, including module definitions, APIs, and inter-module communication patterns.
Uses an LLM-based CTO agent to design architecture with implicit feasibility validation rather than using formal architecture description languages — the design is expressed in natural language and validated through reasoning rather than formal methods
More interpretable than automated architecture synthesis tools (which may produce opaque designs) but less formally verified than architecture frameworks using formal specification languages
multi-language code generation with language-specific patterns
Medium confidence: Generates executable code in multiple programming languages (Python, JavaScript, Java, C++, etc.) by using language-specific code generation templates and patterns. The system understands language idioms, standard libraries, and framework conventions for each target language, producing idiomatic code rather than direct translations. The Programmer agent selects appropriate language features and design patterns based on the target language's strengths.
Generates language-idiomatic code rather than language-agnostic code translated to each language — the system understands language-specific patterns, standard libraries, and conventions for each target language
More idiomatic than template-based code generation (which produces generic code) but requires more LLM knowledge per language; more flexible than single-language generators but harder to maintain
conversation-based refinement and clarification
Medium confidence: Agents can request clarification from users or other agents when specifications are ambiguous or incomplete. The system maintains a conversation interface where agents ask questions, users provide answers, and those answers are incorporated into the specification. This creates an iterative refinement loop where the system progressively clarifies requirements and specifications through dialogue rather than requiring complete specifications upfront.
Uses agents to actively ask clarification questions rather than passively accepting incomplete specifications — the system drives the conversation to gather missing information
More interactive than batch specification processing but requires user availability; more flexible than rigid specification templates but less structured than formal requirement elicitation
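The refinement loop above can be sketched as a driver that keeps asking until no open questions remain; `ask_user` is a stand-in for the interactive interface, and the dict-based spec shape is an assumption of this sketch:

```python
def refine(spec, ask_user):
    """Drive clarification: spec is a dict with 'requirements' and
    'open_questions' lists; ask_user(question) returns the user's answer."""
    while spec["open_questions"]:
        question = spec["open_questions"].pop(0)
        answer = ask_user(question)                  # user fills the gap
        spec["requirements"].append(f"{question} -> {answer}")
    return spec

spec = {"requirements": ["todo app"],
        "open_questions": ["web or desktop?", "persist tasks?"]}
refined = refine(spec, ask_user=lambda q: "yes" if "persist" in q else "web")
```

The agent, not the user, drives the loop — which is the key difference from batch specification processing, at the cost of requiring the user to be available.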
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with Paper - ChatDev: Communicative Agents for Software Development, ranked by overlap. Discovered automatically through the match graph.
Phidata
Agent framework with memory, knowledge, tools — function calling, RAG, multi-agent teams.
Agents
Library/framework for building language agents
AgentDock
Unified infrastructure for AI agents and automation. One API key for all services instead of managing dozens. Build production-ready agents without operational complexity.
pro-workflow
Claude Code learns from your corrections: self-correcting memory that compounds over 50+ sessions. Context engineering, parallel worktrees, agent teams, and 17 battle-tested skills.
PraisonAI
A framework for building multi-agent AI systems with workflows, tool integrations, and memory. #opensource
Semantic Kernel
Microsoft's SDK for integrating LLMs into apps — plugins, planners, and memory in C#/Python/Java.
Best For
- ✓ teams prototyping rapid MVP generation workflows
- ✓ researchers studying multi-agent collaboration patterns
- ✓ developers building LLM-based software factories
- ✓ developers building transparent multi-agent systems
- ✓ teams needing audit trails of AI-driven development decisions
- ✓ organizations enforcing SDLC governance through AI
- ✓ teams building code generation tools that require quality gates
Known Limitations
- ⚠ Agent coordination adds latency — each phase requires sequential agent turns, making total generation time 5-10x longer than single-model generation
- ⚠ No persistent state management between runs — context is ephemeral within a single generation session
- ⚠ Quality degrades on complex requirements (>500 tokens) due to context window constraints in agent communication
- ⚠ No built-in rollback or iterative refinement — if an agent produces incorrect output, entire downstream phases are affected
- ⚠ Context window management is manual — no automatic summarization, so long projects may exceed token limits
- ⚠ Message ordering is strictly sequential — no parallel agent execution or concurrent task handling