binary function analysis and annotation via llm-assisted disassembly
Uses Ghidra's native disassembler and decompiler to extract function boundaries, control flow graphs, and pseudocode, then passes structured representations to LLMs for semantic analysis and naming. Uses Ghidra's Java API to traverse the program database (Ghidra's own project database, not a Microsoft PDB symbol file), extract function signatures, and write AI-generated annotations back to the program without manual re-analysis.
Unique: Directly integrates with Ghidra's Java API and program database to extract and re-annotate binaries in-place, avoiding export/import cycles and preserving analysis state across sessions
vs alternatives: Integrates more tightly with Ghidra than standalone tools such as Cutter or plugins written for IDA, enabling bidirectional annotation flow and access to Ghidra's full decompilation pipeline
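The extract, prompt, and annotate loop described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: `FunctionRecord`, `build_prompt`, and `apply_suggestion` are hypothetical names, and the record is filled by hand here where a real plugin would populate it from Ghidra.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class FunctionRecord:
    """Hypothetical container for what the server extracts per function."""
    address: int
    name: str
    pseudocode: str
    callees: list


def build_prompt(fn: FunctionRecord) -> str:
    """Serialize one function into a compact JSON payload for the model."""
    payload = {
        "address": hex(fn.address),
        "current_name": fn.name,
        "callees": fn.callees,
        "pseudocode": fn.pseudocode,
    }
    return "Suggest a descriptive name for this function:\n" + json.dumps(payload, indent=2)


_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def apply_suggestion(fn: FunctionRecord, suggested: str) -> bool:
    """Accept the model's name only if it is a plain C-style identifier."""
    if not _IDENTIFIER.fullmatch(suggested):
        return False
    fn.name = suggested  # assumption: a real plugin would write this back via Ghidra
    return True


fn = FunctionRecord(0x401000, "FUN_00401000", "return strlen(param_1);", ["strlen"])
prompt = build_prompt(fn)
accepted = apply_suggestion(fn, "get_string_length")
rejected = apply_suggestion(fn, "rm -rf /")
```

Validating the model's output before writing it back keeps malformed or adversarial suggestions out of the program database.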
cross-reference graph traversal and data-flow tracing
Exposes Ghidra's reference graph (xrefs) as queryable MCP tools, allowing LLMs to trace data flow, call chains, and memory access patterns across the binary. Implements depth-limited graph traversal to keep the result set from growing combinatorially, with support for filtering by reference type (read, write, call, flow) and scope (function-local, module-wide, global).
Unique: Implements lazy graph expansion with configurable depth limits and reference-type filtering, allowing LLMs to iteratively explore relationships without overwhelming context or hitting API limits
vs alternatives: More granular control over graph traversal than Ghidra's GUI-based xref viewer, enabling programmatic exploration suitable for LLM-driven analysis loops
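A depth-limited, type-filtered traversal of this kind might look like the sketch below. The `XREFS` edge list and the `traverse` function are illustrative assumptions. A real implementation would read edges from Ghidra's reference manager, not from a dictionary.

```python
from collections import deque

# Hypothetical edge list: source -> [(target, reference_type), ...]
XREFS = {
    "main": [("parse_args", "call"), ("g_config", "read")],
    "parse_args": [("g_config", "write"), ("strtol", "call")],
    "strtol": [("errno", "write")],
}


def traverse(graph, start, max_depth=2, ref_types=None):
    """Breadth-first walk that stops at max_depth and skips unwanted edge types."""
    seen = {start}
    queue = deque([(start, 0)])
    edges = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes sitting at the depth limit
        for target, ref_type in graph.get(node, []):
            if ref_types is not None and ref_type not in ref_types:
                continue
            edges.append((node, target, ref_type, depth + 1))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return edges


calls_only = traverse(XREFS, "main", max_depth=2, ref_types={"call"})
shallow = traverse(XREFS, "main", max_depth=1)
```

Returning the depth with each edge lets a caller request one more level only for the branches that look interesting.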
interactive llm-guided reverse engineering with multi-turn context
Maintains conversation context across multiple analysis queries, allowing LLMs to build understanding incrementally. Implements context management to track analyzed functions, inferred types, and previous findings, enabling coherent multi-turn analysis workflows without redundant re-analysis.
Unique: Maintains stateful analysis context across turns, enabling LLMs to build understanding incrementally without re-analyzing previously-examined code
vs alternatives: Stateful context management enables more natural conversational analysis than stateless query-response patterns
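One way to picture the stateful context is a small per-session store that records findings and skips work already done. The `AnalysisContext` class below is a hypothetical sketch, and `fake_analyzer` stands in for whatever analysis call the server actually makes.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisContext:
    """Hypothetical per-session state carried between conversation turns."""
    summaries: dict = field(default_factory=dict)       # address -> finding
    inferred_types: dict = field(default_factory=dict)  # symbol -> type string
    analysis_calls: int = 0

    def analyze(self, address, analyzer):
        """Run analyzer once per address; later turns reuse the stored finding."""
        if address not in self.summaries:
            self.analysis_calls += 1
            self.summaries[address] = analyzer(address)
        return self.summaries[address]

    def digest(self, limit=5):
        """Short recap of recent findings to prepend to the next prompt."""
        recent = list(self.summaries.items())[-limit:]
        return "\n".join(f"{addr:#x}: {text}" for addr, text in recent)


def fake_analyzer(addr):
    # Stand-in for an expensive decompile-and-summarize step.
    return f"function at {addr:#x} parses a header"


ctx = AnalysisContext()
first = ctx.analyze(0x401000, fake_analyzer)
second = ctx.analyze(0x401000, fake_analyzer)  # served from context
ctx.inferred_types["param_1"] = "struct header *"
```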
architecture and calling convention detection with function signature inference
Detects binary architecture (x86, ARM, MIPS, etc.) and calling convention (cdecl, stdcall, fastcall, etc.) using Ghidra's analysis, then infers function signatures from parameter-passing patterns. Generates typed function prototypes suitable for re-implementation or API documentation.
Unique: Infers function signatures from parameter passing patterns and calling convention analysis, enabling generation of type-safe prototypes without manual annotation
vs alternatives: Automated signature inference requires less effort than defining each prototype by hand
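As a simplified illustration of inferring a signature from parameter-passing patterns, the sketch below counts integer parameters from the argument registers a function reads before writing. The register orders for System V AMD64 and Microsoft x64 are standard. The function names, the `regs_read_before_write` input, and the placeholder `long` types are assumptions, and stack and floating-point arguments are ignored.

```python
# Integer argument registers in passing order for two well-known conventions.
ARG_REGISTERS = {
    "sysv_amd64": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
    "ms_x64": ["rcx", "rdx", "r8", "r9"],
}


def infer_param_count(convention, regs_read_before_write):
    """Count parameters up to the last argument register the function reads
    before writing, since arguments are assigned to registers in order."""
    order = ARG_REGISTERS[convention]
    used = [i for i, reg in enumerate(order) if reg in regs_read_before_write]
    return max(used) + 1 if used else 0


def make_prototype(name, convention, regs_read_before_write, return_type="long"):
    """Emit a C-style prototype with placeholder integer-sized parameters."""
    count = infer_param_count(convention, regs_read_before_write)
    params = ", ".join(f"long param_{i + 1}" for i in range(count)) or "void"
    return f"{return_type} {name}({params});"


proto = make_prototype("FUN_00401000", "sysv_amd64", {"rdi", "rdx"})
no_args = make_prototype("FUN_00402000", "ms_x64", set())
```

Here `rdx` is read but `rsi` is not, so the sketch still reports three parameters: an unused second argument does not shift the position of the third.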
obfuscation detection and deobfuscation assistance
Detects common obfuscation techniques (control flow flattening, dead code injection, string encryption, etc.) using pattern matching and heuristics. Provides deobfuscation hints and assists LLMs in understanding obfuscated code by highlighting suspicious patterns and suggesting analysis strategies.
Unique: Combines pattern detection with heuristic analysis to identify obfuscation techniques and provide deobfuscation guidance, rather than just flagging suspicious code
vs alternatives: Provides actionable deobfuscation hints alongside detection, enabling LLMs to assist in understanding obfuscated code
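Two heuristics of the kind described could be sketched as follows: byte entropy as a hint for encrypted strings, and a fan-in/fan-out check for the dispatcher block typical of control flow flattening. The function names, the `min_fan` threshold, and the toy CFG are illustrative assumptions, not the tool's actual detectors.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Bits per byte; values near 8 suggest encrypted or compressed data."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def flattening_candidates(cfg, min_fan=4):
    """Flag blocks that many blocks jump to and that jump to many blocks,
    the shape of a control-flow-flattening dispatcher."""
    incoming = Counter(dst for dsts in cfg.values() for dst in dsts)
    return [
        block for block, dsts in cfg.items()
        if len(dsts) >= min_fan and incoming[block] >= min_fan
    ]


# Hypothetical CFG: block -> successor blocks
cfg = {
    "entry": ["dispatch"],
    "dispatch": ["b1", "b2", "b3", "b4"],
    "b1": ["dispatch"], "b2": ["dispatch"], "b3": ["dispatch"], "b4": ["exit"],
    "exit": [],
}
suspects = flattening_candidates(cfg)
```

Both checks produce false positives (ordinary switch statements, compressed resources), so their output is better treated as a hint for the LLM than as a verdict.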
decompilation-to-pseudocode extraction with language-specific formatting
Wraps Ghidra's decompiler to extract high-level pseudocode for functions, with options to format output as C, Python, or pseudo-assembly for different analysis contexts. Handles decompiler failures gracefully by falling back to raw disassembly, and caches decompilation results to avoid redundant computation.
Unique: Offers multiple output formats (C, Python, pseudo-assembly) optimized for different LLM comprehension profiles, rather than single-format decompilation output
vs alternatives: More flexible output formatting than Ghidra's native decompiler, enabling downstream LLM processing without manual syntax conversion
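The cache-then-fallback behavior can be illustrated with a small wrapper. In this sketch `get_listing` is a hypothetical name, and `fake_decompile` and `fake_disassemble` are stand-ins for the real Ghidra calls, which are passed in as plain callables.

```python
def get_listing(address, decompile, disassemble, cache):
    """Return (kind, text), preferring cached or freshly decompiled output and
    falling back to disassembly if the decompiler raises or returns nothing."""
    if address in cache:
        return cache[address]
    try:
        text = decompile(address)
        result = ("c", text) if text else ("asm", disassemble(address))
    except Exception:
        result = ("asm", disassemble(address))
    cache[address] = result
    return result


calls = []


def fake_decompile(address):
    # Stand-in decompiler that fails for one address.
    calls.append(address)
    if address == 0x402000:
        raise RuntimeError("decompiler timed out")
    return "int f(void) { return 0; }"


def fake_disassemble(address):
    return "xor eax, eax\nret"


cache = {}
ok = get_listing(0x401000, fake_decompile, fake_disassemble, cache)
again = get_listing(0x401000, fake_decompile, fake_disassemble, cache)
fallback = get_listing(0x402000, fake_decompile, fake_disassemble, cache)
```

Tagging each result with its kind lets the caller tell the LLM whether it is reading pseudocode or raw assembly.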
memory layout and data structure inference from binary
Queries Ghidra's type inference results and data-type definitions to extract inferred struct layouts, class hierarchies, and memory organization. Reconstructs data structures from memory access patterns and type annotations, exposing them as queryable JSON schemas for LLM-driven reverse engineering of complex data types.
Unique: Exposes Ghidra's internal type inference engine as queryable MCP tools, allowing LLMs to iteratively refine type understanding through multi-turn analysis
vs alternatives: Programmatic access to Ghidra's type system is rare; most tools require manual struct definition or export/import workflows
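Reconstructing a layout from memory access patterns can be approximated as follows. This sketch assumes the accesses have already been collected as `(offset, size)` pairs relative to one base pointer. `infer_struct` and the generated field names are hypothetical.

```python
import json


def infer_struct(name, accesses):
    """Build a field list from (offset, size) pairs observed relative to one
    base pointer, filling unobserved gaps with explicit padding entries."""
    fields, cursor = [], 0
    for offset, size in sorted(set(accesses)):
        if offset < cursor:
            continue  # overlapping access; keep the earlier field
        if offset > cursor:
            fields.append({"offset": cursor, "size": offset - cursor, "name": f"pad_{cursor:x}"})
        fields.append({"offset": offset, "size": size, "name": f"field_{offset:x}"})
        cursor = offset + size
    return {"name": name, "size": cursor, "fields": fields}


layout = infer_struct("struct_401000", [(0, 4), (8, 8), (0, 4), (16, 2)])
print(json.dumps(layout, indent=2))
```

The reported size is only a lower bound, because trailing fields that the analyzed code never touches leave no trace in the access list.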
string and constant extraction with context and usage analysis
Scans the binary for embedded strings, numeric constants, and immediate values, then correlates them with their usage sites (function calls, memory writes, comparisons). Returns structured data including string encoding (ASCII, UTF-16, etc.), cross-references, and inferred purpose based on context.
Unique: Correlates strings with their usage context (function calls, memory operations) and infers purpose based on surrounding code patterns, rather than returning isolated string lists
vs alternatives: More contextual than simple string dumping tools; provides usage analysis that helps LLMs understand string significance
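A bare-bones version of string extraction with encoding detection and usage correlation might look like this. The regular expressions cover only printable ASCII and ASCII-range UTF-16LE with a minimum length of four characters. `extract_strings` and the `xrefs` map are hypothetical stand-ins for data the server would obtain from Ghidra.

```python
import re

ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")


def extract_strings(blob, xrefs):
    """Find printable ASCII and UTF-16LE runs and attach known usage sites.
    xrefs is a hypothetical map of string offset -> referencing function names."""
    found = []
    for encoding, pattern in (("ascii", ASCII_RUN), ("utf-16-le", UTF16_RUN)):
        for match in pattern.finditer(blob):
            found.append({
                "offset": match.start(),
                "encoding": encoding,
                "value": match.group().decode(encoding),
                "used_by": xrefs.get(match.start(), []),
            })
    return sorted(found, key=lambda s: s["offset"])


blob = b"\x01\x02/etc/passwd\x00\x03\x04" + "key=".encode("utf-16-le") + b"\xff"
strings = extract_strings(blob, {2: ["read_users"]})
```

Attaching `used_by` to each record is what turns a plain string dump into something an LLM can reason about, for example that a path is read by a user-handling routine.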