via “document preprocessing and embedding with pluggable converters and embedders”
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and
Unique: Implements document processing as a composable pipeline of converters, splitters, and embedders that can be chained and reused. Supports 10+ file formats natively and allows custom converters for domain-specific formats. Metadata is preserved through the pipeline and attached to chunks, enabling filtered retrieval.
vs others: More flexible than LlamaIndex's document loaders because splitting and embedding are separate, swappable stages; more comprehensive than LangChain's text splitters because it includes format-specific converters and metadata preservation.