colang-based dialog flow definition and state machine execution
Defines conversational flows using Colang, a domain-specific language that compiles to state machines for managing dialog turns, branching logic, and context transitions. The Colang 2.x runtime executes these flows as event-driven state machines, processing user messages through defined states and triggering actions based on flow conditions. This enables declarative specification of multi-turn conversations without imperative control flow.
Unique: Uses a custom DSL (Colang) that compiles to event-driven state machines rather than relying on generic workflow engines; Colang 2.x introduces a complete rewrite with improved state semantics and event processing compared to 1.0
vs alternatives: More expressive than rule-based dialog systems and more maintainable than hand-coded state machines, but requires learning a new language unlike generic orchestration frameworks
multi-stage input/output/dialog/retrieval/tool rails pipeline
Implements a configurable pipeline of safety and constraint enforcement layers that process requests before LLM invocation (input rails), after LLM generation (output rails), during dialog turns (dialog rails), before retrieval operations (retrieval rails), and around tool calls (tool rails). Each rail stage can apply custom validators, filters, and transformations using Python actions or LLM-based checks, enabling fine-grained control over what enters and exits the LLM.
Unique: Implements a staged pipeline architecture with separate rail types (input/output/dialog/retrieval/tool) rather than a monolithic filter, allowing different safety policies at different points in the request lifecycle; supports both rule-based and LLM-based enforcement
vs alternatives: More comprehensive than single-stage content filters and more flexible than hardcoded safety checks, but requires more configuration than simple prompt-based safety approaches
embeddings and vector store integration for rag and semantic search
Integrates with embedding models (OpenAI, Hugging Face, local models) and vector stores (Chroma, Pinecone, FAISS) to support semantic search and retrieval-augmented generation (RAG). Handles embedding generation, vector storage, similarity search, and result ranking. Supports both in-memory and persistent vector stores, enabling guardrails to retrieve relevant context for fact-checking, topic validation, and knowledge-based responses.
Unique: Integrates embeddings and vector stores as first-class components in guardrails, enabling semantic search and fact-checking without requiring separate RAG frameworks; supports multiple embedding models and vector store backends
vs alternatives: More integrated than generic RAG libraries and more flexible than hardcoded knowledge bases, but requires careful tuning of embedding models and similarity thresholds
observability and tracing with span management and llm call tracking
Provides built-in observability through span-based tracing that tracks request flow, LLM calls, action execution, and rail decisions. Integrates with OpenTelemetry for distributed tracing, logs detailed execution traces, and supports exporting traces to external systems (Datadog, Jaeger, etc.). Enables debugging of complex guardrail flows and performance monitoring of LLM calls.
Unique: Implements span-based tracing integrated with OpenTelemetry rather than simple logging, enabling distributed tracing across microservices and detailed performance analysis of guardrail execution
vs alternatives: More comprehensive than basic logging and more integrated than external monitoring tools, but adds complexity and overhead compared to simple print statements
langchain integration with custom chain and agent support
Provides seamless integration with LangChain chains and agents, allowing guardrails to wrap LangChain components or be wrapped by them. Supports using LangChain tools within guardrails, integrating guardrails into LangChain agent loops, and sharing context between guardrails and chains. Enables building complex agentic systems with guardrails applied at multiple points in the execution flow.
Unique: Provides first-class LangChain integration that allows guardrails to wrap chains or be wrapped by them, rather than requiring manual integration code; supports bidirectional context passing
vs alternatives: More integrated than generic wrapper patterns and more flexible than LangChain's built-in safety features, but requires understanding both frameworks
cli tools for configuration validation, testing, and deployment
Provides command-line tools for validating guardrail configurations, running tests, generating documentation, and deploying guardrails. Includes commands for checking YAML syntax, validating Colang flows, running test suites, and generating API documentation. Enables CI/CD integration and local development workflows without requiring Python code.
Unique: Provides dedicated CLI tools for guardrail-specific operations (config validation, Colang testing) rather than relying on generic Python testing frameworks; enables non-Python users to validate configurations
vs alternatives: More convenient than writing Python test code and more integrated than generic YAML validators, but less flexible than programmatic testing
llm-based self-check mechanisms for hallucination and jailbreak detection
Uses secondary LLM calls to validate outputs and detect attacks through structured prompting. Implements jailbreak detection by analyzing user inputs against known attack patterns, and hallucination detection by having the LLM verify its own outputs against retrieved facts or user context. These checks run asynchronously or synchronously depending on configuration, using the same LLM provider or a separate safety-focused model.
Unique: Implements LLM-based validation as a first-class rail type with support for specialized safety models (Nemotron Safety Guard, Nemotron Content Safety) rather than relying solely on rule-based detection; includes reasoning trace extraction for explainability
vs alternatives: More context-aware than regex/keyword-based jailbreak detection, but slower and more expensive than rule-based approaches; more reliable than single-model safety but requires careful prompt design
topic control and content safety classification with embeddings
Uses semantic embeddings (via configurable embedding models) to classify user messages and LLM outputs against allowed topics and content categories. Compares input/output embeddings against a knowledge base of topic examples or safety categories, using cosine similarity thresholds to determine if content is on-topic or violates safety policies. This enables semantic understanding beyond keyword matching, supporting nuanced topic boundaries and content policies.
Unique: Implements semantic topic control via embeddings rather than keyword lists or regex patterns, allowing nuanced topic boundaries; integrates with configurable embedding models and vector stores for scalable topic management
vs alternatives: More semantically aware than keyword-based topic filtering and more flexible than rule-based systems, but requires careful example curation and threshold tuning unlike supervised classification models
+6 more capabilities