hybrid keyword-semantic search with weighted fusion
Executes simultaneous full-text and vector similarity searches, then combines results using a configurable semanticRatio parameter that weights keyword relevance against semantic similarity. The milli crate maintains separate inverted indexes (word_docids, word_pair_proximity_docids) for keyword matching and arroy vector stores for embedding-based retrieval, with fusion logic that merges ranked result sets at query time. This dual-index approach enables applications to balance exact-match precision with semantic understanding without requiring separate search infrastructure.
Unique: Uses weighted fusion of separate inverted indexes (for keyword) and arroy vector stores (for semantic) with configurable semanticRatio parameter, enabling per-index tuning of keyword vs. semantic weight without requiring external ranking services or re-indexing
vs alternatives: Typically lower-latency than Elasticsearch's hybrid search for search-as-you-type workloads because the Rust-based milli engine builds both index types at ingest time and performs only lightweight score fusion at query time; Meilisearch targets sub-50ms search responses
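The fusion idea can be sketched in a few lines: take normalized scores from both result sets and blend them with the ratio. This is a simplified illustration of what a semanticRatio-style weighting does, not milli's actual fusion code, and the document ids and scores are invented:

```python
def fuse(keyword_hits, semantic_hits, semantic_ratio):
    """Combine two ranked result sets into one ordered list, weighting
    semantic similarity against keyword relevance by semantic_ratio (0.0-1.0).
    Each input maps a document id to a score normalized to [0, 1]."""
    doc_ids = set(keyword_hits) | set(semantic_hits)
    fused = {
        doc: (1.0 - semantic_ratio) * keyword_hits.get(doc, 0.0)
             + semantic_ratio * semantic_hits.get(doc, 0.0)
        for doc in doc_ids
    }
    # Highest combined score first.
    return sorted(fused, key=fused.get, reverse=True)

keyword = {"doc1": 0.9, "doc2": 0.4}    # full-text relevance scores
semantic = {"doc2": 0.95, "doc3": 0.8}  # vector-similarity scores
print(fuse(keyword, semantic, semantic_ratio=0.5))
# ['doc2', 'doc1', 'doc3']
```

With semantic_ratio=0.0 the result degenerates to pure keyword ranking; with 1.0 it becomes pure vector search, which matches how the parameter is described above.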
asynchronous task-based document indexing with automatic batching
All write operations (document additions, deletions, index creation, settings changes) are enqueued as tasks in the IndexScheduler, which batches and processes them asynchronously in the background. The scheduler implements intelligent batching logic that groups related operations (e.g., multiple document upserts) into single indexing jobs, reducing overhead and improving throughput. Documents flow through a parallel extraction pipeline in the milli crate that tokenizes text via charabia, builds inverted indexes, and creates vector indexes using arroy, with progress tracked via task status endpoints.
Unique: IndexScheduler implements intelligent automatic batching of write operations with configurable batch sizes and timeouts, processing multiple document updates as single indexing jobs to amortize overhead, rather than indexing each operation individually like traditional search engines
vs alternatives: More efficient than Solr's update handlers because Meilisearch batches writes automatically and processes them in parallel via the milli crate's extraction pipeline, achieving higher document throughput without manual batch size tuning
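A toy sketch of the auto-batching idea: consecutive queued operations of the same kind against the same index are merged into one indexing job. The task shape and grouping rule here are simplified assumptions for illustration, not IndexScheduler's actual batching criteria:

```python
from itertools import groupby

def batch_tasks(queue):
    """Group consecutive queued write operations that target the same index
    and share the same kind (e.g. 'documentAddition') into single batches,
    so one indexing pass amortizes the per-operation overhead."""
    batches = []
    for (index, kind), tasks in groupby(queue, key=lambda t: (t["index"], t["kind"])):
        batches.append({
            "index": index,
            "kind": kind,
            "task_uids": [t["uid"] for t in tasks],
        })
    return batches

queue = [
    {"uid": 0, "index": "movies", "kind": "documentAddition"},
    {"uid": 1, "index": "movies", "kind": "documentAddition"},
    {"uid": 2, "index": "movies", "kind": "settingsUpdate"},
]
for b in batch_tasks(queue):
    print(b["kind"], b["task_uids"])
# documentAddition [0, 1]
# settingsUpdate [2]
```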
restful http api with openapi specification
Exposes all search, indexing, and administrative functionality through a RESTful HTTP API built on actix-web, with complete OpenAPI 3.0 specification for API documentation and client generation. The API follows REST conventions for resource management (indexes, documents, tasks) with standard HTTP methods (GET, POST, PUT, DELETE) and status codes. The OpenAPI spec is automatically validated and published, enabling API-first development and integration with API documentation tools.
Unique: Provides complete OpenAPI 3.0 specification with automated validation and publication, enabling API-first development and client generation in multiple languages, with actix-web HTTP server handling all REST operations (search, indexing, task management)
vs alternatives: More developer-friendly than Elasticsearch's REST API because Meilisearch's OpenAPI spec is automatically validated and published, and the API is simpler and more consistent, reducing the learning curve for new integrations
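The REST conventions can be seen in how the core resources map to plain HTTP verbs on noun paths. The endpoint paths below follow Meilisearch's documented routes; the base URL and API key are placeholders, and the requests are only built, never sent:

```python
from urllib.request import Request

BASE = "http://localhost:7700"  # Meilisearch's default listen address

def meili_request(method, path, api_key="MASTER_KEY"):
    """Build (but do not send) an HTTP request against the REST API."""
    return Request(
        f"{BASE}{path}",
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Standard verbs on resource paths: indexes, documents, search, tasks.
create_index  = meili_request("POST",   "/indexes")
add_documents = meili_request("POST",   "/indexes/movies/documents")
search        = meili_request("POST",   "/indexes/movies/search")
get_task      = meili_request("GET",    "/tasks/0")
delete_index  = meili_request("DELETE", "/indexes/movies")

print(search.get_method(), search.full_url)
```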
task queue and webhook notifications for asynchronous operations
Implements a task queue system where all write operations are enqueued and processed asynchronously, with webhook support for notifying external systems when tasks complete. The IndexScheduler manages the task queue, persisting task state to LMDB and processing tasks in batches. Applications can poll task status endpoints or subscribe to webhooks to receive completion notifications, enabling event-driven architectures where indexing completion triggers downstream processes (e.g., cache invalidation, analytics updates).
Unique: Combines task queue persistence in LMDB with webhook notifications for asynchronous operation completion, enabling event-driven architectures where indexing completion automatically triggers downstream processes without polling
vs alternatives: More integrated than Elasticsearch's task management because Meilisearch's webhooks are built into the core task system, whereas Elasticsearch requires external monitoring tools or custom polling logic
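The polling half of this pattern looks like the sketch below; `get_status` stands in for a GET /tasks/{uid} call and is stubbed here with a canned sequence of states. A webhook subscription removes the need for this loop entirely by pushing the completion event instead:

```python
import time

TERMINAL = {"succeeded", "failed", "canceled"}

def wait_for_task(get_status, task_uid, interval=0.0, timeout=5.0):
    """Poll a task-status lookup until the task reaches a terminal state,
    or raise if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(task_uid)
        if status in TERMINAL:
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_uid} still pending")

# Stub: the task is enqueued, then processing, then done.
states = iter(["enqueued", "processing", "succeeded"])
print(wait_for_task(lambda uid: next(states), task_uid=42))
# succeeded
```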
dump and export functionality for backup and migration
Provides dump and export endpoints that serialize the entire index state (documents, settings, tasks) to a portable format that can be restored on another Meilisearch instance. Dumps include all index metadata, documents, and task history, enabling point-in-time backups and zero-downtime migrations between servers. The dump format is version-aware, allowing upgrades between Meilisearch versions with automatic schema migration.
Unique: Provides version-aware dump format that includes documents, settings, and task history, enabling point-in-time backups and zero-downtime migrations with automatic schema migration between Meilisearch versions
vs alternatives: Simpler than Elasticsearch snapshots because Meilisearch dumps are self-contained files that can be restored on any instance, whereas Elasticsearch snapshots require shared repository configuration and cluster coordination
configurable ranking rules and relevance tuning
Allows customization of document ranking through a configurable ranking rules system that applies multiple ranking criteria in sequence (e.g., exact match, word proximity, attribute position, typo count, sort order). Rules are evaluated in order, with earlier rules taking precedence, enabling fine-grained control over relevance without modifying the search algorithm. The ranking system supports both built-in rules and custom sort expressions, allowing applications to tune relevance based on business logic (e.g., boosting bestsellers, deprioritizing out-of-stock items).
Unique: Implements configurable ranking rules that are evaluated in sequence with earlier rules taking precedence, enabling fine-grained relevance tuning through rule ordering rather than algorithm modification, with support for custom sort expressions
vs alternatives: More transparent than Elasticsearch's BM25 scoring because Meilisearch's ranking rules are explicit and configurable, whereas Elasticsearch's relevance is determined by complex scoring formulas that are harder to understand and tune
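The sequential-precedence semantics can be illustrated with ordinary tuple sorting: the first rule dominates, and later rules only break ties. This is a toy model with invented rule functions and documents, not milli's actual bucket-based implementation:

```python
def rank(documents, rules):
    """Order documents by applying ranking rules in sequence. Each rule maps
    a document to a sortable key (lower = better); a stand-in for built-in
    rules like typo count and custom sorts like price:asc."""
    return sorted(documents, key=lambda doc: tuple(rule(doc) for rule in rules))

docs = [
    {"id": "a", "typos": 1, "price": 10},
    {"id": "b", "typos": 0, "price": 30},
    {"id": "c", "typos": 0, "price": 20},
]
rules = [
    lambda d: d["typos"],   # fewer typos first (like the built-in 'typo' rule)
    lambda d: d["price"],   # then cheaper first (like a custom 'price:asc')
]
print([d["id"] for d in rank(docs, rules)])
# ['c', 'b', 'a']
```

Swapping the two rules would reorder the results, which is exactly the kind of relevance tuning described above: change the rule sequence, not the algorithm.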
instant search ui integration with javascript sdk
Integrates with the InstantSearch.js library through the instant-meilisearch adapter, enabling rapid development of search-as-you-type interfaces with minimal code. InstantSearch handles result rendering, facet management, and pagination, with bindings for popular UI frameworks (React, Vue, Angular), while the adapter translates its requests into Meilisearch API calls. Together they abstract away HTTP request management and provide reactive components that automatically update as users interact with search filters and input.
Unique: Ships the instant-meilisearch adapter, which plugs Meilisearch into InstantSearch.js's pre-built reactive components for search, facets, and pagination, abstracting HTTP request management and enabling rapid UI development with minimal boilerplate in React, Vue, or Angular
vs alternatives: Faster to implement than a custom Elasticsearch integration because the instant-meilisearch adapter reuses InstantSearch.js's pre-built components and handles request management out of the box, whereas Elasticsearch has no officially supported InstantSearch adapter and requires custom UI development
typo-tolerant full-text search with configurable distance thresholds
Implements typo tolerance in the milli engine: queries are tokenized and normalized by the charabia library, then expanded via fuzzy matching against the indexed word dictionary using configurable Levenshtein distance thresholds (the typoTolerance setting), allowing queries like 'speling' to match 'spelling'. Tolerance is applied at the token level during query expansion, where the engine generates candidate words within the distance threshold and retrieves documents containing any of those variants.
Unique: Combines charabia tokenization and normalization with Levenshtein distance-based fuzzy matching applied at query-expansion time, using word-length-adaptive thresholds (shorter words get stricter tolerance) that can be tuned per index rather than a fixed global threshold
vs alternatives: Easier to get right than Elasticsearch's fuzzy query because charabia normalizes language-specific character variations and the engine applies length-adaptive distance thresholds by default, reducing false positives while maintaining recall on genuine typos, without per-query fuzziness parameters
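The length-adaptive thresholding can be sketched as follows. The 5- and 9-character cut-offs mirror Meilisearch's documented minWordSizeForTypos defaults, while the matching logic itself is a simplified stand-in for the engine's automaton-based query expansion:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def max_typos(word, one_typo=5, two_typos=9):
    """Length-adaptive threshold: short words allow no typos, medium words
    one, long words two (default cut-offs: 5 and 9 characters)."""
    if len(word) >= two_typos:
        return 2
    if len(word) >= one_typo:
        return 1
    return 0

def matches(query_word, indexed_word):
    return levenshtein(query_word, indexed_word) <= max_typos(query_word)

print(matches("speling", "spelling"))  # True: 7 chars allow 1 typo
print(matches("cat", "bat"))           # False: 3 chars allow 0 typos
```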
+7 more capabilities