Local Memory Storage With SQLite And Embeddings

1

Semantic Kernel | Framework | 74/100

via “vector-based semantic memory with pluggable embedding and storage backends”

Microsoft's SDK for integrating LLMs into apps — plugins, planners, and memory in C#/Python/Java.

Unique: Implements a two-tier abstraction (IEmbeddingGenerationService + IMemoryStore) that fully decouples embedding generation from vector storage, allowing independent provider selection. This is more modular than LangChain's VectorStore pattern, which couples embedding and storage, and provides better multi-backend support than LlamaIndex's single-backend approach. Exposes memory operations as kernel plugins (TextMemoryPlugin) for native integration with function calling.

vs others: More flexible than LangChain's tightly-coupled embedding+storage pattern, and better integrated with function calling than LlamaIndex, though with less mature vector store support compared to LangChain's ecosystem of 20+ integrations.
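The two-tier split described in this entry can be sketched in Python as two small interfaces plus a class that composes them. This is an illustration of the pattern only, not Semantic Kernel's API: `EmbeddingGenerator`, `MemoryStore`, `CharCountEmbedder`, `InMemoryStore`, and `SemanticMemory` are hypothetical names, and the letter-count embedder is a toy stand-in for a real embedding model.

```python
import math
from typing import Protocol, Sequence


class EmbeddingGenerator(Protocol):
    """Tier 1: turns text into a vector. Hypothetical interface."""
    def embed(self, text: str) -> list[float]: ...


class MemoryStore(Protocol):
    """Tier 2: persists vectors and answers nearest-neighbour queries."""
    def upsert(self, key: str, text: str, vector: Sequence[float]) -> None: ...
    def nearest(self, vector: Sequence[float], limit: int) -> list[tuple[str, float]]: ...


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CharCountEmbedder:
    """Toy embedder: a 26-dimensional vector of letter counts."""
    def embed(self, text: str) -> list[float]:
        counts = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                counts[ord(ch) - ord("a")] += 1.0
        return counts


class InMemoryStore:
    """Toy store: a dict scanned linearly on every query."""
    def __init__(self) -> None:
        self._rows: dict[str, tuple[str, list[float]]] = {}

    def upsert(self, key: str, text: str, vector: Sequence[float]) -> None:
        self._rows[key] = (text, list(vector))

    def nearest(self, vector: Sequence[float], limit: int) -> list[tuple[str, float]]:
        scored = [(key, cosine(vector, vec)) for key, (_, vec) in self._rows.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:limit]


class SemanticMemory:
    """Composes any embedder with any store; neither knows about the other."""
    def __init__(self, embedder: EmbeddingGenerator, store: MemoryStore) -> None:
        self.embedder = embedder
        self.store = store

    def save(self, key: str, text: str) -> None:
        self.store.upsert(key, text, self.embedder.embed(text))

    def search(self, query: str, limit: int = 3) -> list[tuple[str, float]]:
        return self.store.nearest(self.embedder.embed(query), limit)
```

Swapping the storage backend then means passing a different `MemoryStore` implementation to `SemanticMemory`, with no change to the embedder.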

2

llm (Simon Willison) | CLI Tool | 57/100

via “embedding generation and semantic search with vector storage”

CLI for LLMs — multi-provider, conversation history, templates, embeddings, plugin ecosystem.

Unique: Separates embedding storage from conversation logs (embeddings.db vs logs.db), allowing independent scaling and querying of embeddings. EmbeddingModel abstraction enables swapping embedding providers without changing application code, and batch operations optimize cost for bulk embedding generation.

vs others: More integrated than using OpenAI's API directly because it provides a unified interface across embedding models and handles storage, and simpler than LangChain's embedding system because it doesn't require external vector databases for basic use cases.

3

mcp-memory-service | MCP Server | 49/100

via “hybrid-storage-backend-with-sqlite-and-cloudflare-support”

Open-source persistent memory for AI agent pipelines (LangGraph, CrewAI, AutoGen) and Claude. REST API + knowledge graph + autonomous consolidation.

Unique: Provides a unified storage abstraction that supports both local SQLite and remote Cloudflare infrastructure without code changes, enabling seamless scaling from development to production. Hybrid mode enables local caching with remote persistence, combining the speed of local storage with the durability and scalability of cloud infrastructure.

vs others: More flexible than single-backend solutions because it supports both local and cloud deployments; more cost-effective than always-cloud solutions because local SQLite has zero infrastructure costs for development.

4

claude-mem | Skill | 40/100

via “dual-storage persistence with sqlite and chromadb vector embeddings”

A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.

Unique: Implements a dual-storage architecture where SQLite serves as the source of truth for structured data and ChromaDB is synced asynchronously via ChromaSync operations. This decouples relational queries from vector search, allowing each store to optimize for its access pattern. Schema migrations are managed explicitly, enabling safe schema evolution without data loss.

vs others: More flexible than single-store solutions because it supports both exact filtering (SQL) and semantic search (vectors) without forcing a choice; more reliable than cloud-only memory because data persists locally and survives network outages.
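A minimal Python sketch of the source-of-truth pattern this entry describes, assuming a `synced` flag column and a plain dict standing in for the vector store. The table layout, function names, and `toy_embed` are hypothetical and do not reflect claude-mem's schema or the ChromaDB API. The sync step is an ordinary function here; in a real system it would run in the background.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """SQLite holds the authoritative rows; `synced` marks what the index has seen."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations ("
        " id INTEGER PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record(conn: sqlite3.Connection, body: str) -> int:
    """Write to SQLite only; the vector index catches up later."""
    cur = conn.execute("INSERT INTO observations (body) VALUES (?)", (body,))
    conn.commit()
    return cur.lastrowid


def toy_embed(text: str) -> list[float]:
    """Placeholder embedding: character count and word count."""
    return [float(len(text)), float(len(text.split()))]


def sync_pending(conn: sqlite3.Connection, vector_index: dict) -> int:
    """Copy unsynced rows into the vector index, then mark them as synced."""
    rows = conn.execute(
        "SELECT id, body FROM observations WHERE synced = 0"
    ).fetchall()
    for row_id, body in rows:
        vector_index[row_id] = toy_embed(body)
        conn.execute("UPDATE observations SET synced = 1 WHERE id = ?", (row_id,))
    conn.commit()
    return len(rows)
```

Because the flag lives in SQLite, a crash between the write and the sync loses nothing: the next `sync_pending` call picks up whatever is still marked unsynced.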

5

ssd-ai | MCP Server | 38/100

via “contextual memory management”

AI development assistant that implements the Model Context Protocol (MCP) standard. It provides 36 specialized tools through natural language keyword recognition, helping developers perform complex tasks intuitively. Core Values: Natural Language: Execute tools automatically through K…

Unique: Integrates context compression with SQLite for efficient long-term storage and retrieval, unlike alternatives that may use simpler key-value stores.

vs others: More efficient at managing large contexts than traditional in-memory solutions.

6

ruvector-onnx-embeddings-wasm | Repository | 37/100

via “embedding caching and memoization”

Portable WASM embedding generation with SIMD and parallel workers - run text embeddings in browsers, Cloudflare Workers, Deno, and Node.js

Unique: Implements two-tier caching strategy: fast in-memory LRU cache for hot embeddings, with overflow to IndexedDB for larger collections. Includes automatic cache warming from persisted storage on initialization, and cache coherency checks to detect model version mismatches.

vs others: More efficient than re-computing embeddings on every query, and simpler than external vector database setup (e.g., Pinecone) for small collections where in-memory caching is sufficient.
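The two-tier caching idea in this entry can be sketched in Python as an LRU map that spills evicted entries into a slower store. This is an assumption-laden illustration, not the package's code: `TwoTierEmbeddingCache` is a hypothetical name, a plain dict stands in for IndexedDB, and keying entries by model version is one simple way to avoid serving vectors from a different model.

```python
from collections import OrderedDict
from typing import Callable, Optional


class TwoTierEmbeddingCache:
    def __init__(self, model_version: str, hot_capacity: int = 2,
                 cold_store: Optional[dict] = None) -> None:
        self.model_version = model_version
        self.hot_capacity = hot_capacity
        self.hot: OrderedDict = OrderedDict()   # small, fast, LRU-ordered
        self.cold = cold_store if cold_store is not None else {}  # overflow tier
        self.computed = 0                       # how often the model was invoked

    def get(self, text: str, compute: Callable[[str], list]) -> list:
        # The version is part of the key, so vectors from another model never match.
        key = (self.model_version, text)
        if key in self.hot:
            self.hot.move_to_end(key)           # mark as most recently used
            return self.hot[key]
        if key in self.cold:
            vector = self.cold[key]             # cold hit: promote back to hot
        else:
            vector = compute(text)              # full miss: run the model
            self.computed += 1
        self._remember(key, vector)
        return vector

    def _remember(self, key, vector) -> None:
        self.hot[key] = vector
        while len(self.hot) > self.hot_capacity:
            old_key, old_vector = self.hot.popitem(last=False)  # least recent
            self.cold[old_key] = old_vector     # overflow instead of discarding
```

Passing the same `cold_store` to a cache built with a new `model_version` forces recomputation, since no existing key carries the new version.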

7

vectra | Repository | 37/100

via “file-backed vector storage with in-memory indexing”

A lightweight, file-backed vector database for Node.js and browsers with Pinecone-compatible filtering and hybrid BM25 search.

Unique: Combines file-backed persistence with in-memory indexing, avoiding the complexity of running a separate database service while maintaining reasonable performance for small-to-medium datasets. Uses JSON serialization for human-readable storage and easy debugging.

vs others: Lighter weight than Pinecone or Weaviate for local development, but trades scalability and concurrent access for simplicity and zero infrastructure overhead.
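To make the file-backed design concrete, here is a small Python sketch of a JSON file loaded fully into memory and searched by brute-force cosine similarity. Vectra itself is a Node.js library, so this is not its API or file format; `JsonVectorIndex` and its methods are hypothetical names chosen for illustration.

```python
import json
import math
import os


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class JsonVectorIndex:
    """Whole index lives in memory; every write rewrites one JSON file."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.items: list = []
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as handle:
                self.items = json.load(handle)

    def add(self, item_id: str, vector, metadata=None) -> None:
        self.items.append(
            {"id": item_id, "vector": list(vector), "metadata": metadata or {}}
        )
        self._flush()

    def _flush(self) -> None:
        # Write to a temp file, then rename, so a crash cannot leave a half-written index.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            json.dump(self.items, handle, indent=2)
        os.replace(tmp_path, self.path)

    def query(self, vector, top_k: int = 3) -> list:
        scored = [(item["id"], cosine(vector, item["vector"])) for item in self.items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]
```

The trade-off named in the entry is visible here: every `add` rewrites the full file and nothing coordinates concurrent writers, which is acceptable for small local datasets and little else.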

8

teleton-agent | Agent | 35/100

via “hybrid rag memory with sqlite-vec and fts5 fusion”

Teleton: Autonomous AI Agent for Telegram & TON Blockchain

Unique: Combines semantic search (sqlite-vec) with BM25 full-text search (FTS5) and fuses results via reciprocal rank fusion (RRF), then applies AI-driven auto-compaction that summarizes old context rather than discarding it, preserving semantic information across long conversations.

vs others: Pinecone or Weaviate require cloud infrastructure and API calls; Teleton's local sqlite-vec approach eliminates network latency and keeps all memory on-device, while RRF fusion outperforms single-index retrieval for mixed semantic/keyword queries.
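Reciprocal rank fusion itself is small enough to show in full. The Python sketch below merges any number of ranked ID lists by summing 1 / (k + rank) for each document. The constant `k=60` is a commonly used default and is an assumption here, as are the function name and the tie-breaking rule; the entry does not say which values Teleton uses.

```python
def reciprocal_rank_fusion(rankings, k: int = 60) -> list:
    """Fuse ranked lists of document IDs into one list, best first.

    Each list contributes 1 / (k + rank) per document, with rank starting at 1.
    Documents missing from a list simply receive nothing from it.
    """
    scores: dict = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector index and a keyword index.
vector_hits = ["m2", "m1", "m3"]
keyword_hits = ["m1", "m4", "m2"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Only ranks are used, never raw scores, so cosine distances and BM25 scores do not need to be normalized against each other before fusing.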

9

mcp-local-memory | MCP Server | 32/100

Lightweight local memory for your AI agent. SQLite + embeddings, zero setup, no services to run. Minimal config:

```
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "mcp-local-memory"]
    }
  }
}
```

Your agent remembers preferences, project details, procedures --

Unique: Combines SQLite for persistent storage with embeddings for contextual retrieval, all in a zero-setup environment.

vs others: More user-friendly than traditional memory solutions because it requires no external services or complex configurations.
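As a rough picture of what "SQLite + embeddings" can mean with no extra services, the Python sketch below stores each vector as a float32 BLOB next to its text and ranks rows by cosine similarity in application code. The schema and the `remember` / `recall` helpers are hypothetical and are not taken from mcp-local-memory; the vectors are supplied by the caller, so any embedding model could sit in front.

```python
import math
import sqlite3
import struct


def pack(vector) -> bytes:
    """Serialize floats as little-endian float32, 4 bytes per dimension."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack(blob: bytes) -> list:
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def open_memory_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        " id INTEGER PRIMARY KEY,"
        " content TEXT NOT NULL,"
        " embedding BLOB NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, content: str, vector) -> None:
    conn.execute(
        "INSERT INTO memories (content, embedding) VALUES (?, ?)",
        (content, pack(vector)),
    )
    conn.commit()


def recall(conn: sqlite3.Connection, query_vector, limit: int = 3) -> list:
    """Brute-force scan: fine for thousands of rows, not for millions."""
    rows = conn.execute("SELECT content, embedding FROM memories").fetchall()
    scored = [(content, cosine(query_vector, unpack(blob))) for content, blob in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]
```

Pointing `open_memory_db` at a file path instead of `:memory:` makes the memories persist across runs, which is the whole setup cost of this approach.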

10

opencode-mem | Skill | 31/100

via “local-vector-database-management”

OpenCode plugin that gives coding agents persistent memory using local vector database

Unique: Provides embedded vector database functionality as an OpenCode plugin without requiring external services, using local file-based storage with built-in indexing and query optimization for coding agent memory.

vs others: Eliminates network latency and external dependencies compared to cloud vector databases, but sacrifices scalability and multi-instance coordination for simplicity and privacy.

11

Memory-Plus | Repository | 31/100

via “semantic-memory-recording-with-vector-embedding”

A lightweight, local RAG memory store to record, retrieve, update, delete, and visualize persistent "memories" across sessions, perfect for developers working with multiple AI coders (like Windsurf, Cursor, or Copilot) or anyone who wants their AI to actually remember them.

Unique: Integrates Google Gemini embeddings with the Qdrant vector database through a dedicated MemoryProtocol class that handles text chunking, versioning, and category-based filtering, which enables semantic search with full memory history tracking rather than simple key-value storage.

vs others: Lighter and more focused than full RAG frameworks (LlamaIndex, LangChain) by specializing in agent memory persistence with built-in MCP support, avoiding framework overhead while maintaining semantic search capabilities.

12

@sanity/embeddings-index-cli | CLI Tool | 29/100

via “embeddings-index-storage-and-serialization”

CLI for creating and managing embeddings indexes

Unique: Stores embeddings alongside Sanity document metadata (IDs, URLs, field names) in a single index file, enabling direct integration with vector databases without separate metadata lookups.

vs others: The self-contained index format reduces dependence on external metadata stores, unlike systems that require separate document ID → embedding mappings.

13

@hiveai/embeddings | Repository | 28/100

via “local semantic memory search with sentence embeddings”

hAIve embeddings — local sentence embeddings via Transformers.js for semantic memory search

Unique: Utilizes a fully local architecture for embedding generation and search, avoiding cloud dependencies and enhancing privacy.

vs others: More efficient and private than cloud-based embedding solutions, as it processes data locally without external API calls.

14

Loop GPT | Repository | 25/100

via “semantic memory with embedding-based retrieval”

Re-implementation of AutoGPT as a Python package

Unique: Integrates embedding-based memory directly into the agent's prompt context, using pluggable embedding providers (OpenAI, open-source) for semantic retrieval without external vector databases. Differs from AutoGPT's simpler memory by enabling semantic search and from LangChain's memory abstractions by providing tighter agent integration.

vs others: Simpler than external RAG systems (no separate vector DB required) while providing semantic search capabilities; more integrated than LangChain's memory abstractions.

15

semantic-kernel | Framework | 25/100

via “memory and embedding management with vector store abstraction”

Semantic Kernel Python SDK

Unique: Abstracts vector storage behind a unified memory interface with pluggable connectors, treating memory as a first-class kernel component rather than a separate system, enabling automatic context injection into semantic functions.

vs others: More integrated than standalone vector databases because memory is tightly coupled with the kernel and semantic functions, enabling automatic context enrichment without explicit retrieval code in function definitions.

16

@cr4yfish/entity-db-fixed | Repository | 24/100

via “persistent vector storage with indexeddb backend”

EntityDB is an in-browser vector database wrapping IndexedDB and Transformers.js

Unique: Wraps IndexedDB with a vector-aware schema that automatically indexes embeddings and provides similarity-based querying, bridging the gap between traditional key-value IndexedDB and specialized vector databases. Uses object stores with compound indexes for efficient entity + embedding lookups.

vs others: Lighter-weight than running a full vector database like Milvus or Qdrant in the browser, and requires no backend infrastructure unlike cloud-based solutions, though with lower query performance and storage limits.

17

Haystack | Product

via “embedding-generation-and-management”
