Chroma
by Community
Search infrastructure for AI
OSS
Added 1 June 2026
Overview
Chroma is an open-source vector database written in Rust that stores and retrieves embeddings for AI applications. It provides search infrastructure for semantic similarity queries, enabling developers to build retrieval-augmented generation (RAG) systems and vector-based search features without managing complex infrastructure.
Best for
Developers building RAG systems and semantic search features who want a straightforward, open-source vector store
Use cases
- Building RAG pipelines that retrieve relevant documents for LLM context
- Implementing semantic search across unstructured text or image embeddings
- Storing and querying high-dimensional vectors from embedding models
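The storage and similarity-query pattern behind these use cases can be sketched in plain Python. This is a minimal illustration of the concept only, not Chroma's API: the class `ToyVectorStore`, its `add` and `query` methods, and the three-dimensional vectors are all hypothetical, and it uses brute-force cosine similarity where a real vector database would typically use an approximate nearest-neighbour index.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (embedding, document) pairs by id."""

    def __init__(self):
        self._items = {}

    def add(self, doc_id, embedding, document):
        self._items[doc_id] = (list(embedding), document)

    def query(self, embedding, n_results=2):
        # Score every stored vector against the query (brute force),
        # then return the highest-scoring entries first.
        scored = [
            (cosine_similarity(embedding, emb), doc_id, doc)
            for doc_id, (emb, doc) in self._items.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:n_results]


# Made-up embeddings; in practice these come from an embedding model.
store = ToyVectorStore()
store.add("a", [1.0, 0.0, 0.0], "Chroma stores embeddings.")
store.add("b", [0.0, 1.0, 0.0], "Postgres stores rows.")
store.add("c", [0.9, 0.1, 0.0], "Vector databases answer similarity queries.")

results = store.query([1.0, 0.0, 0.0], n_results=2)
for score, doc_id, document in results:
    print(f"{doc_id}: {score:.3f} {document}")
```

In a RAG pipeline, the documents returned by such a query would be placed into the LLM prompt as context.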
Notes
28,173 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.
Pros
- Open-source with active community support (28k+ GitHub stars)
- Lightweight and easy to integrate into Python applications
- Handles embedding storage and similarity search out of the box
Cons
- Rust backend may require additional deployment considerations for some teams
- Limited to vector operations; does not handle traditional relational queries
- Scaling to very large datasets may require external infrastructure decisions
Indexed from awesome-llmops and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
Milvus
Community
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
Qdrant
Community
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Weaviate
Community
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance an
pgvector
Community
Open-source vector similarity search for Postgres
chroma-core/chroma-mcp
Various
A Model Context Protocol (MCP) server implementation that provides database capabilities for Chroma
gogabrielordonez/mcp-ragchat
Various
MCP server that adds RAG-powered AI chat to any website. One command from Claude Code. Local vector store, multi-provider LLM (OpenAI/Anthropic/Gemini). Zero cloud dependency.
shinpr/mcp-local-rag
Various
Local-first RAG server for developers. Semantic + keyword search for code and technical docs. Works with MCP or CLI. Fully private, zero setup.
ChatFiles
Community
Document Chatbot — multiple files. Powered by GPT / Embedding.
create-t3-turbo-ai
Community
Build full-stack, type-safe, LLM-powered apps with the T3 Stack, Turborepo, OpenAI, and Langchain
Embedchain
Community
Universal memory layer for AI Agents
Haystack
Community
Create agentic, context engineered AI systems using Haystack’s modular and customizable building blocks, built for real-world, production-ready applications.
Improving language models by retrieving from trillions of tokens
Community
Publications — Google DeepMind
Knowledge GPT
Community
Accurate answers and instant citations for your documents.
LangChain
Community
The agent engineering platform.
Literal AI
Community
Multi-modal LLM observability and evaluation platform. Create prompt templates, deploy prompts versions, debug LLM runs, create datasets, run evaluations, monitor LLM metrics and c
Llama2 Embedding Server
Community
A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.
mem0
mem0
Memory layer for AI apps. Personalisation, continuity, and recall as a service.
MemFree
Community
MemFree - Hybrid AI Search Engine & AI Page Generator
MindSQL
Community
MindSQL: A Python Text-to-SQL RAG Library simplifying database interactions. Seamlessly integrates with PostgreSQL, MySQL, SQLite, Snowflake, and BigQuery. Powered by GPT-4 and Lla
Private GPT
Community
Interact with your documents using the power of GPT, 100% privately, no data leaks
RagTune
Community
EXPLAIN ANALYZE for RAG retrieval — inspect, debug, benchmark, and tune your retrieval layer
Semantic Cache Router
Community
Distributed semantic cache and stateful routing system that cuts LLM API costs by returning cached responses for semantically similar queries. Uses ANN vector search (cosine ≥ 0.8)
Awesome RAG Production
Various
A curated list of battle-tested tools, frameworks, and best practices for building scalable, production-grade Retrieval-Augmented Generation (RAG) systems.
DataLine
Various
Chat with your data - AI-driven data analysis and visualization tool.
privateGPT
Various
Interact with your documents using the power of GPT, 100% privately, no data leaks
AgentsMesh
Community
The AI Agent Workforce Platform — where teams scale beyond headcount. Give every team member an AI agent squad.
Airweave
Community
Open-source context retrieval layer for AI agents
AquilaDB
Community
An easy to use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
Cheshire Cat
Community
AI agent microservice
Clip-as-a-service
Community
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
currentslab/awesome-vector-search
Community
Collections of vector search related libraries, service and research papers
deeplake
Community
Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling scalable retrieval and training.
Flowise
Community
Build AI Agents, Visually
Infinity
Community
Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali
LlamaIndex
LlamaIndex
The data framework for LLM apps. RAG, ingestion, structured extraction, agents over your data.
LLMApp
Community
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, re
Llmware
Community
Unified framework for building enterprise RAG pipelines with small, specialized models
Manag.ai
Community
Your all-in-one prompt management and observability platform. Craft, track, and perfect your LLM prompts with ease.
OpenLIT
Community
Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Management, Vault, Playground. 🚀💻 Integrates with
Quiver
Community
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama.
Search with Lepton
Community
Building a quick conversation-based search demo with Lepton AI.
semantic-coverage
Community
Automated detection of knowledge gaps and blind spots in RAG vector stores.
Semantic Kernel
Microsoft
Microsoft's enterprise-flavoured framework for AI agents. .NET-first, with Python and Java siblings.
Swiss Army Llama
Community
A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.
talkd.ai dialog
Community
RAG LLM Ops App for easy deployment and testing
Awadb
Community
AI Native database for embedding vectors
Lancedb
Community
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
Marqo
Community
Ecommerce Search and Discovery - marqo.ai
Pinecone
Community
Search through billions of items for similar matches to any object, in milliseconds. It’s the next generation of search, an API call away.
Rivestack
Community
Managed pgvector on dedicated PostgreSQL with NVMe storage. 2,000 QPS at sub-4ms p50, from $35/month, migration help from Supabase, Neon, Pinecone, and self-hosted.
Statewave
Community
Open-source memory runtime for AI agents — reproducible, provenance-tagged context bundles instead of query-time retrieval. Apache-2.0, self-hosted on Postgres + pgvector, Python +
VectorDB
Community
A Python vector database you just need - no more, no less.