Enterprise DNA
Open Source Orchestration medium

GPTCache

by Community

Semantic cache for LLMs. Fully integrated with LangChain and LlamaIndex.

Added 1 June 2026

#aigc #autogpt #babyagi #chatbot #chatgpt #chatgpt-api #dolly #gpt

Overview

GPTCache is a semantic cache for LLMs that stores and retrieves responses based on query similarity. It integrates directly with LangChain and LlamaIndex to reduce latency and API costs on repeated or similar requests.
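The lookup described above can be illustrated with a minimal, dependency-free sketch. This is not GPTCache's API or implementation: the `SemanticCache` class, the `answer` helper, the toy bag-of-words `embed` function, and the 0.8 threshold are all assumptions made for illustration. A real deployment would use an embedding model and a vector store instead.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by query similarity, not exact text."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # assumed value; needs tuning per workload
        self.entries = []  # list of (embedding, response) pairs

    def get(self, query: str):
        """Return the response of the most similar stored query, or None on a miss."""
        q = embed(query)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, query: str, response: str) -> None:
        self.entries.append((embed(query), response))


def answer(cache: SemanticCache, query: str, llm) -> str:
    """Serve from the cache when a similar query exists; otherwise call the model."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    response = llm(query)  # `llm` is any callable taking a prompt string
    cache.put(query, response)
    return response
```

Passing the model as a plain callable keeps the sketch independent of any provider SDK. The linear scan over `entries` is only reasonable for small caches.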

Best for

Developers using LLMs in production who need to cache repetitive queries to cut costs and latency

Use cases

  • Caching common user queries in chatbots to speed up responses
  • Reducing OpenAI API costs by serving cached answers for similar prompts
  • Serving repeated LLM calls in RAG pipelines from the cache for faster responses

Notes

8,048 stars on GitHub. Last updated 2025-07-11. Licensed MIT.

Pros

  • Reduces latency and API costs by serving cached responses
  • Integrates directly with LangChain and LlamaIndex
  • Open source (MIT) with an active community and 8,000+ GitHub stars

Cons

  • Requires tuning similarity thresholds to avoid false hits (a wrong cached answer is served) or unnecessary cache misses
  • Adds storage and embedding computation overhead
  • Less effective for highly variable or unique prompts
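The threshold trade-off in the first item above can be made concrete with a small sketch. It assumes a simple token-overlap (Jaccard) similarity in place of real embeddings, and the labeled query pairs are made up for illustration. None of the names below come from GPTCache.

```python
import re


def tokens(text: str) -> set:
    """Lowercase word tokens of a query."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for embedding similarity."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Hypothetical labeled pairs: (cached query, new query, should share an answer?)
PAIRS = [
    ("how do I reset my password", "how can I reset my password", True),
    ("what is the refund policy", "refund policy details please", True),
    ("how do I reset my password", "how do I change my email", False),
    ("cancel my subscription", "renew my subscription", False),
]


def evaluate(threshold: float) -> tuple:
    """Count false hits (wrong answer served) and misses (avoidable LLM call)."""
    false_hits = misses = 0
    for cached, new, same_intent in PAIRS:
        hit = jaccard(cached, new) >= threshold
        if hit and not same_intent:
            false_hits += 1
        elif not hit and same_intent:
            misses += 1
    return false_hits, misses


if __name__ == "__main__":
    for t in (0.2, 0.6, 0.9):
        false_hits, misses = evaluate(t)
        print(f"threshold={t}: false_hits={false_hits}, misses={misses}")
```

On this toy data, lowering the threshold trades misses for false hits and raising it does the reverse, so the threshold is best chosen against a labeled sample of real traffic.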

Indexed from awesome-langchain and enriched with publicly available facts about the project.
