awesome-hallucination-detection
by Community
List of papers on hallucination detection in LLMs.
OSS
Added 1 June 2026
Overview
A community-curated list of papers and resources focused on hallucination detection in large language models. It organizes research by categories such as detection methods, benchmarks, and surveys, providing a structured reference for builders and researchers.
Best for
Researchers and developers who need a curated bibliography on hallucination detection
Use cases
- Exploring state-of-the-art hallucination detection techniques
- Finding benchmark datasets for evaluating model hallucinations
- Surveying recent academic literature on LLM reliability
Notes
1,096 stars on GitHub. Last updated 2026-05-25. Licensed Apache-2.0.
Pros
- Comprehensive coverage of research papers across detection approaches
- Regularly updated with contributions from the community
- Well-organized into categories for quick reference
Cons
- Not a runnable tool or library; no code implementations included
- Requires manual paper reading and evaluation
- May lack practical guidance for real-world deployment
Indexed from awesome-llm and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
lm-evaluation-harness
Community
A framework for few-shot evaluation of language models.
Ragas
Community
Supercharge Your LLM Application Evaluations 🚀
OpenAI Evals
Community
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
promptfoo
Community
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative config