Ragas

by Community

Supercharge Your LLM Application Evaluations 🚀

Added 1 June 2026

#evaluation #llm #llmops

Overview

Ragas is a Python framework for evaluating LLM applications through automated metrics and test generation. It measures retrieval quality, generation accuracy, and end-to-end performance; several of its metrics are reference-free, so they can run without manually labeled ground truth. It is designed for RAG systems and LLM pipelines and gives quantitative feedback on application behavior.

Best for

Teams building RAG systems who need continuous evaluation without manual labeling

Use cases

Measuring retrieval quality in RAG systems
Benchmarking LLM output accuracy and relevance
Automated test generation for prompt chains
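
To make "measuring retrieval quality" concrete, here is a minimal sketch of a rank-aware precision score over retrieved chunks. It is plain Python and does not use the Ragas API. The function names are hypothetical, and `keyword_judge` is a deliberately crude stand-in for the LLM judge that an evaluation framework would normally call to decide whether a chunk is relevant.

```python
from typing import Callable, Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k retrieved chunks that were judged relevant."""
    top = relevance[:k]
    return sum(top) / len(top) if top else 0.0


def average_precision(relevance: Sequence[bool]) -> float:
    """Mean of precision@k taken at each rank k that holds a relevant chunk."""
    hits = [
        precision_at_k(relevance, k)
        for k, is_relevant in enumerate(relevance, start=1)
        if is_relevant
    ]
    return sum(hits) / len(hits) if hits else 0.0


def keyword_judge(question: str, chunk: str) -> bool:
    """Hypothetical judge: relevant if any longer question word appears in the chunk.

    A real setup would ask an LLM for this verdict; this stub only keeps
    the example self-contained and deterministic.
    """
    words = {w.strip("?.,!").lower() for w in question.split()}
    words = {w for w in words if len(w) > 3}
    return any(w in chunk.lower() for w in words)


def score_retrieval(
    question: str,
    chunks: Sequence[str],
    judge: Callable[[str, str], bool] = keyword_judge,
) -> float:
    """Judge each retrieved chunk in rank order, then score the ranking."""
    relevance = [judge(question, chunk) for chunk in chunks]
    return average_precision(relevance)


question = "Which license covers Ragas?"
chunks = [
    "Ragas is licensed under Apache-2.0.",
    "LangChain is an agent engineering platform.",
    "The license file sits at the repository root.",
]
score = score_retrieval(question, chunks)
```

Because the judge is passed in as a parameter, the same scoring logic could be reused with an LLM-backed judge, which is also where the dependence on judge quality comes from.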

Notes

14,186 stars on GitHub. Last updated 2026-02-24. Licensed Apache-2.0.

Pros

Reduces evaluation overhead by automating metric computation
Works without pre-built ground truth datasets
Active open source community with 14k+ stars

Cons

Metrics rely on an LLM judge, so scores depend on that model's quality and the evaluation becomes circular (an LLM grading LLM output)
Python-only, requires integration into existing workflows
Automated metrics may not capture domain-specific correctness

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Uses (1 entry)

LangChain

Community

The agent engineering platform.

★ 138,234 updated 1mo ago

Alternative to (1 entry)

OpenAI Evals

Community

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

★ 18,584 updated 3mo ago

Used by (1 entry)

LangChain

Community

The agent engineering platform.

★ 138,234 updated 1mo ago

Pairs with (6 entries)

AutoRAG

Community

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

★ 4,802 updated 1mo ago

awesome-hallucination-detection

Community

List of papers on hallucination detection in LLMs.

★ 1,096 updated 1mo ago

Awesome-LLM-hallucination

Community

LLM hallucination paper list

★ 335 updated 2y ago

Awesome LLM Security

Community

A curation of awesome tools, documents and projects about LLM Security.

★ 1,599 updated 11mo ago

LawBench

Community

LawBench

promptfoo

Community

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative config

★ 21,784 updated 1mo ago

Alternatives (2 entries)

Evidently

Community

Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.

★ 7,561 updated 2mo ago

Giskard

Community

🐢 Open-Source Evaluation & Testing library for LLM Agents

★ 5,414 updated 1mo ago
