Auto-evaluator
by Community
Evaluation tool for LLM QA chains
OSS
Added 1 June 2026
Overview
Auto-evaluator is an open-source Python tool for evaluating LLM-based question-answering chains. It automates the assessment of response quality, helping developers verify accuracy and consistency in QA systems.
Best for
Developers building and testing custom LLM-based QA pipelines
Use cases
- Automatically scoring QA outputs for correctness
- Comparing performance of different LLM configurations
- Identifying response failures and edge cases
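The use cases above can be illustrated with a minimal sketch. It does not use Auto-evaluator's own API; every name here (QAExample, substring_judge, evaluate, make_lookup_chain) is hypothetical, and a simple substring check stands in for the LLM judge model a real evaluation would likely use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QAExample:
    question: str
    reference: str  # expected ("gold") answer


def substring_judge(prediction: str, reference: str) -> bool:
    """Hypothetical stand-in judge: case-insensitive containment check."""
    return reference.strip().lower() in prediction.strip().lower()


def evaluate(
    chain: Callable[[str], str],
    examples: List[QAExample],
    judge: Callable[[str, str], bool] = substring_judge,
) -> dict:
    """Run a QA chain over the examples; return accuracy and failed cases."""
    failures = []
    correct = 0
    for ex in examples:
        prediction = chain(ex.question)
        if judge(prediction, ex.reference):
            correct += 1
        else:
            failures.append(
                {
                    "question": ex.question,
                    "prediction": prediction,
                    "reference": ex.reference,
                }
            )
    accuracy = correct / len(examples) if examples else 0.0
    return {"accuracy": accuracy, "failures": failures}


def make_lookup_chain(answers: Dict[str, str]) -> Callable[[str], str]:
    """Toy 'chain' that answers from a fixed table (assumption for the demo)."""
    return lambda question: answers.get(question, "I don't know")


examples = [
    QAExample("What is the capital of France?", "Paris"),
    QAExample("How many legs does a spider have?", "8"),
]

# Two hypothetical configurations to compare.
chain_a = make_lookup_chain(
    {
        "What is the capital of France?": "The capital is Paris.",
        "How many legs does a spider have?": "A spider has 8 legs.",
    }
)
chain_b = make_lookup_chain({"What is the capital of France?": "Paris"})

report_a = evaluate(chain_a, examples)
report_b = evaluate(chain_b, examples)
print(report_a["accuracy"], report_b["accuracy"])
for failure in report_b["failures"]:
    print("FAILED:", failure["question"])
```

Because the judge is passed in as a plain function, swapping the substring check for a model-based grader changes only the judge argument. That also makes it easy to see how the scores shift with the choice of judge.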
Notes
1,091 stars on GitHub. Last updated 2023-05-10.
Pros
- High community adoption, with over 1,000 GitHub stars
- Simple Python integration for existing pipelines
- Open source with transparent evaluation logic
Cons
- Requires setup and configuration for custom use
- Evaluation quality depends on the chosen judge model
- Limited to QA chains, not general LLM workflows
Indexed from awesome-langchain and enriched against its public facts.