FELM

by Community

FELM: Benchmarking Factuality Evaluation of Large Language Models

Added 1 June 2026

Overview

FELM is a benchmark for evaluating how factually accurate large language models are. It provides a standardized dataset and methodology to measure factuality across different models and tasks.

Best for

Researchers and developers needing a standardized way to measure LLM factuality

Use cases

Assessing factual accuracy of LLM outputs in research
Comparing factuality performance across multiple models
Validating model improvements in truthfulness
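As a rough illustration of the comparison use case, the sketch below computes the share of response segments labeled factual for each model. The record layout (`model`, `is_factual`) and the toy data are hypothetical, chosen only for this example. They are not FELM's actual schema or results.

```python
from collections import defaultdict


def factuality_rates(records):
    """Return the fraction of segments labeled factual for each model.

    `records` is an iterable of dicts with the hypothetical keys
    "model" (str) and "is_factual" (bool).
    """
    totals = defaultdict(int)
    factual = defaultdict(int)
    for rec in records:
        totals[rec["model"]] += 1
        if rec["is_factual"]:
            factual[rec["model"]] += 1
    # Every model in `totals` has at least one record, so no division by zero.
    return {model: factual[model] / totals[model] for model in totals}


# Hypothetical toy labels, for illustration only.
records = [
    {"model": "model-a", "is_factual": True},
    {"model": "model-a", "is_factual": False},
    {"model": "model-b", "is_factual": True},
    {"model": "model-b", "is_factual": True},
]
rates = factuality_rates(records)
```

Running the same function on labels collected before and after a model change gives a simple way to check whether the factual share moved.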

Notes

Pros

Provides a structured, reproducible evaluation framework
Focuses specifically on factuality, a critical quality metric
Community-driven benchmark with transparent methodology

Cons

Limited to the specific tasks and datasets in the benchmark
May not cover all real-world factuality challenges
Requires familiarity with benchmarking tools and setup

Indexed from awesome-llm and enriched against its public facts.

Pairs with

lm-evaluation-harness

Community

A framework for few-shot evaluation of language models.

★ 12,772 stars, updated 2 months ago
