Enterprise DNA
O Open Source Frameworks medium

promptfoo

by Community

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative config

Added 1 June 2026

#ci #ci-cd #cicd #evaluation #evaluation-framework #llm #llm-eval #llm-evaluation

Overview

promptfoo is a testing framework for evaluating prompts, agents, and RAG systems across multiple LLM providers including GPT, Claude, Gemini, and DeepSeek. It runs comparative benchmarks, red team tests, and vulnerability scans using declarative YAML configs with CLI and CI/CD support.
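The comparative-benchmark idea can be illustrated with a minimal Python sketch: every prompt template is run against every provider and every test case, and each reply is checked by a simple assertion. This is not promptfoo's API or config format. The names `EvalCase`, `run_matrix`, and the two stub providers are hypothetical, and the stubs stand in for real LLM calls so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM provider: maps a prompt string to a reply.
Provider = Callable[[str], str]


@dataclass
class EvalCase:
    variables: Dict[str, str]  # values substituted into the prompt template
    must_contain: str          # assertion: the reply must include this text


def run_matrix(
    prompts: List[str],
    providers: Dict[str, Provider],
    cases: List[EvalCase],
) -> List[Tuple[str, str, bool]]:
    """Run each prompt against each provider and case; return (prompt, provider, passed)."""
    rows = []
    for template in prompts:
        for name, provider in providers.items():
            for case in cases:
                reply = provider(template.format(**case.variables))
                rows.append((template, name, case.must_contain in reply))
    return rows


# Stub providers (assumptions for illustration, not real models).
def echo_upper(prompt: str) -> str:
    return prompt.upper()


def echo_plain(prompt: str) -> str:
    return prompt


PROMPTS = ["Translate to French: {text}", "French version of: {text}"]
PROVIDERS = {"stub-a": echo_upper, "stub-b": echo_plain}
CASES = [EvalCase({"text": "hello"}, must_contain="hello")]

results = run_matrix(PROMPTS, PROVIDERS, CASES)
for template, name, passed in results:
    print(f"{name:7} {'PASS' if passed else 'FAIL'}  {template}")
```

The number of provider calls is the product of prompts, providers, and test cases, which is why a declarative description of the matrix is convenient to keep in version control.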

Best for

Teams building LLM applications who need systematic prompt validation and security testing before deployment

Use cases

  • Compare prompt performance across different LLM models before production
  • Automate security testing and adversarial input scanning for AI applications
  • Integrate prompt evaluation into CI/CD pipelines for continuous quality checks
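The CI/CD use case usually comes down to turning evaluation outcomes into a process exit code that fails the pipeline. A minimal sketch, assuming the outcomes are available as a list of booleans; the function name `ci_exit_code` and the `min_pass_rate` threshold are hypothetical and do not reflect promptfoo's own CLI behavior or flags.

```python
from typing import List


def ci_exit_code(outcomes: List[bool], min_pass_rate: float = 1.0) -> int:
    """Return 0 when the pass rate meets the threshold, otherwise 1."""
    if not outcomes:
        return 1  # an empty run is treated as a failure (assumption)
    pass_rate = sum(outcomes) / len(outcomes)
    return 0 if pass_rate >= min_pass_rate else 1


# Three of four checks passed; a 75% threshold lets the build continue.
print(ci_exit_code([True, True, False, True], min_pass_rate=0.75))
```

In a pipeline step, the returned value would be passed to `sys.exit` so that a failing evaluation blocks the merge or deployment.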

Notes

21,784 stars on GitHub. Last updated 2026-06-01. Licensed MIT.

Pros

  • Multi-model comparison built in, reducing vendor lock-in risk
  • Red teaming and vulnerability scanning included, not bolted on
  • Declarative config approach makes tests reproducible and version-controllable

Cons

  • Requires familiarity with YAML config syntax and CLI tooling
  • Testing scope limited to prompt and agent behavior, not full application integration
  • Costs scale with API calls to external LLM providers during test runs

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Pairs with: 9 entries
O OSS Framework medium

Arthur Shield

Community

Open-source toolkit for building, testing, and monitoring AI agents. Version prompts, run experiments, trace workflows, and catch issues before users do.

O OSS Framework medium

Awesome ChatGPT Prompts

Community

f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.

★ 163,161 updated 2d ago
O OSS Framework medium

awesome-hallucination-detection

Community

List of papers on hallucination detection in LLMs.

★ 1,096 updated 9d ago
O OSS Framework medium

Awesome LLM Security

Community

A curation of awesome tools, documents and projects about LLM Security.

★ 1,599 updated 9mo ago
O OSS Framework medium

Chinese Large Model Leaderboard

Community

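NoneLinear (非线智能) ReLE evaluation: a continuously updated capability benchmark for Chinese AI large models. It currently includes 374 models, covering commercial models such as chatgpt, gpt-5.4, Google gemini-3.1-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3.6-max, qwen3.6-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as st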

★ 6,103 updated 4d ago
O OSS Framework high

DSPy

Stanford NLP

Programming, not prompting. Declare what you want, compile prompts and weights against an objective.

O OSS Framework medium

Giskard

Community

🐢 Open-Source Evaluation & Testing library for LLM Agents

★ 5,414 updated 5d ago
O OSS Framework medium

Prompt Engineering

Community

Prompt Engineering, also known as In-Context Prompting, refers to methods for how to communicate with LLM to steer its behavior for desired outcomes without updating the model we

O OSS Framework medium

Ragas

Community

Supercharge Your LLM Application Evaluations 🚀

★ 14,186 updated 3mo ago
Alternatives: 9 entries