O Open Source Observability medium

Great Expectations

by Community

Always know what to expect from your data.


Added 1 June 2026

#cleandata #data-engineering #data-profilers #data-profiling #data-quality #data-science #data-unit-tests #datacleaner

Overview

Great Expectations is an open-source Python library for data quality validation. It lets you define expectations about your data, run automated checks against datasets, and generate human-readable documentation of the results.
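The define, check, and document loop described above can be sketched in plain Python. This is a minimal illustration that uses only the standard library; it is not the Great Expectations API, and the names Expectation, validate, and render_report, as well as the sample rows, are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Expectation:
    """A named, declarative rule for one column (hypothetical type, not the library's)."""
    name: str
    column: str
    check: Callable[[Any], bool]


def validate(rows, expectations):
    """Run every expectation against every row and record the failing row indexes."""
    report = {}
    for exp in expectations:
        failures = [
            index for index, row in enumerate(rows)
            if not exp.check(row.get(exp.column))
        ]
        report[exp.name] = {"success": not failures, "failed_rows": failures}
    return report


def render_report(report):
    """Turn the validation results into a short human-readable summary."""
    lines = []
    for name, result in report.items():
        status = "PASS" if result["success"] else "FAIL"
        lines.append(f"{status}  {name}  (failed rows: {result['failed_rows']})")
    return "\n".join(lines)


# Illustrative sample data, invented for this sketch.
rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": -5.00},
    {"order_id": None, "amount": 42.00},
]

expectations = [
    Expectation("order_id is not null", "order_id", lambda v: v is not None),
    Expectation(
        "amount is between 0 and 10000",
        "amount",
        lambda v: v is not None and 0 <= v <= 10_000,
    ),
]

report = validate(rows, expectations)
print(render_report(report))
```

With the sample rows, both rules fail: the null order_id is at row index 2 and the negative amount is at row index 1. The real library expresses the same idea through its own expectation classes and validation objects, so treat this only as a picture of the concept.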

Best for

Data engineers and analysts who need a rigorous, open-source way to validate and document data quality.

Use cases

Validate incoming data pipelines against predefined quality rules
Generate data documentation and quality reports automatically
Monitor data drift in production by comparing expectations over time
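The drift-monitoring use case can be pictured as tracking how often an expectation holds from one batch to the next. The sketch below is a standard-library illustration under stated assumptions, not Great Expectations code: pass_rate, detect_drift, the 5 percentage point tolerance, and the sample batches are all hypothetical.

```python
def pass_rate(values, low, high):
    """Fraction of values that fall inside the expected [low, high] range."""
    if not values:
        return 0.0
    return sum(low <= v <= high for v in values) / len(values)


def detect_drift(baseline_rate, batches, low, high, tolerance=0.05):
    """Return (label, rate) for each batch whose pass rate dropped more than
    `tolerance` below the baseline pass rate."""
    drifted = []
    for label, values in batches:
        rate = pass_rate(values, low, high)
        if baseline_rate - rate > tolerance:
            drifted.append((label, rate))
    return drifted


# Illustrative data: a reference batch and two later batches.
baseline = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11]
baseline_rate = pass_rate(baseline, 5, 20)

batches = [
    ("day_1", [11, 12, 10, 13]),  # all values in range
    ("day_2", [11, 25, 10, 30]),  # half the values out of range
]

drifted = detect_drift(baseline_rate, batches, 5, 20)
print(drifted)
```

Only day_2 is flagged here, because its pass rate of 0.5 sits well below the baseline of 1.0. A fixed tolerance is the simplest possible choice; a production setup would likely tune the threshold per expectation.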

Notes

11,532 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.

Pros

Well documented with a large community (over 11,500 GitHub stars)
Declarative API makes it easy to define and version control data expectations
Integrates with common data tools like Pandas, Spark, and SQL databases

Cons

Steep learning curve for users new to data quality concepts
Validation can slow down on very large datasets without careful tuning
Expectation definitions need ongoing maintenance as data schemas evolve

Indexed from awesome-llmops and enriched against its public facts.

Pairs with

Four other entries in the index connect to this one.

Prefect (Community)
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
★ 22,518, updated 1mo ago

DVC (Community)
🦉 Data Versioning and ML Experiments
★ 15,643, updated 1mo ago

Dolt (Community)
Dolt – Git for Data
★ 22,967, updated 1mo ago

Kubeflow (Community)
Machine Learning Toolkit for Kubernetes
★ 15,700, updated 1mo ago
