Hamilton
by Community
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage, tracing, and metadata. Runs and scales everywhere Python does.
OSS
Added 1 June 2026
Overview
Hamilton is an open-source framework for defining dataflows as Python functions. It automatically tracks lineage, generates documentation, and enables unit testing of data transformations. The library runs anywhere Python does, from local scripts to distributed clusters.
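To make "dataflows as Python functions" concrete, here is a minimal sketch of the convention Hamilton builds on: each function's name is the value it produces, and its parameter names are the values it depends on. The `resolve` helper and the example functions are hypothetical stand-ins written for this illustration. They are not Hamilton's API; the library's own driver does this wiring for you.

```python
import inspect

# Each function is one node in the dataflow: the function name is the
# output it produces, and the parameter names are its upstream inputs.
def spend_per_signup(spend: float, signups: int) -> float:
    """Marketing spend divided by the number of signups."""
    return spend / signups


def spend_in_thousands(spend: float) -> float:
    """Marketing spend expressed in thousands."""
    return spend / 1000


def report(spend_per_signup: float, spend_in_thousands: float) -> dict:
    """Combine two derived values into one result."""
    return {
        "spend_per_signup": spend_per_signup,
        "spend_in_thousands": spend_in_thousands,
    }


# Hypothetical registry of nodes, keyed by function name.
FUNCS = {f.__name__: f for f in (spend_per_signup, spend_in_thousands, report)}


def resolve(name, funcs, inputs, cache=None):
    """Compute `name` by recursively resolving its parameters (illustrative only)."""
    cache = {} if cache is None else cache
    if name in inputs:  # externally supplied value
        return inputs[name]
    if name not in cache:
        fn = funcs[name]
        kwargs = {
            param: resolve(param, funcs, inputs, cache)
            for param in inspect.signature(fn).parameters
        }
        cache[name] = fn(**kwargs)
    return cache[name]


result = resolve("report", FUNCS, {"spend": 100.0, "signups": 4})
```

Because every node is a plain function, each one can be unit tested by calling it directly, for example `spend_per_signup(100.0, 4)`, without running the rest of the pipeline.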
Best for
Data scientists and engineers who need testable, documented dataflows with automatic lineage tracking.
Use cases
- Building modular, testable data pipelines for analytics or ML
- Automatically generating data lineage and metadata for compliance
- Refactoring monolithic notebooks into maintainable, documented code
Notes
2,504 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.
Pros
- Enforces modular, self-documenting code through function-based definitions
- Built-in lineage and tracing without extra instrumentation
- Scales from local development to production environments
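As a rough illustration of why function-based definitions give lineage without extra instrumentation: when dependencies are declared as parameter names, the upstream lineage of any value can be read straight from the function signatures. The `upstream` helper and the example functions below are hypothetical and written only for this sketch; they do not reproduce Hamilton's own lineage tooling.

```python
import inspect

# Illustrative nodes: parameter names declare what each value depends on.
def cleaned(raw: list) -> list:
    """Drop missing entries from the raw input."""
    return [x for x in raw if x is not None]


def total(cleaned: list) -> float:
    """Sum of the cleaned values."""
    return sum(cleaned)


def mean(total: float, cleaned: list) -> float:
    """Average of the cleaned values."""
    return total / len(cleaned)


# Hypothetical registry of nodes, keyed by function name.
FUNCS = {f.__name__: f for f in (cleaned, total, mean)}


def upstream(name, funcs):
    """Return every node or input that `name` depends on, directly or indirectly."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        fn = funcs.get(current)
        if fn is None:  # an external input has no further ancestors
            continue
        for param in inspect.signature(fn).parameters:
            if param not in seen:
                seen.add(param)
                stack.append(param)
    return seen


lineage_of_mean = upstream("mean", FUNCS)
```

Here `lineage_of_mean` contains `total`, `cleaned`, and the external input `raw`, all derived from the signatures alone.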
Cons
- Requires adopting a function-oriented paradigm, which may not suit all workflows
- Limited to the Python ecosystem; not language-agnostic
- Community-driven project with no official enterprise support
Indexed from awesome-llmops and enriched against its public facts.