LightGBM
by Community
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other
OSS
Added 1 June 2026
Overview
LightGBM is a gradient boosting framework written in C++ that trains decision tree ensembles for classification, regression, and ranking tasks. It grows trees leaf-wise and finds splits on histograms of binned feature values, which speeds up training on large datasets and uses less memory than gradient boosting implementations that scan pre-sorted feature values.
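To make the histogram idea concrete, here is a minimal pure-Python sketch of split finding on binned values. It is a simplified illustration, not LightGBM's implementation: the function name `best_histogram_split`, the equal-width binning, and the gain formula (hessian fixed at 1, no regularisation) are all assumptions chosen for brevity.

```python
from typing import List, Tuple


def best_histogram_split(
    values: List[float],
    gradients: List[float],
    n_bins: int = 4,
) -> Tuple[float, float]:
    """Return (threshold, gain) for the best split found on bin boundaries."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features

    # Build the histogram: per-bin gradient sum and row count.
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1

    total_g, total_n = sum(grad_sum), len(values)
    best_threshold, best_gain = lo, 0.0
    left_g, left_n = 0.0, 0

    # Scan the n_bins - 1 interior boundaries instead of every distinct value.
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        # Simplified gain: hessian fixed at 1, no regularisation (assumption).
        gain = left_g**2 / left_n + right_g**2 / right_n - total_g**2 / total_n
        if gain > best_gain:
            best_threshold, best_gain = lo + (b + 1) * width, gain
    return best_threshold, best_gain
```

The scan visits only the bin boundaries, so its cost depends on the number of bins rather than the number of distinct feature values. The histogram itself needs only two small arrays per feature.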
Best for
Data scientists building production ML systems on large tabular datasets where training speed and memory efficiency matter.
Use cases
- Training classification models on tabular data at scale
- Building ranking systems for search and recommendation
- Rapid prototyping of gradient boosting pipelines
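For the ranking use case, a rough sketch of how training might look with LightGBM's scikit-learn style `LGBMRanker` is shown here. The helper names `group_sizes` and `train_ranker` are hypothetical, the `n_estimators=50` value is an arbitrary assumption, and the rows are assumed to be sorted so that each query's documents are contiguous.

```python
from itertools import groupby
from typing import List, Sequence


def group_sizes(query_ids: Sequence[int]) -> List[int]:
    """Count consecutive rows per query; rows must already be sorted by query."""
    return [sum(1 for _ in rows) for _, rows in groupby(query_ids)]


def train_ranker(features, relevance, query_ids):
    """Fit a LambdaRank model; requires the lightgbm package to be installed."""
    import lightgbm as lgb  # imported lazily so group_sizes works without it

    ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
    # `group` tells LightGBM how many consecutive rows belong to each query.
    ranker.fit(features, relevance, group=group_sizes(query_ids))
    return ranker
```

For example, `group_sizes([7, 7, 7, 9, 9, 12])` gives `[3, 2, 1]`: three rows for the first query, two for the second, and one for the third.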
Notes
18,416 stars on GitHub. Last updated 2026-06-01. Licensed MIT.
Pros
- Often trains faster than XGBoost on large datasets, although the gap depends on the data and on which XGBoost tree method is used
- Lower memory consumption through histogram-based learning
- Supports distributed training across multiple machines
Cons
- Leaf-wise growth can overfit on small datasets without careful tuning
- Steeper learning curve for hyperparameter optimization compared to simpler models
- Less mature ecosystem and fewer pre-built integrations than XGBoost
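As an illustration of the tuning needed to limit overfitting from leaf-wise growth, the following sketch shows a conservative parameter dictionary for a small dataset. The parameter names are LightGBM's own, but every value here is an assumption meant as a starting point to tune, not a recommendation from the project.

```python
# Illustrative starting point for a small dataset; all values are assumptions.
small_data_params = {
    "objective": "binary",
    "num_leaves": 15,         # fewer leaves limits leaf-wise complexity
    "max_depth": 5,           # explicit cap on tree depth
    "min_data_in_leaf": 20,   # each leaf must cover at least this many rows
    "learning_rate": 0.05,
    "feature_fraction": 0.8,  # sample 80% of features per tree
    "lambda_l2": 1.0,         # L2 regularisation on leaf values
}

# A tree of depth d has at most 2**d leaves, so keep num_leaves below that bound.
assert small_data_params["num_leaves"] < 2 ** small_data_params["max_depth"]
```

A dictionary like this would typically be passed as the `params` argument of `lightgbm.train`, with the values adjusted against a validation set.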
Indexed from awesome-llmops and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
EvalML
Community
EvalML is an AutoML library written in Python.
FLAML
Community
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
TPOT
Community
The Tree-Based Pipeline Optimization Tool (TPOT) was one of the very first AutoML methods and open-source software packages developed for the data science community. TPOT was dev
automl-gs
Community
Provide an input CSV and a target field to predict, generate a model + code to run it.
Deepchecks
Community
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling to thoroughly test
Feast
Community
The Open Source Feature Store for AI/ML
hyperunity
Community
A toolset for black-box hyperparameter optimisation.
Upgini
Community
Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, in