nanotron
by Community
Minimalistic large language model 3D-parallelism training
OSS
Added 1 June 2026
Overview
Nanotron is a minimalistic framework for training large language models using 3D parallelism. It implements data, tensor, and pipeline parallelism in Python to distribute training across multiple GPUs.
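To make the three axes concrete, here is a minimal sketch of how a flat GPU rank can be placed on a data x pipeline x tensor grid. It is illustrative only and is not taken from nanotron's code: the names `Coords` and `rank_to_coords`, and the choice to let the tensor-parallel index vary fastest, are assumptions made for this example.

```python
from typing import NamedTuple


class Coords(NamedTuple):
    dp: int  # data-parallel replica index
    pp: int  # pipeline stage index
    tp: int  # tensor-parallel shard index


def rank_to_coords(rank: int, dp_size: int, pp_size: int, tp_size: int) -> Coords:
    """Map a flat rank onto a (dp, pp, tp) grid, with tp varying fastest.

    Hypothetical helper: the ordering of the axes is an assumption,
    not a statement about how any particular framework lays out ranks.
    """
    world_size = dp_size * pp_size * tp_size
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside a world of size {world_size}")
    tp_idx = rank % tp_size
    pp_idx = (rank // tp_size) % pp_size
    dp_idx = rank // (tp_size * pp_size)
    return Coords(dp=dp_idx, pp=pp_idx, tp=tp_idx)


if __name__ == "__main__":
    # 8 GPUs split as 2 replicas x 2 pipeline stages x 2 tensor shards.
    for rank in range(8):
        print(rank, rank_to_coords(rank, dp_size=2, pp_size=2, tp_size=2))
```

Under this ordering, ranks that share a tensor-parallel group are adjacent, which is a common choice because tensor parallelism tends to be the most communication-heavy of the three.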
Best for
Researchers and engineers who need a simple, hackable framework for distributed LLM training experiments.
Use cases
- Training large language models from scratch with distributed parallelism
- Experimenting with 3D parallelism strategies for model scaling
- Reproducing research results in distributed LLM training
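As a rough illustration of what experimenting with 3D parallelism strategies involves, the sketch below enumerates every way to split a fixed GPU count into data, tensor, and pipeline degrees. The function name `parallel_layouts` is hypothetical and the code does not use nanotron; it only shows the search space one would choose from.

```python
def parallel_layouts(world_size: int) -> list[tuple[int, int, int]]:
    """Return every (dp, tp, pp) triple whose product equals world_size.

    Hypothetical helper for illustration; it ignores practical limits
    such as GPUs per node or the number of layers in the model.
    """
    if world_size < 1:
        raise ValueError("world_size must be a positive integer")
    layouts = []
    for dp in range(1, world_size + 1):
        if world_size % dp:
            continue  # dp must divide the total GPU count
        rest = world_size // dp
        for tp in range(1, rest + 1):
            if rest % tp == 0:
                layouts.append((dp, tp, rest // tp))
    return layouts


if __name__ == "__main__":
    for dp, tp, pp in parallel_layouts(8):
        print(f"dp={dp} tp={tp} pp={pp}")
```

In practice the candidates would be filtered further, for example by keeping the tensor-parallel degree within a single node.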
Notes
2,705 stars on GitHub. Last updated 2026-05-26. Licensed Apache-2.0.
Pros
- Lightweight and focused on core parallelism techniques
- Active community with 2,705 GitHub stars
- Integrates well with the Hugging Face ecosystem
Cons
- Limited to training, no inference or deployment features
- Minimal documentation beyond code comments
- Requires deep understanding of distributed training concepts
Indexed from awesome-llm and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
DeepSpeed
Community
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Colossal-AI
Community
Making large AI models cheaper, faster and more accessible
Megatron-LM
Community
Ongoing research training transformer models at scale