
nanotron

by Community

Minimalistic large language model 3D-parallelism training

Added 1 June 2026

Overview

Nanotron is a minimalistic framework for training large language models using 3D parallelism. It implements data, tensor, and pipeline parallelism in Python to distribute training across multiple GPUs.
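To make the three axes concrete, the sketch below maps a flat GPU rank to a (data, pipeline, tensor) coordinate. It is an illustration under stated assumptions, not nanotron's API: the names `ParallelCoords` and `rank_to_coords` are hypothetical, and the axis ordering (tensor-parallel index varying fastest, so that tensor-parallel peers sit on adjacent ranks) is one common convention rather than a confirmed detail of this framework.

```python
from typing import NamedTuple


class ParallelCoords(NamedTuple):
    dp: int  # data-parallel replica index
    pp: int  # pipeline stage index
    tp: int  # tensor-parallel shard index


def rank_to_coords(rank: int, dp: int, pp: int, tp: int) -> ParallelCoords:
    """Map a flat rank to (dp, pp, tp) coordinates; tp varies fastest.

    Hypothetical helper for illustration only, not part of nanotron.
    """
    world_size = dp * pp * tp
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} outside world size {world_size}")
    return ParallelCoords(
        dp=rank // (pp * tp),
        pp=(rank // tp) % pp,
        tp=rank % tp,
    )


if __name__ == "__main__":
    # 8 GPUs split as 2 data replicas x 2 pipeline stages x 2 tensor shards.
    for rank in range(8):
        print(rank, rank_to_coords(rank, dp=2, pp=2, tp=2))
```

With this ordering, ranks that share the same `dp` and `pp` indices form one tensor-parallel group, and the product of the three sizes must equal the total number of GPUs.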

Best for

Researchers and engineers who need a simple, hackable framework for distributed LLM training experiments.

Use cases

  • Training large language models from scratch with distributed parallelism
  • Experimenting with 3D parallelism strategies for model scaling
  • Reproducing research results in distributed LLM training
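When experimenting with parallelism strategies, a natural first step is to list the ways a fixed GPU count can be split across the three axes. The sketch below is a hypothetical helper (`enumerate_layouts` is not part of nanotron) that yields every (dp, tp, pp) triple whose product equals the world size. The optional `max_tp` cap is an assumption, reflecting the common practice of keeping tensor parallelism within a single node.

```python
from typing import Iterator, Optional, Tuple


def enumerate_layouts(
    world_size: int, max_tp: Optional[int] = None
) -> Iterator[Tuple[int, int, int]]:
    """Yield every (dp, tp, pp) with dp * tp * pp == world_size.

    Hypothetical helper for illustration only, not part of nanotron.
    """
    if world_size < 1:
        raise ValueError("world_size must be positive")
    for tp in range(1, world_size + 1):
        # Skip tp values that do not divide the world size or exceed the cap.
        if world_size % tp or (max_tp is not None and tp > max_tp):
            continue
        rest = world_size // tp
        for pp in range(1, rest + 1):
            if rest % pp == 0:
                yield rest // pp, tp, pp


if __name__ == "__main__":
    # Candidate layouts for 8 GPUs with tensor parallelism capped at 4.
    for dp, tp, pp in enumerate_layouts(8, max_tp=4):
        print(f"dp={dp} tp={tp} pp={pp}")
```

Each candidate can then be benchmarked in turn; which layout performs best depends on model size, interconnect, and batch size, so the enumeration only defines the search space.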

Notes

2,705 stars on GitHub. Last updated 2026-05-26. Licensed Apache-2.0.

Pros

  • Lightweight and focused on core parallelism techniques
  • Active community with 2,705 GitHub stars
  • Integrates well with the Hugging Face ecosystem

Cons

  • Limited to training, no inference or deployment features
  • Minimal documentation beyond code comments
  • Requires deep understanding of distributed training concepts

Indexed from the awesome-llm list and enriched with the project's public facts.
