ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

by Community

Microsoft

OSS

Added 1 June 2026

Overview

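ZeRO (Zero Redundancy Optimizer) is a memory optimization technique for distributed training of large deep learning models. Standard data parallelism replicates the full model states (optimizer states, gradients, and parameters) on every device; ZeRO instead partitions them across the data-parallel processes, so each device stores only its own shard. This reduces the per-device memory footprint and is aimed at making models with up to trillions of parameters trainable on existing hardware.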
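To make the effect of partitioning concrete, here is a rough back-of-the-envelope sketch. It assumes the mixed-precision Adam accounting used in the ZeRO paper (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer states) and ignores activations, temporary buffers, and fragmentation. The function name `model_state_bytes_per_device` is hypothetical and not part of any library.

```python
def model_state_bytes_per_device(num_params, num_devices, stage):
    """Approximate per-device bytes of model states for a given ZeRO stage.

    Assumption: mixed-precision Adam, i.e. 16 bytes per parameter in total.
    Stage 0 means plain data parallelism (everything replicated).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")

    fp16_params = 2.0 * num_params
    fp16_grads = 2.0 * num_params
    # fp32 master weights + Adam momentum + Adam variance: 4 + 4 + 4 bytes
    optimizer_states = 12.0 * num_params

    if stage >= 1:  # ZeRO-1 partitions optimizer states
        optimizer_states /= num_devices
    if stage >= 2:  # ZeRO-2 also partitions gradients
        fp16_grads /= num_devices
    if stage >= 3:  # ZeRO-3 also partitions parameters
        fp16_params /= num_devices

    return fp16_params + fp16_grads + optimizer_states


if __name__ == "__main__":
    # Illustrative numbers only: a 7.5B-parameter model on 64 devices.
    for stage in range(4):
        gb = model_state_bytes_per_device(7.5e9, 64, stage) / 1e9
        print(f"stage {stage}: {gb:.1f} GB of model states per device")
```

Under these assumptions, only stage 3 lets the per-device cost of model states shrink in proportion to the number of devices; stages 1 and 2 leave a replicated remainder that does not shrink.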

Best for
Researchers and engineers training very large models on distributed GPU clusters

Use cases

Training large language models with billions of parameters
Fine-tuning massive pretrained models on limited GPU memory
Scaling distributed training across many GPUs efficiently

Notes

Pros

Dramatically reduces per-device memory usage for model states
Enables training of models that would otherwise exceed GPU memory
Compatible with existing data-parallel training frameworks

Cons

Requires choosing and tuning a partitioning stage (ZeRO-1, ZeRO-2, or ZeRO-3)
Increased communication overhead can impact training throughput
Not a standalone tool; must be integrated into a training framework like DeepSpeed or PyTorch
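Since ZeRO is used through a framework, the stage is usually selected in that framework's configuration. The sketch below builds a DeepSpeed-style config as a Python dict. The key names (`zero_optimization`, `stage`, `overlap_comm`, `contiguous_gradients`, `fp16`, `train_micro_batch_size_per_gpu`) reflect the DeepSpeed JSON config as best understood here and should be checked against the current DeepSpeed documentation; the helper `make_zero_config` and the batch size are hypothetical.

```python
import json


def make_zero_config(stage, micro_batch_size=4):
    """Build a minimal DeepSpeed-style config dict for a given ZeRO stage.

    Assumption: key names follow DeepSpeed's JSON config schema; verify
    them against the DeepSpeed docs before relying on this.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": stage,
            # Overlapping communication with computation may hide some of
            # the extra communication cost noted in the cons above.
            "overlap_comm": True,
            "contiguous_gradients": True,
        },
    }


if __name__ == "__main__":
    # Print the config as JSON, e.g. to save it as a config file.
    print(json.dumps(make_zero_config(stage=2), indent=2))
```

A dict like this would typically be handed to the framework at initialization time. Starting at a lower stage and moving up only when memory runs out is one reasonable way to balance memory savings against communication overhead.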

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Uses (1 entry)

O OSS Obs medium

PyTorch

Community

Tensors and Dynamic neural networks in Python with strong GPU acceleration

★ 100,318 updated 23d ago

Built with (1 entry)

O OSS Framework medium

DeepSpeed

Community

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

★ 42,436 updated 23d ago

Pairs with (2 entries)

O OSS Framework medium

Megatron-LM

Community

Ongoing research training transformer models at scale

★ 16,545 updated 23d ago

O OSS Framework medium

NeMo Framework

Community

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

★ 17,285 updated 23d ago

Alternative to (1 entry)

O OSS Framework medium

Colossal-AI

Community

Making large AI models cheaper, faster and more accessible

★ 41,382 updated 30d ago