DeepSpeed

by Community

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Added 1 June 2026

#billion-parameters #compression #data-parallelism #deep-learning #gpu #inference #machine-learning #mixture-of-experts

Overview

DeepSpeed is a Python library for optimizing distributed training and inference of large language models and other deep neural networks. It reduces memory footprint, speeds up training, and supports multi-GPU and multi-node setups through techniques such as gradient checkpointing, mixed precision, and ZeRO partitioning of optimizer states.
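As a rough illustration of how mixed precision and ZeRO are typically switched on, the sketch below builds a DeepSpeed-style configuration as a plain Python dict. The helper name `build_ds_config` and the chosen batch sizes are assumptions made for this example, not part of the listing; the keys shown are the commonly documented ones, and the DeepSpeed configuration reference remains the authority on them.

```python
import json


def build_ds_config(micro_batch_per_gpu, grad_accum_steps, num_gpus, zero_stage=2):
    """Hypothetical helper: returns a DeepSpeed-style config dict."""
    return {
        # The global batch size must equal micro batch * accumulation steps * GPU count.
        "train_batch_size": micro_batch_per_gpu * grad_accum_steps * num_gpus,
        "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
        "gradient_accumulation_steps": grad_accum_steps,
        # Mixed precision training in fp16.
        "fp16": {"enabled": True},
        # ZeRO stage 1 partitions optimizer states, stage 2 adds gradients,
        # stage 3 adds the parameters themselves.
        "zero_optimization": {"stage": zero_stage},
    }


# Illustrative values only: 4 samples per GPU, 8 accumulation steps, 8 GPUs.
config = build_ds_config(micro_batch_per_gpu=4, grad_accum_steps=8, num_gpus=8)
config_json = json.dumps(config, indent=2)
```

The resulting JSON can be saved to a file and passed to a training script; which ZeRO stage pays off depends on the model and hardware.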

Best for
Teams training large models who need to maximize GPU efficiency and scale across multiple devices.

Use cases

  • Training large models on limited GPU memory
  • Scaling training across multiple GPUs or nodes
  • Reducing inference latency for deployed models
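To give a feel for the first use case, the sketch below estimates per-GPU memory for model states under the accounting used in the ZeRO paper: with mixed-precision Adam, each parameter costs about 2 bytes for fp16 weights, 2 bytes for fp16 gradients, and 12 bytes for fp32 optimizer states. The function name `zero_bytes_per_gpu` is an assumption for this example, and the estimate ignores activations, buffers, and fragmentation.

```python
def zero_bytes_per_gpu(num_params, num_gpus, stage):
    """Approximate bytes of model state per GPU for a given ZeRO stage (0 to 3)."""
    params = 2 * num_params   # fp16 weights
    grads = 2 * num_params    # fp16 gradients
    optim = 12 * num_params   # fp32 weights, momentum, and variance for Adam
    if stage >= 1:
        optim /= num_gpus     # stage 1 partitions optimizer states
    if stage >= 2:
        grads /= num_gpus     # stage 2 also partitions gradients
    if stage >= 3:
        params /= num_gpus    # stage 3 also partitions parameters
    return params + grads + optim


# Example: a 7.5B-parameter model on 64 GPUs, reported in gigabytes (1e9 bytes).
estimates_gb = {
    stage: zero_bytes_per_gpu(7_500_000_000, 64, stage) / 1e9 for stage in range(4)
}
```

These figures are an upper-level estimate only; real savings vary with hardware, model architecture, and tuning.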

Notes

42,436 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.

Pros

  • Significant memory savings enable training larger models on existing hardware
  • Production-ready with strong community adoption and Microsoft backing
  • Works with existing PyTorch code with minimal integration effort
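A minimal sketch of what the PyTorch integration tends to look like is shown below. The function name `train_with_deepspeed` is hypothetical, and the sketch assumes that the model's forward pass returns a scalar loss, that each batch is a tensor, and that `ds_config` includes an `optimizer` section so DeepSpeed can construct the optimizer. It has not been run here against any particular DeepSpeed version.

```python
def train_with_deepspeed(model, batches, ds_config):
    """Hypothetical training loop wrapping an existing torch.nn.Module."""
    # Imported lazily so this file can be loaded without DeepSpeed installed.
    import deepspeed

    # deepspeed.initialize wraps the model in an engine that handles
    # distributed setup, mixed precision, and ZeRO according to ds_config.
    engine, _optimizer, _loader, _scheduler = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )

    for batch in batches:
        # Assumption: the model's forward pass returns a scalar loss.
        loss = engine(batch.to(engine.device))
        engine.backward(loss)  # replaces loss.backward()
        engine.step()          # replaces optimizer.step() and zero_grad()
    return engine
```

A script built around a loop like this is normally started with the `deepspeed` launcher rather than plain `python`, so that one process is created per GPU.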

Cons

  • Steep learning curve for advanced features like ZeRO stages and custom configurations
  • Debugging distributed training issues remains complex despite optimizations
  • Performance gains vary significantly based on hardware, model architecture, and tuning

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one. Click through to see the chain.

Used by 12 entries
O OSS Obs medium

Accelerate

Community

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP a

★ 9,708 updated 2d ago
O OSS Framework medium

Axolotl

Community

Go ahead and axolotl questions

★ 11,997 updated 2d ago
O OSS Framework medium

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Community

BigScience

O OSS Orchestration medium

FlagAI

Community

FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale model.

★ 3,874 updated 23d ago
O OSS Framework medium

GPT-NeoX

Community

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

★ 7,432 updated 15d ago
O OSS Framework medium

Megatron-DeepSpeed

Community

Ongoing research training transformer language models at scale, including: BERT & GPT-2

★ 2,252 updated 9mo ago
O OSS Framework medium

MPT-7B

Community

Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available fo

O OSS Framework medium

OLMo: Accelerating the Science of Language Models

Community

Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have be

O OSS Framework medium

OLMoE: Open Mixture-of-Experts Language Models

Community

We introduce OLMoE, a fully open, state-of-the-art language model leveraging sparse Mixture-of-Experts (MoE). OLMoE-1B-7B has 7 billion (B) parameters but uses only 1B per input

O OSS Framework medium

Tune Studio

Community

Playground for devs to finetune & deploy LLMs

O OSS Framework medium

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model

Community

Megatron-Turing NLG

P Apps Productivity low

Unsloth

Various

Unsloth is an open-source, no-code web UI for training, running and exporting open models in one unified local interface.

Powers 10 entries
Pairs with 14 entries
O OSS Framework medium

Awesome-LLM-Compression

Community

Awesome LLM compression research papers and tools.

★ 1,840 updated 3mo ago
O OSS Framework medium

Awesome-LLM-Inference

Community

📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc. 🎉🎉

★ 16 updated 1y ago
O OSS Framework medium

Datatrove

Community

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

★ 3,076 updated 8d ago
O OSS Framework medium

Finetuned Language Models are Zero-Shot Learners

Community

This paper explores a simple method for improving the zero-shot learning abilities of language models. We show that instruction tuning—finetuning language models on a collection

O OSS Framework medium

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Community

2021-12

O OSS Framework medium

GLM-2|6|10|13|70B

Community

Org profile for THUDM on Hugging Face, the AI community building the future.

O OSS Framework medium

Large Language Model Training in 2023

Community

Learn about large language model training with insights on large language model examples, model architecture, and model training guide.

O OSS Framework medium

ModelEditingPapers

Community

Must-read Papers on Knowledge Editing for Large Language Models.

★ 1,230 updated 10mo ago
O OSS Framework medium

Scaling Instruction-Finetuned Language Models

Community

Flan-T5/PaLM

O OSS Framework medium

Scaling Laws for Neural Language Models

Community

Scaling Law

O OSS Framework medium

Training Compute-Optimal Large Language Models

Community

Chinchilla

O OSS Framework medium

Transformer Engine

Community

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide b

★ 3,374 updated 2d ago
O OSS Framework medium

Unifying Language Learning Paradigms

Community

Existing pre-trained models are generally geared towards a particular class of problems. To date, there seems to be still no consensus on what the right architecture and pre-trai

O OSS Framework medium

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Community

The performance of a large language model (LLM) depends heavily on the quality and size of its pretraining dataset. However, the pretraining datasets for state-of-the-art open LL

Alternatives 9 entries