Megatron-DeepSpeed

by Community

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Added 1 June 2026

Overview

Open-source framework for training large transformer language models, such as BERT and GPT-2, at scale. It combines Megatron-LM-style model parallelism with DeepSpeed's ZeRO optimizations, which partition training state across data-parallel workers, to distribute training across many GPUs. It is used primarily for research on scaling transformer language models.
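To make the interaction between model parallelism and ZeRO concrete, here is a minimal sketch of how a GPU budget might be split and how a matching DeepSpeed-style config could be assembled. The helper names (`data_parallel_size`, `make_deepspeed_config`) and the example numbers are hypothetical and are not taken from the Megatron-DeepSpeed repository. The dictionary keys are standard DeepSpeed JSON config keys. The tensor and pipeline parallel degrees are assumed to be supplied to the training script separately, and defaulting to ZeRO stage 1 is an assumption made for illustration.

```python
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return how many data-parallel replicas fit in the given GPU count.

    Each model replica occupies tensor_parallel * pipeline_parallel GPUs;
    the remaining factor of world_size is the data-parallel degree.
    """
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {model_parallel}"
        )
    return world_size // model_parallel


def make_deepspeed_config(
    world_size: int,
    tensor_parallel: int,
    pipeline_parallel: int,
    micro_batch_per_gpu: int,
    grad_accum_steps: int,
    zero_stage: int = 1,  # assumption: stage 1 chosen only for illustration
) -> dict:
    """Build a small DeepSpeed-style config dict (hypothetical helper).

    DeepSpeed expects:
      train_batch_size = micro_batch_per_gpu * grad_accum_steps * data_parallel_size
    """
    dp = data_parallel_size(world_size, tensor_parallel, pipeline_parallel)
    return {
        "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
        "gradient_accumulation_steps": grad_accum_steps,
        "train_batch_size": micro_batch_per_gpu * grad_accum_steps * dp,
        "zero_optimization": {"stage": zero_stage},
        "fp16": {"enabled": True},
    }


# Example: 64 GPUs, 4-way tensor parallel, 4-way pipeline parallel.
example_config = make_deepspeed_config(
    world_size=64,
    tensor_parallel=4,
    pipeline_parallel=4,
    micro_batch_per_gpu=2,
    grad_accum_steps=8,
)
```

In this example each model replica spans 16 GPUs, leaving 4 data-parallel replicas, so the global batch size works out to 2 × 8 × 4 = 64. ZeRO partitions state only across those data-parallel replicas, which is why the two forms of parallelism have to be sized together.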

Best for
Researchers and engineers training large-scale transformer models in distributed environments

Use cases

Training large transformer language models from scratch
Distributed training across multiple GPU nodes
Research into scaling behaviors and model parallelism

Notes

2,252 stars on GitHub. Last updated 2025-08-14.

Pros

Efficient model parallelism and ZeRO integration for large-scale training
Proven in research environments for models like BERT and GPT-2
Active community with ongoing development

Cons

Complex setup and configuration compared to simpler frameworks
Requires substantial hardware resources and expertise
Documentation can be sparse or research-oriented
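To give a rough sense of the hardware point above, the following sketch estimates per-GPU memory for model state under each ZeRO stage. It assumes mixed-precision Adam with the accounting used in the ZeRO paper: 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer state. The function name `zero_model_state_bytes` is hypothetical. The estimate ignores activations, buffers, and fragmentation, so real usage will be higher, and when model parallelism is also used `num_params` should be the parameter count held by one model-parallel shard.

```python
def zero_model_state_bytes(num_params: float, dp_size: int, stage: int) -> float:
    """Estimate per-GPU bytes of model state under a given ZeRO stage.

    Assumes mixed-precision Adam: fp16 weights (2 B/param), fp16 gradients
    (2 B/param), and fp32 master weights + momentum + variance (12 B/param).
    """
    if dp_size < 1:
        raise ValueError("dp_size must be at least 1")
    params = 2.0 * num_params
    grads = 2.0 * num_params
    optim = 12.0 * num_params

    if stage == 0:  # no partitioning: every GPU holds everything
        return params + grads + optim
    if stage == 1:  # optimizer state partitioned across data-parallel ranks
        return params + grads + optim / dp_size
    if stage == 2:  # optimizer state and gradients partitioned
        return params + (grads + optim) / dp_size
    if stage == 3:  # optimizer state, gradients, and parameters partitioned
        return (params + grads + optim) / dp_size
    raise ValueError(f"unknown ZeRO stage: {stage}")


# Example: 1 billion parameters across 8 data-parallel GPUs.
estimates_gb = {
    stage: zero_model_state_bytes(1e9, dp_size=8, stage=stage) / 1e9
    for stage in range(4)
}
```

For this example the estimate falls from 16 GB per GPU with no partitioning to 5.5 GB at stage 1, 3.75 GB at stage 2, and 2 GB at stage 3, before counting activations.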

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Built with (1 entry)

PyTorch

Community

Tensors and Dynamic neural networks in Python with strong GPU acceleration

★ 100,318 updated 1mo ago

Pairs with (1 entry)

vLLM

Community

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 81,619 updated 1mo ago

Alternative to (1 entry)

Colossal-AI

Community

Making large AI models cheaper, faster and more accessible

★ 41,382 updated 1mo ago
