O Open Source Frameworks medium

GPT-NeoX

by Community

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Added 1 June 2026

#deepspeed-library #gpt-3 #language-model #transformers

Overview

GPT-NeoX is a framework for training large-scale autoregressive transformer models. It implements model parallelism across GPUs on top of the Megatron and DeepSpeed libraries. Built by EleutherAI, it is designed for researchers training GPT-like models at scale.
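To make "model parallelism" concrete, here is a minimal pure-Python sketch of one common form, column-wise tensor parallelism for a single linear layer. It is an illustration only, not GPT-NeoX, Megatron, or DeepSpeed code: the function names are hypothetical, the "devices" are plain list slots rather than GPUs, and the final concatenation stands in for the collective communication a real system would perform.

```python
# Illustrative sketch only: column-wise tensor parallelism for one linear layer.
# All names are hypothetical; no real GPUs or communication are involved.

def matmul(x, w):
    """Multiply x (n x k) by w (k x m), both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, num_shards):
    """Split weight matrix w into num_shards equal blocks of columns."""
    cols = len(w[0])
    assert cols % num_shards == 0, "columns must divide evenly across shards"
    width = cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]

def column_parallel_linear(x, w, num_shards):
    """Each shard computes its own slice of the output columns."""
    shards = split_columns(w, num_shards)
    partial_outputs = [matmul(x, shard) for shard in shards]  # one per "device"
    # Concatenate the column slices row by row (a gather step in a real system).
    return [sum((part[r] for part in partial_outputs), []) for r in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
full = matmul(x, w)                                   # single-device result
sharded = column_parallel_linear(x, w, num_shards=2)  # two-shard result
print(sharded == full)
```

Each shard holds only part of the weight matrix, which is what lets a layer too large for one device be spread across several.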

Best for

Researchers and engineers training custom large language models

Use cases

Training large language models from scratch
Experimenting with model parallelism techniques
Fine-tuning autoregressive transformers on custom datasets

Notes

7,432 stars on GitHub. Last updated 2026-05-19. Licensed Apache-2.0.

Pros

Enables training of very large models (tens of billions of parameters)
Leverages proven Megatron and DeepSpeed optimizations
Open source with strong community support (over 7,000 stars)

Cons

Requires substantial GPU compute infrastructure
Suited primarily to autoregressive models
Less polished than commercial offerings; may require deep engineering expertise
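As a rough illustration of why the compute requirement is substantial, the sketch below estimates the memory needed just to hold model and optimizer state during training. It assumes mixed-precision Adam at about 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer state), a commonly cited rule of thumb rather than a figure from this entry. The 20-billion-parameter model and 80 GB GPU are hypothetical inputs, and activations, buffers, and fragmentation are ignored, so real requirements are higher.

```python
import math

# Assumed rule of thumb: fp16 weights (2) + fp16 grads (2) + fp32 Adam state (12).
BYTES_PER_PARAM = 2 + 2 + 12

def training_state_gb(num_params):
    """Approximate weight, gradient, and optimizer memory in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM / 1e9

def min_gpus(num_params, gpu_memory_gb):
    """Lower bound on GPUs needed to hold training state alone, with no activations."""
    return math.ceil(training_state_gb(num_params) / gpu_memory_gb)

# Hypothetical example: a 20-billion-parameter model on 80 GB GPUs.
params = 20_000_000_000
print(training_state_gb(params), "GB of training state")
print(min_gpus(params, 80), "GPUs at minimum, before activations")
```

The result is a floor, not a sizing recommendation; it only shows why a model at this scale cannot be trained on a single device.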

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Built with (3 entries)

O OSS Obs medium

PyTorch

Community

Tensors and Dynamic neural networks in Python with strong GPU acceleration

★ 100,318 updated 1mo ago

O OSS Framework medium

Megatron-LM

Community

Ongoing research training transformer models at scale

★ 16,545 updated 1mo ago

O OSS Framework medium

DeepSpeed

Community

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

★ 42,436 updated 1mo ago

Alternative to (2 entries)

O OSS Framework medium

Colossal-AI

Community

Making large AI models cheaper, faster and more accessible

★ 41,382 updated 1mo ago

O OSS Framework medium

NeMo Framework

Community

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

★ 17,285 updated 1mo ago
