femtoGPT
by Community
Pure Rust implementation of a minimal Generative Pretrained Transformer
OSS
Added 1 June 2026
Overview
femtoGPT is a minimal Generative Pretrained Transformer implemented entirely in Rust. It provides a lightweight framework with few dependencies for training and running small GPT-style language models.
Best for
Developers and researchers who want a minimal, understandable GPT implementation in Rust for learning or small-scale experimentation.
Use cases
- Experimenting with transformer architectures in a low-level Rust environment
- Building small-scale language models for embedded or resource-constrained systems
- Learning the internals of GPT models through a clean, minimal codebase
Notes
934 stars on GitHub. Last updated 2025-10-21. Licensed MIT.
Pros
- Pure Rust with minimal dependencies, making it easy to compile and integrate
- Small and readable codebase ideal for educational exploration
- Visible community interest, with 934 GitHub stars
Cons
- Not designed for production-scale models or large datasets
- Limited documentation and examples beyond the repository itself
- Lacks advanced features like distributed training or GPU acceleration
Indexed from awesome-llm and enriched with publicly available facts about the project.