Grok-1-314B-MoE
by Community
OSS
Added 1 June 2026
Overview
Grok-1-314B-MoE is an open-source, 314-billion-parameter mixture-of-experts (MoE) model released by xAI. It is a decoder-only transformer with eight experts, of which a router activates two for each token, so only a fraction of the weights takes part in any single forward pass. The release includes the model weights and architecture for community deployment and research.
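The two-of-eight expert selection described above can be illustrated with a small top-2 routing sketch. This is a generic MoE gating pattern under stated assumptions, not xAI's implementation: the names `route_token`, `moe_layer`, and `make_expert` are hypothetical, the experts are toy functions, and renormalizing the two selected gate weights is one common choice that may differ from what Grok-1 does.

```python
import math
import random

NUM_EXPERTS = 8   # experts per MoE layer, as stated in the overview
TOP_K = 2         # experts activated per token, as stated in the overview


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and renormalize their gate weights.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    Renormalization is an assumption of this sketch.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


def moe_layer(token, router_logits, experts):
    """Sum the outputs of the selected experts, weighted by their gates."""
    output = [0.0] * len(token)
    for index, weight in route_token(router_logits):
        expert_out = experts[index](token)  # only the chosen experts run
        output = [acc + weight * y for acc, y in zip(output, expert_out)]
    return output


def make_expert(scale):
    """Toy expert that scales its input; purely illustrative."""
    return lambda x: [scale * v for v in x]


experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
print(route_token(logits))
print(moe_layer([1.0, -2.0], logits, experts))
```

Only the two selected experts are evaluated for the token, which is where the per-token compute saving comes from. All eight experts still have to be held in memory, because a different pair may be chosen for the next token.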
Best for
Researchers and teams with high-end hardware who need an extremely large open-source language model
Use cases
- Deploy the 314B MoE model for large-scale text generation tasks
- Experiment with mixture-of-experts architectures and routing mechanisms
- Run inference on high-memory GPU clusters for language understanding
Notes
Pros
- 314B total parameters, with only two of eight experts active per token, which reduces compute per token compared with a dense model of the same size
- Open-source weights enable full community access and modification
- Base checkpoint from the pre-training of xAI’s Grok-1, a strong starting point but not fine-tuned for dialogue or any specific task
Cons
- Inference requires high-end GPU clusters with substantial memory (e.g., 8x H100 or more)
- No official fine-tuning pipeline or training scripts provided
- Community maintenance may lead to slower updates and limited documentation
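The hardware figure in the cons above follows from simple arithmetic on the parameter count. The sketch below is a rough estimate, not a sizing guide: it assumes the 80 GB H100 variant, counts 1 GB as 1e9 bytes, and uses a hypothetical `overhead` fraction for activations and KV cache. The helper names `weight_memory_gb` and `min_gpus` are made up for illustration.

```python
import math

TOTAL_PARAMS = 314e9   # total parameter count stated in the overview
GPU_MEMORY_GB = 80     # assumption: 80 GB H100 variant


def weight_memory_gb(params, bytes_per_param):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def min_gpus(params, bytes_per_param, gpu_gb=GPU_MEMORY_GB, overhead=0.2):
    """Smallest GPU count whose combined memory fits weights plus overhead.

    `overhead` is a hypothetical fraction reserved for activations and KV cache.
    """
    needed = weight_memory_gb(params, bytes_per_param) * (1 + overhead)
    return math.ceil(needed / gpu_gb)


for label, nbytes in [("bf16", 2), ("int8", 1)]:
    print(label, weight_memory_gb(TOTAL_PARAMS, nbytes), "GB weights,",
          min_gpus(TOTAL_PARAMS, nbytes), "GPUs with 20% overhead")
```

At two bytes per parameter the weights alone come to about 628 GB, which is why eight 80 GB GPUs is close to the floor at that precision and leaves little headroom. The MoE design lowers compute per token but not this memory requirement, since every expert must stay loaded.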
Indexed from awesome-llm and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
vLLM
Community
A high-throughput and memory-efficient inference and serving engine for LLMs
llama.cpp
Community
LLM inference in C/C++
ollama
Community
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.