Qwen2.5 Technical Report
by Community
In this report, we introduce Qwen2.5, a comprehensive series of large language models (LLMs) designed to meet diverse needs. Compared to previous iterations, Qwen 2.5 has been si
OSS
Added 1 June 2026
Overview
The Qwen2.5 Technical Report describes a series of large language models pre-trained on 18 trillion tokens, up from 7 trillion in the previous iteration, and refined through supervised fine-tuning on over 1 million samples. It documents improvements in common sense, expert knowledge, and reasoning capabilities achieved during both the pre-training and post-training stages.
Best for
Researchers and developers evaluating large language model capabilities and training strategies
Use cases
- Assessing model performance and scalability for language tasks
- Comparing pre-training and post-training strategies across LLM families
- Guiding decisions on model selection for research or development projects
Notes
Pros
- Documents the scale-up of pre-training data from 7 trillion to 18 trillion tokens and the improvements that accompany it
- Covers both pre-training and post-training methodologies in detail
- Openly available as a community resource for benchmarking and education
Cons
- A technical report, not a deployable tool or framework
- Does not include inference benchmarks or deployment guidance
- Focuses on model architecture and training, not on practical usage or API access
Indexed from awesome-llm and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
vLLM
Community
A high-throughput and memory-efficient inference and serving engine for LLMs
ollama
Community
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
FastChat
Community
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.