Awesome-LLM-Compression
by Community
Awesome LLM compression research papers and tools.
OSS
Added 1 June 2026
Overview
Awesome-LLM-Compression is a community-maintained curated list of research papers and tools for compressing large language models. It groups resources by technique, including pruning, quantization, knowledge distillation, and parameter sharing, for easy reference.
Best for
Researchers and engineers exploring LLM compression for efficient deployment
Use cases
- Discovering state-of-the-art LLM compression methods and benchmarks
- Comparing different compression techniques for model deployment
- Staying updated on recent academic work and open-source tools
Notes
1,840 stars on GitHub. Last updated 2026-02-23. Licensed MIT.
Pros
- Comprehensive collection of papers and tools in one place
- High community visibility with 1,840 GitHub stars
- Free and open source under the MIT license, with ongoing community updates
Cons
- Not a tool itself; requires manual evaluation of listed resources
- No built-in code or implementation for immediate use
- Quality of linked tools may vary, since entries are listed without further vetting
Indexed from awesome-llm and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
TensorRT-LLM
Community
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NV
llama.cpp
Community
LLM inference in C/C++
vLLM
Community
A high-throughput and memory-efficient inference and serving engine for LLMs
DeepSpeed
Community
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
unslothai
Community
Unsloth Studio is a web UI for training and running open models such as Gemma 4, Qwen3.6, DeepSeek, and gpt-oss locally.