Chinese Large Model Leaderboard

by Community

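NoneLinear (非线智能) - ReLE evaluation: a continuously updated capability benchmark for Chinese AI large models. It currently covers 374 large models, including commercial models such as chatgpt, gpt-5.4, Google gemini-3.1-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3.6-max, qwen3.6-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as st…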

OSS

Added 1 June 2026

#agentic-ai #artificial-intelligence #llm-agent #llm-evaluation

Overview

A community-maintained benchmark for Chinese large language models, covering 374 commercial and open-source models including GPT, Gemini, Claude, ERNIE, Qwen, and others. It provides a continuously updated leaderboard and a defect library with over 2 million entries for analysis and improvement.

Best for

Developers and researchers evaluating Chinese large language models.

Use cases

Compare the performance of multiple Chinese LLMs side by side
Identify common defects and weaknesses in large language models
Track benchmark trends and model improvements over time

Notes

6,103 stars on GitHub. Last updated 2026-05-30.

Pros

Covers a wide range of both proprietary and open-source Chinese LLMs
Includes a large defect library for deeper analysis
Regularly updated with community contributions

Cons

Focused on Chinese language models, limiting global applicability
Evaluation methodology is community-driven, not formally peer-reviewed
Interface and documentation are primarily in Chinese

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Alternative to: 2 entries

OSS Framework, medium

OpenAI Evals

Community

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

★ 18,584 stars, updated 3 months ago

OSS Framework, medium

lm-evaluation-harness

Community

A framework for few-shot evaluation of language models.

★ 12,772 stars, updated 2 months ago
