CodeGeeX

by Community

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

Added 1 June 2026

#code-generation #pretrained-models #tools

Overview

CodeGeeX is an open-source multilingual code generation model introduced at KDD 2023. It generates code in multiple programming languages based on natural language descriptions or partial code inputs. The model is trained on a large corpus of code and text, and can be run locally or integrated into development workflows.

Best for

Developers who want a free, self-hosted code assistant for multi-language projects.

Use cases

  • Autocompleting code snippets during development
  • Generating boilerplate or repetitive code from comments
  • Translating natural language descriptions into executable code
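
As a minimal sketch of the comment-driven use cases above, the helper below turns a natural-language description into a comment-style prompt that a code model can continue. The names `build_prompt`, `complete`, and `COMMENT_PREFIX` are hypothetical and not part of CodeGeeX. The leading "language:" tag line is an assumption about a workable prompt format, so check the repository for the exact format the model expects. The `generate` argument stands in for whatever local model call you use.

```python
# Hypothetical prompt helper; none of these names come from CodeGeeX itself.
from typing import Callable

# Line-comment prefix per language (illustrative subset, easy to extend).
COMMENT_PREFIX = {
    "Python": "#",
    "JavaScript": "//",
    "Go": "//",
    "Java": "//",
    "C++": "//",
}


def build_prompt(description: str, language: str = "Python", partial_code: str = "") -> str:
    """Render a description as comment lines, optionally followed by partial code."""
    if language not in COMMENT_PREFIX:
        raise ValueError(f"unsupported language: {language}")
    prefix = COMMENT_PREFIX[language]
    # Assumed convention: a language tag line first, then the description.
    lines = [f"{prefix} language: {language}"]
    for raw in description.strip().splitlines():
        text = raw.strip()
        if text:
            lines.append(f"{prefix} {text}")
    return "\n".join(lines) + "\n" + partial_code


def complete(prompt: str, generate: Callable[[str], str]) -> str:
    """Append the model's continuation to the prompt.

    `generate` is any callable mapping a prompt string to generated text,
    for example a thin wrapper around a locally hosted model.
    """
    return prompt + generate(prompt)
```

Passing a stub such as `lambda p: "def bubble_sort(items): ..."` as `generate` lets you exercise the prompt layout before wiring in a real model.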

Notes

8,791 stars on GitHub. Last updated 2024-08-13. Licensed Apache-2.0.

Pros

  • Open source and free to use, self-hostable for privacy
  • Supports multiple programming languages beyond Python
  • Backed by an academic publication (KDD 2023) and an active community with 8.8k GitHub stars

Cons

  • May require significant local compute resources to run efficiently
  • Code quality can be inconsistent compared to larger proprietary models
  • Documentation and deployment guides are limited to the repository

Indexed from the awesome-llmops list and enriched with the project's public facts.
