BLOOMZ & mT0

by Community

Added 1 June 2026

Overview

BLOOMZ and mT0 are open-source multilingual language models from the BigScience community. BLOOMZ is fine-tuned from BLOOM, and mT0 from mT5, on the same multitask mixture covering 46 languages. Both target zero-shot cross-lingual generalization: they learn from English instruction data and apply what they learn to tasks in languages not seen during fine-tuning.

Best for

Researchers and developers building multilingual NLP tools for languages with scarce labeled data

Use cases

  • Zero-shot text classification in languages without labeled training data
  • Few-shot natural language inference across multiple languages
  • Multilingual question answering without task-specific fine-tuning
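The zero-shot classification use case can be sketched as follows. This is a minimal illustration, not a recommended recipe: the prompt wording, the helper names (build_prompt, parse_label, classify), and the choice of the small bigscience/bloomz-560m checkpoint are assumptions made for the example. It pairs an English instruction with input text in another language and maps the generated text back to one of the candidate labels, using the Hugging Face transformers library.

```python
from typing import Optional, Sequence


def build_prompt(text: str, labels: Sequence[str]) -> str:
    """Wrap input text (in any language) in an English instruction."""
    options = ", ".join(labels)
    return f"Classify the following text as one of: {options}.\nText: {text}\nLabel:"


def parse_label(generated: str, labels: Sequence[str]) -> Optional[str]:
    """Return the label mentioned earliest in the generated text, or None."""
    lowered = generated.lower()
    hits = [(lowered.find(label.lower()), label)
            for label in labels if label.lower() in lowered]
    return min(hits)[1] if hits else None


def classify(text: str, labels: Sequence[str],
             model_name: str = "bigscience/bloomz-560m") -> Optional[str]:
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(build_prompt(text, labels), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=5)
    # Decoder-only models echo the prompt, so decode only the new tokens.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return parse_label(tokenizer.decode(new_tokens, skip_special_tokens=True), labels)


if __name__ == "__main__":
    # Downloads the checkpoint on first run; the output depends on the model.
    print(classify("Este producto es excelente.", ["positive", "negative"]))
```

For an mT0 checkpoint, which is an encoder-decoder model, the same sketch would load AutoModelForSeq2SeqLM instead and decode the full generated sequence, since the prompt is not echoed in the output.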

Pros

  • Open-source and community-driven, with a transparent training process
  • Supports 46 languages, including low-resource ones
  • Strong zero-shot performance on unseen languages and tasks

Cons

  • Large model sizes (up to 176B parameters) require substantial compute
  • Performance varies significantly between high-resource and low-resource languages
  • Limited documentation, and the larger checkpoints have higher inference latency than smaller models
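To give a rough sense of the compute point above, the memory needed just to hold the weights can be estimated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation only. It ignores activations, the key-value cache, and framework overhead, and the function name and dtype table are illustrative assumptions.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Estimate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        # 176e9 is the largest parameter count mentioned in the list above.
        print(f"{dtype}: {weight_memory_gb(176e9, dtype):.0f} GB")
```

At 16-bit precision the 176B-parameter model therefore needs about 352 GB for weights alone, which is why the largest checkpoint is typically sharded across several accelerators.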

Indexed from awesome-llm and enriched against its public facts.
