O Open Source Frameworks medium

MMToM-QA

by Community

Leaderboard for the MMToM-QA benchmark (Jin et al., ACL 2024).

Visit Community View repo Submit your build →

OSS

MMToM-QA

Added 1 June 2026

Overview

A community-maintained leaderboard for the MMToM-QA benchmark introduced by Jin et al. at ACL 2024. It tracks and compares model performance on a multimodal question answering task.

Best for

Best for
Researchers and practitioners evaluating multimodal QA models on theory-of-mind reasoning tasks

Use cases

Benchmarking multimodal QA models against a standardized evaluation
Comparing model performance on theory-of-mind related questions
Tracking progress and reproducibility across research efforts

Notes

A community-maintained leaderboard for the MMToM-QA benchmark introduced by Jin et al. at ACL 2024. It tracks and compares model performance on a multimodal question answering task.

Use cases

Benchmarking multimodal QA models against a standardized evaluation
Comparing model performance on theory-of-mind related questions
Tracking progress and reproducibility across research efforts

Pros

Provides a centralized and transparent evaluation for the benchmark
Open and community-driven, allowing easy contributions
Directly tied to an ACL 2024 publication for referenced methodology

Cons

Limited to the specific MMToM-QA benchmark and its task definition
May lack active maintenance or frequent updates from the community
Not a general-purpose framework beyond the leaderboard function

Indexed from awesome-llm and enriched against its public facts.

Pros

Provides a centralized and transparent evaluation for the benchmark
Open and community-driven, allowing easy contributions
Directly tied to an ACL 2024 publication for referenced methodology

Cons

Limited to the specific MMToM-QA benchmark and its task definition
May lack active maintenance or frequent updates from the community
Not a general-purpose framework beyond the leaderboard function

Pairs with

Other entries in the index that connect to this one. Click through to see the chain.

Built with1entry

O OSS Obs medium

PyTorch

Community

Tensors and Dynamic neural networks in Python with strong GPU acceleration

★ 100,318 updated 1mo ago

Pairs with1entry

O OSS Framework medium

lm-evaluation-harness

Community

A framework for few-shot evaluation of language models.

★ 12,772 updated 2mo ago

Free 27-page guide

Get the free Developer’s Field Guide

A 27-page field guide to the AI coding workflow with Claude. Claude Code, MCP servers, the prompt patterns that work, and what to delegate. Free.

Enter your work email. We send it straight over, plus a few short notes worth knowing. Unsubscribe any time.

← Back to Open Source Submit your own entry →