TAT-DQA

by Community

TAT-DQA: a document visual question answering (VQA) dataset for answering questions over visually rich financial documents that mix tabular and textual content.

Added 1 June 2026

Overview

TAT-DQA is a document visual question answering (VQA) dataset for answering questions over visually rich financial documents. Its documents combine tabular and textual content, so it benchmarks how well models understand hybrid document layouts.

Best for

Researchers and developers building document understanding systems for financial reports and invoices

Use cases

  • Training and evaluating document VQA models on financial reports
  • Extracting structured answers from tables and text in invoices or filings
  • Benchmarking multimodal understanding of scanned or digital documents
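
For the evaluation use case above, a minimal scoring sketch can help make "evaluating a document VQA model" concrete. The code below computes exact match and token-level F1 over (prediction, gold answer) pairs, a convention common to QA benchmarks in general. It is not the official TAT-DQA scoring script, and the function names, the normalization rules, and the sample pairs are all illustrative assumptions rather than part of the dataset.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop thousands separators, and collapse whitespace.

    Assumption: this normalization is illustrative only; a real benchmark
    script may treat units, scales, and signs differently.
    """
    text = text.lower().replace(",", "")
    # Keep word characters, whitespace, and symbols that often appear in figures.
    text = re.sub(r"[^\w\s.%$-]", " ", text)
    return " ".join(text.split())


def exact_match(pred: str, gold: str) -> float:
    """Return 1.0 when the normalized strings are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))


def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer."""
    pred_tokens = normalize(pred).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs):
    """Average both metrics over a non-empty list of (prediction, gold) pairs."""
    n = len(pairs)
    return {
        "em": sum(exact_match(p, g) for p, g in pairs) / n,
        "f1": sum(token_f1(p, g) for p, g in pairs) / n,
    }


if __name__ == "__main__":
    # Hypothetical predictions and gold answers, not taken from the dataset.
    sample = [
        ("1,234.5", "1234.5"),
        ("net revenue increased", "revenue increased"),
    ]
    print(evaluate(sample))
```

Exact match rewards only fully correct answers, which suits numeric answers read from tables, while token F1 gives partial credit for text spans. Reporting both is a reasonable default when a benchmark mixes the two answer types.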

Notes

Pros

  • Specialized for finance with real-world tabular and textual content
  • Community-maintained, enabling open research and reproducibility

Cons

  • Domain-specific to finance, limiting generalizability to other document types
  • Requires domain knowledge for effective use and interpretation of results

Indexed from awesome-llm and enriched against its public facts.