TAT-DQA
by Community
TAT-DQA: a Document Visual Question Answering (VQA) dataset for answering questions over visually rich finance documents that combine tabular and textual content
OSS
Added 1 June 2026
Overview
TAT-DQA is a document visual question answering dataset for visually rich financial documents. Its documents combine tabular and textual content, which makes it a benchmark for how well models understand hybrid document layouts.
Best for
Researchers and developers building document understanding systems for financial reports and invoices
Use cases
- Training and evaluating document VQA models on financial reports
- Extracting structured answers from tables and text in invoices or filings
- Benchmarking multimodal understanding of scanned or digital documents
Notes
Pros
- Specialized for finance with real-world tabular and textual content
- Community-maintained, enabling open research and reproducibility
Cons
- Domain-specific to finance, limiting generalizability to other document types
- Requires domain knowledge for effective use and interpretation of results
Indexed from awesome-llm and enriched against its public facts.