Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering

The Mintaka QA dataset contains 20K English question-answer pairs linked to Wikidata, plus 160K additional questions obtained by translating the English questions into eight other languages. Because Mintaka provides only question-answer pairs, the authors evaluated models that can be trained end-to-end. The table below shows the evaluation results for three language models, three knowledge graph-based models, and two retriever-reader models; rows are sorted by hits@1 within each language setting rather than grouped by model family. The train, dev, and test sets can be found here.

Results of baseline models on Mintaka

| Model | Year | hits@1 | Language | Reported By |
| --- | --- | --- | --- | --- |
| XL-T5 (fine-tuned) | 2022 | 0.38 | English | Sen et al., 2022 |
| DPR (trained) | 2022 | 0.31 | English | Sen et al., 2022 |
| T5 | 2022 | 0.28 | English | Sen et al., 2022 |
| T5 for CBQA (zero-shot) | 2022 | 0.20 | English | Sen et al., 2022 |
| Rigel | 2022 | 0.20 | English | Sen et al., 2022 |
| EmbedKGQA | 2022 | 0.18 | English | Sen et al., 2022 |
| DPR (zero-shot) | 2022 | 0.15 | English | Sen et al., 2022 |
| KVMemNet | 2022 | 0.12 | English | Sen et al., 2022 |
| T5 for CBQA (translated) | 2022 | 0.31 | Multilingually | Sen et al., 2022 |
| Rigel | 2022 | 0.19 | Multilingually | Sen et al., 2022 |
| MT5 | 2022 | 0.16 | Multilingually | Sen et al., 2022 |
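
The hits@1 column is the fraction of questions for which a model's top-ranked prediction matches a gold answer. The sketch below illustrates that computation under stated assumptions: the function name `hits_at_1`, the case-insensitive exact-match rule, and the sample data are all hypothetical, and the matching rules used by Sen et al. may differ.

```python
from typing import Iterable, Sequence


def hits_at_1(predictions: Sequence[str],
              gold_answers: Sequence[Iterable[str]]) -> float:
    """Return the fraction of questions whose top prediction is a gold answer.

    Assumption: a prediction counts as a hit when it equals any gold answer
    after stripping whitespace and lowercasing. This is an illustrative
    matching rule, not necessarily the one used in the Mintaka paper.
    """
    if len(predictions) != len(gold_answers):
        raise ValueError("predictions and gold_answers must have equal length")
    if not predictions:
        return 0.0

    hits = 0
    for prediction, gold in zip(predictions, gold_answers):
        normalized_gold = {answer.strip().lower() for answer in gold}
        if prediction.strip().lower() in normalized_gold:
            hits += 1
    return hits / len(predictions)


# Hypothetical top-1 predictions and gold answer sets for three questions.
sample_predictions = ["Q76", "Paris", "7"]
sample_gold = [{"Q76"}, {"London"}, {"7", "seven"}]

score = hits_at_1(sample_predictions, sample_gold)
print(f"hits@1 = {score:.2f}")
```

Here two of the three sample predictions match a gold answer, so the score is 2/3. Representing each gold answer as a set allows questions with several acceptable answers.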

Reference

Sen, P., Aji, A. F., & Saffari, A. (2022). Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering. Proceedings of the 29th International Conference on Computational Linguistics, 1604–1619. https://aclanthology.org/2022.coling-1.138