A Question Answering System for Situation Puzzles with SPQA
DOI: https://doi.org/10.54097/fcis.v4i2.10203

Keywords: NLP, Question Answering, Situation Puzzle, UnifiedQA, UnifiedQA-v2, SPQA

Abstract
Many question answering (QA) systems have been built to solve QA tasks. In 2020 and 2022, the Allen Institute for AI and the University of Washington proposed UnifiedQA and UnifiedQA-v2. Their core concept is that the semantic understanding and reasoning capabilities required of a model are shared across tasks, so format-specific models may be unnecessary even though QA tasks come in different forms. Building on this concept, I develop a new QA model named SPQA, which aims to answer situation puzzle questions by training on a new situation-puzzle dataset (SpQ). In addition, I evaluate the performance of SPQA and UnifiedQA-v2 under both fine-tuning and prompt-tuning. The fine-tuning results indicate that the SpQ dataset is important for answering situation puzzle questions well under both fine-tuning and prompt-tuning, but that it also degrades performance on ordinary yes/no questions. The prompt-tuning results further indicate that, at the same data scale, the effect of SpQ is larger and more significant on both situation puzzle questions and ordinary yes/no questions. Future work should consider building a larger SpQ dataset.
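UnifiedQA's central idea is that every QA format can be cast into a single text-to-text encoding, so one model serves all formats. As a minimal sketch of that idea, the helper below encodes a yes/no probe about a situation puzzle using a newline-style separator between question and context, following the UnifiedQA papers' preprocessing convention; the exact separator token and the puzzle text here are illustrative assumptions, not items from the SpQ dataset.

```python
def unifiedqa_encode(question: str, context: str = "") -> str:
    """Encode a QA item in a UnifiedQA-style text-to-text input format.

    Assumptions (for illustration only): the question and context are
    joined with a literal "\\n" separator and the whole input is
    lowercased, as in the UnifiedQA papers' preprocessing.
    """
    text = question.strip()
    if context:
        text += " \\n " + context.strip()
    return text.lower()


# An illustrative situation puzzle (not from SpQ) and one yes/no probe
# a player might ask while trying to reconstruct the hidden story.
puzzle = ("A man walks into a restaurant, orders albatross soup, "
          "takes one bite, and then leaves in despair. Why?")
probe = "Had the man eaten albatross soup before?"

encoded = unifiedqa_encode(probe, puzzle)
print(encoded)
```

Because the encoding is format-agnostic, the same function would serve extractive, multiple-choice, or yes/no items alike; only the context string changes, which is what lets a single model be fine-tuned or prompt-tuned across all of them.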
References
Hartman, J. (1998). Rec.puzzles archive, 27 Aug 1998. http://www.kith.org/logos/things/sitpuz/lateral.html.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
McCann, B., Keskar, N. S., Xiong, C., & Socher, R. (2018). The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730.
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., & Huang, T. S. (2018). Generative image inpainting with contextual attention. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5505-5514).
Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., ... & Zettlemoyer, L. (2019). Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
Zhang, Y., Sun, S., Galley, M., Chen, Y. C., Brockett, C., Gao, X., ... & Dolan, B. (2019). Dialogpt: Large-scale generative pre-training for conversational response generation. arXiv preprint arXiv:1911.00536.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21(140), 1-67.
Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2021). Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. arXiv preprint arXiv:2107.13586.
Khashabi, D., Min, S., Khot, T., Sabharwal, A., Tafjord, O., Clark, P., & Hajishirzi, H. (2020). Unifiedqa: Crossing format boundaries with a single qa system. arXiv preprint arXiv:2005.00700.
Khashabi, D., Kordi, Y., & Hajishirzi, H. (2022). Unifiedqa-v2: Stronger generalization via broader cross-format training. arXiv preprint arXiv:2202.12359.
Budzianowski, P., & Vulić, I. (2019). Hello, it's GPT-2--how can I help you? towards the use of pretrained language models for task-oriented dialogue systems. arXiv preprint arXiv:1907.05774.
Floridi, L., & Chiriatti, M. (2020). GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30(4), 681-694.
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., ... & Lowe, R. (2022). Training language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155.
OpenAI. (2023). GPT-4 Technical Report. https://openai.com/product/gpt-4.
Lester, B., Al-Rfou, R., & Constant, N. (2021). The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691.


