A question answering (QA) system processes and answers a given question. Although question answering has been studied extensively in recent years, little is known about the effects of imbalanced datasets on answer validation in question answering systems. The objective of this paper is to provide a better understanding of how models built on imbalanced datasets affect answer validation in a real-world question answering system for university entrance exams. We propose a question answering system and provide a comprehensive analysis of models built on imbalanced and balanced datasets for answer validation, using the development and test datasets of the NTCIR-12 QA-Lab2 English translations of Japanese university entrance exams. Our system achieved 90% accuracy on the NTCIR-12 QA-Lab2 development datasets with the machine learning model built on imbalanced datasets.
Proceedings of the 2016 IEEE 17th International Conference on Information Reuse and Integration (IEEE IRI 2016)