    Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/111150


    Title: 不平衡資料集應用於問答系統答案驗證之研究
    Other Titles: A study on imbalanced dataset of answer validation for question answering system
    Authors: 蔡承家;Tsai, Cheng-Chia
    Contributors: Master's Program, Department of Information Management, Tamkang University (淡江大學資訊管理學系碩士班)
    戴敏育
    Keywords: Machine learning; Imbalanced Dataset; Question Answering; Support Vector Machine; Answer Validation; University entrance examination; QA-Lab
    Date: 2016
    Issue Date: 2017-08-24 23:45:15 (UTC+8)
    Abstract: Question answering (QA) addresses the task of answering a given question: through machine reading, the system must understand the question before it responds. A QA system usually consists of four stages: Question Analysis, Document Retrieval, Answer Extraction, and Answer Validation.
    Although a considerable number of studies have examined QA systems, little is known about how imbalanced and balanced datasets compare in the Answer Validation stage. The purpose of this study is to provide a comprehensive analysis of imbalanced and balanced datasets through machine learning.
    The study uses the NTCIR-12 QA-Lab2 dataset of Japanese university entrance examination questions on the subject of World History. Unlike the datasets used for earlier QA systems, this one requires the system to first understand a short passage and only then answer the related questions that follow.
    The study proposes a number of models for both the imbalanced and the balanced datasets, built using f.select, parameter optimization, and cross-validation. The results show that the best model achieved an accuracy of 90% on the imbalanced dataset.
    The main contributions of this study are a question answering system for Japanese university entrance exams and evidence, obtained in the Answer Validation stage, that the model built on the imbalanced dataset outperformed the model built on the balanced dataset.
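    The abstract does not say how the balanced dataset was derived or what the class ratio was. As a minimal sketch only, the Python below assumes a hypothetical setting with four candidate answers per question (one correct, three incorrect) and uses random undersampling, one common way to balance classes, which is not necessarily the method used in the thesis. The names `undersample` and `majority_baseline_accuracy` are illustrative. The sketch shows why accuracy on an imbalanced set should be read against the majority-class baseline.

```python
import random
from collections import Counter


def undersample(examples, seed=0):
    """Randomly drop examples so every label occurs as often as the rarest one."""
    rng = random.Random(seed)
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append((features, label))
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def majority_baseline_accuracy(examples):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(label for _, label in examples)
    return max(counts.values()) / len(examples)


# Hypothetical toy data (an assumption, not the thesis data):
# 25 questions, 4 candidate answers each, candidate 0 is the correct one.
imbalanced = [((q, c), 1 if c == 0 else 0) for q in range(25) for c in range(4)]
balanced = undersample(imbalanced)

print(len(imbalanced), majority_baseline_accuracy(imbalanced))  # 100 0.75
print(len(balanced), majority_baseline_accuracy(balanced))      # 50 0.5
```

    Under these assumptions a classifier that always answers "incorrect" already scores 75% on the imbalanced set but only 50% on the balanced one, so accuracy figures from the two settings are not directly comparable without that baseline.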
    Appears in Collections:[Graduate Institute & Department of Information Management] Thesis

