Tamkang University Institutional Repository: Item 987654321/111150
    Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/111150


    Title: 不平衡資料集應用於問答系統答案驗證之研究
    Other Titles: A study on imbalanced dataset of answer validation for question answering system
    Authors: 蔡承家;Tsai, Cheng-Chia
    Contributors: Tamkang University, Master's Program, Department of Information Management; 戴敏育
    Keywords: Machine learning; Imbalanced dataset; Question answering; Support vector machine; Answer validation; University entrance examination; QA-Lab
    Date: 2016
    Uploaded: 2017-08-24 23:45:15 (UTC+8)
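    Abstract: Question answering addresses the task of answering a given question: through machine reading, the system comes to understand the question and then answers it. A question answering system usually comprises question analysis, document retrieval, answer extraction, and answer validation.
      The literature contains a considerable number of studies on question answering systems, but the imbalanced and balanced datasets used for answer validation have not been examined in depth. This study aims to analyze imbalanced and balanced datasets thoroughly through machine learning.
      This study uses the NTCIR-12 QA-Lab2 dataset of Japanese university entrance exam questions on world history. It differs from earlier question answering datasets in that the system must first understand a short passage before it can answer the related questions that follow.
      This study proposes a number of models for the imbalanced and balanced datasets. After parameter optimization and cross-validation, the experimental results show that the best model on the imbalanced dataset reached an accuracy of 90%. The main contributions of this thesis are a question answering system and evidence, obtained by comparing imbalanced and balanced datasets at the answer validation stage, that models built on the imbalanced dataset show higher significance.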
    A question answering system processes a given question and returns an answer. Such a system usually consists of four stages: question analysis, document retrieval, answer extraction, and answer validation.
    Although question answering systems have been studied extensively, little is known about how imbalanced and balanced datasets affect answer validation in question answering.
    The purpose of this study is to provide a comprehensive analysis of imbalanced and balanced datasets through machine learning.
    This study uses the NTCIR-12 QA-Lab2 dataset of Japanese university entrance exams on the subject of "World History". Unlike the datasets used in earlier question answering work, this one requires the system to first understand a context passage and then answer the related questions that follow.
    The study builds a number of models on imbalanced and balanced datasets using f.select and cross-validation. The results show that the best-performing model, built on the imbalanced dataset, achieved an accuracy of 90%.
    The main contributions of this study are a question answering system for Japanese university entrance exams and evidence that the model built on the imbalanced dataset outperformed the model built on the balanced dataset for answer validation.
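
The comparison between imbalanced and balanced datasets described in the abstract can be illustrated with a minimal sketch. This record does not say how the balanced dataset was constructed or how the cross-validation folds were drawn, so the random undersampling and stratified splitting below are assumptions. The function names `undersample` and `stratified_folds` and the toy data (four candidate answers per question, one of them correct) are also hypothetical. The thesis's actual features, SVM training, and `f.select` step are not reproduced here.

```python
import random
from collections import Counter


def undersample(examples, seed=0):
    """Randomly drop examples until every label is as frequent as the rarest one."""
    rng = random.Random(seed)
    by_label = {}
    for example in examples:
        by_label.setdefault(example[1], []).append(example)
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], smallest))
    rng.shuffle(balanced)
    return balanced


def stratified_folds(examples, k=5, seed=0):
    """Split examples into k folds that each keep roughly the overall label ratio."""
    rng = random.Random(seed)
    by_label = {}
    for example in examples:
        by_label.setdefault(example[1], []).append(example)
    folds = [[] for _ in range(k)]
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Deal each label's examples round-robin so every fold gets its share.
        for i, example in enumerate(group):
            folds[i % k].append(example)
    return folds


# Hypothetical toy data: 50 questions, 4 candidate answers each, 1 correct.
# Each example is (features, label); the features are placeholders only.
toy = []
for question_id in range(50):
    for candidate_id in range(4):
        label = 1 if candidate_id == 0 else 0
        toy.append(((question_id, candidate_id), label))

imbalanced = toy                 # keeps the natural 3:1 label ratio
balanced = undersample(toy)      # equal numbers of correct and incorrect
folds = stratified_folds(imbalanced, k=5)

print("imbalanced:", Counter(label for _, label in imbalanced))
print("balanced:  ", Counter(label for _, label in balanced))
print("fold sizes:", [len(fold) for fold in folds])
```

In a setup like this, each fold in turn would serve as the held-out set while a classifier is trained on the other four, once with the imbalanced examples and once with the balanced ones, so that the two settings are evaluated in the same way.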
    Appears in Collections: [Department of Information Management and Graduate Institute] Theses and Dissertations

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    128

    All items in the institutional repository are protected by their original copyright.
