    Please use this permanent URL to cite or link to this document: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/111150


    Title: 不平衡資料集應用於問答系統答案驗證之研究
    Other title: A study on imbalanced dataset of answer validation for question answering system
    Author: 蔡承家; Tsai, Cheng-Chia
    Contributors: Tamkang University, Master's Program, Department of Information Management
    戴敏育
    Keywords: Machine learning; Imbalanced dataset; Question answering; Support vector machine; Answer validation; University entrance examination; QA-Lab
    Date: 2016
    Upload time: 2017-08-24 23:45:15 (UTC+8)
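    Abstract: Question answering addresses the task of answering a given question after the system has understood it through machine reading. A question answering system usually comprises question analysis, document retrieval, answer extraction, and answer validation.
      Although the literature contains a considerable amount of research on question answering systems, it has not examined in depth the use of imbalanced versus balanced datasets for answer validation. This study aims to analyze imbalanced and balanced datasets thoroughly through machine learning.
      This study uses the NTCIR-12 QA-Lab2 dataset of Japanese university entrance examination questions on world history. Unlike the datasets used by earlier question answering systems, it requires the system to understand a short passage first before it can answer the related questions that follow.
      This study proposes a number of models for the imbalanced and balanced datasets. After parameter optimization and cross-validation, the experimental results show that the best model on the imbalanced dataset reached an accuracy of 90%. The main contributions of this thesis are a question answering system and evidence, obtained with imbalanced and balanced datasets at the answer validation stage, that the model built from the imbalanced dataset shows higher significance.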
    A question answering system processes and answers a given question. Such a system usually consists of four stages: question analysis, document retrieval, answer extraction, and answer validation.
    Although a considerable number of studies have addressed question answering systems, little is known about the effect of imbalanced versus balanced datasets on answer validation in question answering.
    The purpose of this paper is to provide a comprehensive analysis of imbalanced and balanced datasets through machine learning.
    In this paper, we used datasets from the NTCIR-12 QA-Lab2 Japanese university entrance exams on the subject of "World History". These datasets differ from previous ones in that the system must first understand a context passage provided in the dataset and then answer the related questions that follow.
    The study presents a number of models built on imbalanced and balanced datasets, using f.select and cross-validation. The results show that the best performance of our system was an accuracy of 90%, achieved by an imbalanced-dataset model.
    The main contributions of this study are a question answering system for Japanese university entrance exams and evidence that the imbalanced-dataset model outperformed the balanced-dataset model for answer validation.
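
The abstract contrasts models built on imbalanced and balanced datasets but does not state the class ratio. As a rough illustration only, the sketch below builds a hypothetical imbalanced set of answer candidates, derives a balanced set by random undersampling, and computes the accuracy of a majority-class baseline on each. The 1:3 ratio (one correct choice out of four), the function names, and the use of undersampling are all assumptions for illustration; the sketch does not reproduce the thesis's models, its f.select step, or its data.

```python
import random
from collections import Counter


def make_candidates(n_questions, n_choices=4, seed=0):
    """Build hypothetical (question_id, choice_id, label) rows.

    Assumption: each question has exactly one correct choice (label 1)
    and the rest are incorrect (label 0), which makes the set imbalanced.
    """
    rng = random.Random(seed)
    rows = []
    for q in range(n_questions):
        correct = rng.randrange(n_choices)
        for c in range(n_choices):
            rows.append((q, c, 1 if c == correct else 0))
    return rows


def undersample(rows, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    by_label = {0: [], 1: []}
    for row in rows:
        by_label[row[2]].append(row)
    n_minority = min(len(by_label[0]), len(by_label[1]))
    balanced = []
    for label_rows in by_label.values():
        balanced.extend(rng.sample(label_rows, n_minority))
    rng.shuffle(balanced)
    return balanced


def majority_baseline_accuracy(rows):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(label for _, _, label in rows)
    return max(counts.values()) / len(rows)


imbalanced = make_candidates(n_questions=100)
balanced = undersample(imbalanced)

print(majority_baseline_accuracy(imbalanced))  # 0.75 under the assumed 1:3 ratio
print(majority_baseline_accuracy(balanced))    # 0.5
```

Under these assumed proportions, always answering "incorrect" already scores 75% on the imbalanced set and 50% on the balanced one, so accuracy figures from the two settings are best read against their respective baselines.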
    Appears in collections: [Department of Information Management and Graduate Institute] Theses and Dissertations

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    54

    All items in the institutional repository are protected by the original copyright.
