Tamkang University Institutional Repository: Item 987654321/101613
    Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/101613


    Title: 應用機器學習與多辭典的中英雙語意見分析之研究
    Other Titles: A study of applying machine learning with multi-dictionary for bilingual opinion analysis
    Authors: 謝衫蒂;Hsieh, Shan-Ti
    Contributors: 淡江大學資訊管理學系碩士在職專班 (Tamkang University, Department of Information Management, In-service Master's Program)
    戴敏育;Day, Min-Yuh
    Keywords: Opinion Analysis; Opinion Mining; Sentiment Analysis; Sentiment Dictionary; Machine Learning
    Date: 2014
    Issue Date: 2015-05-01 16:12:09 (UTC+8)
    Abstract: Opinion analysis uses the theory and computational techniques of natural language processing to determine the subjective orientation expressed in opinion texts and sentences on the Internet. Chinese reviews occasionally mix Chinese and English. However, prior research has given little attention to integrated methods for text in which different languages coexist.

    This study proposes an approach to Chinese-English bilingual opinion analysis. It designs a bilingual sentiment dictionary, evaluates the individual sentiment dictionaries together with syntactic features, classifies reviews with machine learning, and finally applies feature selection to obtain an optimized feature set.

    The experimental results show that the choice of sentiment dictionaries to combine affects classification performance. When the bilingual approach is applied to a Chinese corpus with the optimized feature set of 21 features, the overall accuracy reaches 74.98% in cross-validation and 77.10% in the open test. The study also examines the proportion of English data in the Chinese corpus: the higher that proportion, the greater the influence of the bilingual approach.

    The contributions of this thesis are threefold: (1) extracting a domain-specific Chinese sentiment word list for cosmetics and skincare reviews, (2) comparing the effects of different combinations of sentiment dictionaries, and (3) showing that bilingual opinion orientation evaluation can improve the performance of machine learning on Chinese reviews.
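The dictionary-based feature step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the four toy lexicons, the function name `bilingual_features`, and the choice of simple hit counts as features are all assumptions made for illustration, and the actual dictionaries and the 21 selected features are not reproduced here.

```python
import re

# Hypothetical toy lexicons (assumptions for illustration only);
# the thesis's actual sentiment dictionaries are not reproduced here.
ZH_POS = {"好用", "保濕", "喜歡"}
ZH_NEG = {"過敏", "油膩", "失望"}
EN_POS = {"good", "love", "nice"}
EN_NEG = {"bad", "sticky", "oily"}


def bilingual_features(review: str) -> dict:
    """Count sentiment-lexicon hits separately for the Chinese and
    English parts of a mixed-language review."""
    # English tokens: runs of ASCII letters, lowercased.
    en_tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", review)]
    # Chinese entries are matched as substrings, because this sketch
    # assumes no word segmenter is available.
    return {
        "zh_pos": sum(review.count(w) for w in ZH_POS),
        "zh_neg": sum(review.count(w) for w in ZH_NEG),
        "en_pos": sum(t in EN_POS for t in en_tokens),
        "en_neg": sum(t in EN_NEG for t in en_tokens),
    }


features = bilingual_features("這瓶乳液很保濕,質地 so good,但是有點 sticky")
print(features)
```

In a pipeline of the kind the abstract describes, count vectors like this one would be passed to a classifier, and a feature selection step would then prune the feature set.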
    Appears in Collections: [Graduate Institute & Department of Information Management] Thesis


    All items in the institutional repository are protected by copyright, with all rights reserved.

