Tamkang University Institutional Repository: Item 987654321/105702
    Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/105702


    Title: 針對單一領域中文意見探勘系統之研究與實作
    Other Titles: Research and implementation of single domain Chinese Opinion Mining System
    Authors: 林憲嘉;Lin, Hsien-Chia
    Contributors: Tamkang University, Department of Computer Science and Information Engineering, In-Service Master's Program
    蔣璿東
    Keywords: Opinion Mining; Chinese Opinion Mining; Single Domain
    Date: 2015
    Issue Date: 2016-01-22 15:03:04 (UTC+8)
    Abstract:   With advances in mobile devices and the spread of Internet access, more and more people post reviews and join discussions online. These posts often contain valuable information, and collecting it effectively can help the people concerned make better decisions. Collecting and analyzing the data by hand, however, takes considerable manpower and time, so developing an opinion mining system can ease this problem.
      A Chinese opinion mining system can be divided into three parts: crawling, analysis, and reporting. Designing such a system raises several problems: 1. If the crawler requests articles too frequently, the target server may temporarily refuse service. 2. New expressions keep emerging online, which makes building a complete lexicon very challenging. 3. Every domain has its own special terms, such as domain-specific sentiment words, that must be handled. 4. Because the lexicon used in a single domain is limited, useful words must be identified so that the vocabulary can be reduced and performance improved. 5. The analysis results must conform to ordinary sentence usage, so the correspondence between words must be considered to keep the results readable. 6. Because the target of a discussion is clear from its topic, people who reply habitually omit the words that name it, which makes their opinions hard to extract.
      This research discusses the problems above, proposes solutions to them, and implements a Chinese opinion mining system designed for a single domain.
    Appears in Collections:[Graduate Institute & Department of Computer Science and Information Engineering] Thesis

    Files in This Item:

    File          Description   Size   Format
    index.html                  0Kb    HTML

    All items in the Institutional Repository are protected by copyright, with all rights reserved.

