Research and implementation of single domain Chinese Opinion Mining System (針對單一領域中文意見探勘系統之研究與實作)


Please use this permanent URL to cite or link to this document: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/105702

Title:	針對單一領域中文意見探勘系統之研究與實作
Other Titles:	Research and implementation of single domain Chinese Opinion Mining System
Author:	Lin, Hsien-Chia (林憲嘉)
Contributors:	Tamkang University, Department of Computer Science and Information Engineering, In-service Master's Program; 蔣璿東
Keywords:	Opinion Mining; Chinese opinion mining; Single Domain
Date:	2015
Date Uploaded:	2016-01-22 15:03:04 (UTC+8)
Abstract:	As mobile devices improve and Internet access becomes widespread, more and more people post reviews and take part in discussions online. These posts often contain valuable information, and collecting it effectively can help the people concerned make sound decisions. Gathering and analyzing the data by hand, however, consumes a great deal of labor and time, so an opinion mining system can substantially ease the problem.

A Chinese opinion mining system can be divided into three parts: crawling, analysis, and reporting. Designing such a system raises several problems:
1. If the crawler sends requests for articles too frequently, the remote host may temporarily refuse service.
2. New expressions appear on the Internet all the time, which makes building a complete lexicon very challenging.
3. Each domain has its own special terms that must be handled.
4. Because the lexicon used in a single domain is limited, useful words must be identified so that the vocabulary can be reduced and performance improved.
5. The analysis results must follow ordinary sentence patterns, so the correspondence between words has to be considered for the results to be readable.
6. Because the topic of a discussion thread is clear, people replying habitually omit the words that name it, which makes their opinions hard to recover.

This research discusses the problems above, proposes solutions to them, and implements a Chinese opinion mining system designed for a single domain.
Appears in Collections:	[Department of Computer Science and Information Engineering] Theses


All items in the institutional repository are protected by their original copyright.
