A Study of Chinese Sentiment Analysis Applied to PTT (中文情感分析應用於PTT之研究)


Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/114434

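Title:	中文情感分析應用於PTT之研究 (A Study of Chinese Sentiment Analysis Applied to PTT)
Other Titles:	Improved Chinese sentiment analysis techniques for PTT data
Authors:	劉炅函; Liu, Gui-Han
Contributors:	Tamkang University, Department of Statistics, Master's Program (淡江大學統計學系碩士班); 陳景祥; Chen, Ching-Hsiang
Keywords:	文字探勘 (text mining); 情緒分析 (sentiment analysis); 詞彙極性 (semantic orientation); 點互信息 (pointwise mutual information, PMI)
Date:	2017
Upload Time:	2018-08-03 14:52:28 (UTC+8)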
Abstract:	Many people write articles online and communicate with one another through text, especially the younger generation. Emotions arise in these exchanges, and writers carry their own feelings into what they write to some degree, for example in how internet users view and react to a particular event or issue. National Taiwan University's PTT (批踢踢實業坊) is one of the most representative forum websites in Taiwan today. Its heavy user traffic, large number of sub-forums, distinctive system architecture, and style of user interaction produce many popular articles and novel internet catchphrases, which news media often pick up as story material. Some words in online articles carry a corresponding emotion, which may be positive, negative, or neutral; this is generally called semantic orientation (word polarity). In text mining, manual tagging is the most accurate way to label semantic orientation, but it is also the most costly. This study uses an adjusted pointwise mutual information (PMI) method, aiming to tag semantic orientation automatically. For article-level sentiment analysis, the study takes an unsupervised approach, so no labeled training articles are required: the sentiment model is built only from words with positive or negative polarity, negation words, adverbs, adjectives, and similar cues, combined with the part-of-speech patterns of sentences, in order to classify the sentiment of PTT articles.
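The abstract names two building blocks: PMI-based tagging of word polarity and a lexicon-driven, unsupervised article score that accounts for negation words and adverbs. The sketch below illustrates that general idea only. It uses the standard seed-word formulation of semantic orientation (the difference of two PMI values), not the thesis's adjusted PMI, whose details the abstract does not give. The function names `so_pmi` and `score_document`, the seed words, the smoothing constant, the two-token look-back window, and the adverb weights are all assumptions made for illustration, and the input is assumed to be already segmented into tokens.

```python
import math


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.01):
    """Seed-based semantic orientation: PMI(w, pos) - PMI(w, neg).

    docs is a list of token lists. Co-occurrence is counted at the
    document level (an assumption; a sliding window is also common).
    The 1/N factors of the probabilities cancel, leaving raw counts.
    """
    doc_sets = [set(d) for d in docs]
    pos_seeds, neg_seeds = set(pos_seeds), set(neg_seeds)

    # Number of documents containing at least one positive / negative seed.
    n_pos = sum(1 for s in doc_sets if s & pos_seeds)
    n_neg = sum(1 for s in doc_sets if s & neg_seeds)

    vocab = set().union(*doc_sets) - pos_seeds - neg_seeds
    scores = {}
    for w in vocab:
        with_pos = sum(1 for s in doc_sets if w in s and s & pos_seeds)
        with_neg = sum(1 for s in doc_sets if w in s and s & neg_seeds)
        # Smoothing keeps the ratio finite when a count is zero.
        ratio = ((with_pos + smoothing) * (n_neg + smoothing)) / (
            (with_neg + smoothing) * (n_pos + smoothing)
        )
        scores[w] = math.log2(ratio)
    return scores


def score_document(tokens, polarity, negations, intensifiers, window=2):
    """Sum word polarities, adjusted by modifiers in the preceding window.

    A negation word flips the sign; an intensifying adverb multiplies
    the value by its (hypothetical) weight.
    """
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in polarity:
            continue
        value = polarity[tok]
        for prev in tokens[max(0, i - window):i]:
            if prev in negations:
                value = -value
            elif prev in intensifiers:
                value *= intensifiers[prev]
        total += value
    return total


# Toy, hand-made corpus (not data from the thesis).
docs = [
    ["推", "好看", "讚"],
    ["好看", "讚", "神作"],
    ["爛", "難看", "噓"],
    ["難看", "噓", "雷"],
]
polarity = so_pmi(docs, pos_seeds={"讚"}, neg_seeds={"噓"})
negated = score_document(["不", "好看"], polarity, {"不"}, {"很": 1.5})
print(polarity["好看"] > 0, polarity["難看"] < 0, negated < 0)
```

On this toy corpus, a word that appears only alongside the positive seed receives a positive score, and placing a negation word before it flips the sign of the article score. A real pipeline would also need Chinese word segmentation and part-of-speech tagging, which are outside this sketch.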
Appears in Collections:	[Department and Graduate Institute of Statistics (統計學系暨研究所)] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	201

All items in the institutional repository are protected by original copyright.
