RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library & TKU Library IR team.
    Please use this permanent URL to cite or link to this item: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/114434


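    Title: 中文情感分析應用於PTT之研究 (A Study of Chinese Sentiment Analysis Applied to PTT)
    Other title: Improved Chinese sentiment analysis techniques for PTT data
    Author: 劉炅函; Liu, Gui-Han
    Contributors: Master's Program, Department of Statistics, Tamkang University
    陳景祥; Chen, Ching-Hsiang
    Keywords: text mining; sentiment analysis; semantic orientation; pointwise mutual information (PMI)
    Date: 2017
    Upload time: 2018-08-03 14:52:28 (UTC+8)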
    Abstract: Many people, especially the younger generation, write articles online and communicate with each other through text. Emotions arise during these exchanges, and writers tend to fold their own feelings into what they write, for example public opinion and sentiment about a social event or issue. National Taiwan University's PTT Bulletin Board System (批踢踢實業坊) is one of the most representative discussion forums in Taiwan today. Its heavy traffic, large number of sub-forums, distinctive system architecture, and style of user interaction generate many popular articles and new internet catchphrases, which news media frequently pick up as story material.

    Some words in online articles carry a corresponding emotion, which may be positive, negative, or neutral; this is generally called semantic orientation (word polarity). In text mining, manual tagging is the most accurate way to label semantic orientation, but also the most costly. This study uses an adjusted Pointwise Mutual Information (PMI) method to automate part of the tagging of semantic orientation. For article-level sentiment analysis, the study takes an unsupervised approach, so no labeled training articles are required. It needs only words with positive or negative polarity, negation words, adverbs, adjectives, and the like, which are combined with sentence part-of-speech patterns to build an article sentiment model and thereby classify the sentiment of PTT articles.
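As a rough illustration of PMI-based polarity tagging, the sketch below scores each word by its PMI with a set of positive seed words minus its PMI with a set of negative seed words. The abstract does not describe the thesis's adjustment to PMI, so this is the plain, unadjusted variant. The function name `so_pmi`, the `smoothing` parameter, the seed words, and the toy documents are all hypothetical, and real PTT text would first need Chinese word segmentation.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.5):
    """Return {word: score}; score > 0 suggests positive polarity.

    docs is a list of token lists. Co-occurrence is counted per document.
    This is plain PMI with additive smoothing, not the thesis's adjusted PMI.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        vocab = set(tokens)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))

    def pmi(w, s):
        # log2( P(w, s) / (P(w) * P(s)) ), smoothed so that zero
        # co-occurrence does not produce log(0).
        joint = pair_df[frozenset((w, s))] + smoothing
        return math.log2(joint * n_docs / (word_df[w] * word_df[s]))

    # Only seeds that actually occur in the corpus can be used.
    pos = [s for s in pos_seeds if word_df[s] > 0]
    neg = [s for s in neg_seeds if word_df[s] > 0]
    seeds = set(pos) | set(neg)
    return {
        w: sum(pmi(w, s) for s in pos) - sum(pmi(w, s) for s in neg)
        for w in word_df
        if w not in seeds
    }


# Toy corpus (hypothetical); each inner list is one segmented article.
docs = [
    ["good", "tasty", "love"],
    ["good", "tasty"],
    ["bad", "awful", "hate"],
    ["bad", "awful"],
    ["good", "love"],
    ["bad", "hate"],
]
scores = so_pmi(docs, pos_seeds={"good"}, neg_seeds={"bad"})
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{score:+.2f}")
```

Words that co-occur mostly with positive seeds receive positive scores and can be added to a polarity lexicon, which is the kind of input an unsupervised, rule-based article classifier would then consume.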
    Appears in collections: [Department and Graduate Institute of Statistics] Theses and Dissertations

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    54

    All items in the institutional repository are protected by original copyright.

