    Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/114434


    Title: 中文情感分析應用於PTT之研究 (A Study of Chinese Sentiment Analysis Applied to PTT)
    Other Titles: Improved Chinese sentiment analysis techniques for PTT data
    Authors: 劉炅函;Liu, Gui-Han
    Contributors: Master's Program, Department of Statistics, Tamkang University (淡江大學統計學系碩士班)
    陳景祥;Chen, Ching-Hsiang
    Keywords: text mining; sentiment analysis; semantic orientation (word polarity); pointwise mutual information (PMI)
    Date: 2017
    Issue Date: 2018-08-03 14:52:28 (UTC+8)
    Abstract: Many people, especially the younger generation, write articles online and communicate with one another through text. These exchanges give rise to emotions, and writers tend to carry their own feelings into what they write, for example their views and feelings about a particular event or issue. National Taiwan University's PTT (批踢踢實業坊) is one of the most representative forum websites in Taiwan today. Its heavy user traffic, large number of sub-forums, distinctive system architecture, and style of user interaction have produced many popular articles and new internet catchphrases, which news media often pick up as story material.

    Some words in online articles carry a corresponding emotion, which may be positive, negative, or neutral; this is generally called the word's semantic orientation (polarity). In text mining, manual tagging is the most accurate way to label semantic orientation, but it is also the most costly. This study uses an adjusted Pointwise Mutual Information (PMI) method to automate part of the labeling of semantic orientation. For article-level sentiment analysis it uses an unsupervised method, so no labeled training articles are required. The article sentiment model is built only from words with positive or negative polarity, negation words, adverbs, adjectives, and the like, combined with the part-of-speech patterns of sentences, with the aim of classifying the sentiment of PTT articles.
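The abstract does not spell out how the thesis adjusts PMI or which scoring rules it applies, so the following is only a minimal sketch of the general idea, not the author's method. It assumes the standard SO-PMI formulation (a word's PMI with positive seed words minus its PMI with negative seed words) and a simple rule in which a negation word flips, and a degree adverb scales, the sentiment word that directly follows it. The names `so_pmi` and `score_document`, the seed lists, the smoothing constant, and the adverb weights are all illustrative assumptions. The English tokens are placeholders; real PTT text would first need Chinese word segmentation.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.01):
    """Return {word: SO-PMI score} from document-level co-occurrence counts.

    docs is a list of token lists. The smoothing constant (an assumption)
    keeps unseen word pairs from producing log(0).
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        vocab = set(tokens)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))

    def pmi(a, b):
        joint = (pair_df[frozenset((a, b))] + smoothing) / n_docs
        return math.log2(joint / ((word_df[a] / n_docs) * (word_df[b] / n_docs)))

    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {}
    for word in word_df:
        if word in seeds:
            continue
        # Seeds that never occur in the corpus are skipped.
        pos = sum(pmi(word, s) for s in pos_seeds if word_df[s])
        neg = sum(pmi(word, s) for s in neg_seeds if word_df[s])
        scores[word] = pos - neg
    return scores


def score_document(tokens, polarity, negations, intensifiers):
    """Sum word polarities over a token list.

    A negation flips the sign and an intensifier scales the weight of the
    sentiment word that directly follows (possibly via other modifiers).
    Any other token resets the pending modifiers.
    """
    total = 0.0
    sign, weight = 1.0, 1.0
    for tok in tokens:
        if tok in negations:
            sign = -sign
        elif tok in intensifiers:
            weight *= intensifiers[tok]
        else:
            if tok in polarity:
                total += sign * weight * polarity[tok]
            sign, weight = 1.0, 1.0
    return total


# Toy corpus with hypothetical seed words "good" and "bad".
toy_docs = [
    ["good", "great", "upvote"],
    ["good", "upvote"],
    ["bad", "boo"],
    ["bad", "boo", "awful"],
]
toy_scores = so_pmi(toy_docs, pos_seeds=["good"], neg_seeds=["bad"])
```

A document would then be labeled positive when `score_document` returns a value above zero and negative when it returns a value below zero. The thesis additionally uses sentence part-of-speech patterns, which this sketch leaves out.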
    Appears in Collections: [Graduate Institute & Department of Statistics] Thesis

