Tamkang University Institutional Repository (淡江大學機構典藏): Item 987654321/114434
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library & TKU Library IR team.


    Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/114434


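    Title: 中文情感分析應用於PTT之研究 (A Study of Chinese Sentiment Analysis Applied to PTT)
    Other title: Improved Chinese sentiment analysis techniques for PTT data
    Author: 劉炅函; Liu, Gui-Han
    Contributors: Master's Program, Department of Statistics, Tamkang University (淡江大學統計學系碩士班)
    陳景祥; Chen, Ching-Hsiang
    Keywords: text mining (文字探勘); sentiment analysis (情緒分析); semantic orientation (詞彙極性); pointwise mutual information, PMI (點互信息)
    Date: 2017
    Uploaded: 2018-08-03 14:52:28 (UTC+8)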
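    Abstract: Many people write posts online and communicate with one another through text, especially the younger generation. Emotions arise during these exchanges, and writers inevitably work some of their own feelings into what they post, for example the views and emotions of the online public toward a particular event or issue. The PTT Bulletin Board System (批踢踢實業坊) at National Taiwan University is one of today's representative discussion-forum sites. Its heavy user traffic, large number of sub-boards, distinctive system architecture, and modes of user interaction have produced many popular posts and novel internet slang, which the media frequently use as news material. Some words in online posts carry an associated emotion, which may be positive or negative; this is generally called the word's polarity (semantic orientation). In text mining, manual annotation is the most accurate way to label word polarity, but it is also the most costly. This study adopts an adjusted PMI method with the aim of automating the labeling of word polarity. For article-level sentiment analysis, the study takes an unsupervised approach, so no labeled training articles are needed. Only words with positive or negative polarity, negation words, adverbs, and the like are required; these are combined with the part-of-speech patterns of sentences to build an article sentiment model, which is then used to classify the sentiment of articles.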
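This record does not say how the thesis adjusts PMI, so the following is only a minimal sketch of the standard, unadjusted seed-word approach: a word's polarity score is its PMI with positive seed words minus its PMI with negative seed words. The function name `so_pmi`, the `smoothing` parameter, document-level co-occurrence counting, and the toy English tokens (standing in for segmented Chinese text) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.5):
    """Score words by PMI with positive seeds minus PMI with negative seeds.

    docs is a list of token lists (already segmented). Co-occurrence is
    counted once per document. This is plain seed-word PMI, not the
    adjusted variant used in the thesis.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        vocab = set(tokens)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))

    def pmi(word, seed):
        # log2( p(word, seed) / (p(word) * p(seed)) ), with additive
        # smoothing on the joint count so that log2(0) never occurs.
        joint = pair_df[frozenset((word, seed))] + smoothing
        return math.log2(joint * n_docs / (word_df[word] * word_df[seed]))

    scores = {}
    for word in word_df:
        if word in pos_seeds or word in neg_seeds:
            continue
        # Seeds that never appear in the corpus are skipped.
        pos = sum(pmi(word, s) for s in pos_seeds if word_df[s])
        neg = sum(pmi(word, s) for s in neg_seeds if word_df[s])
        scores[word] = pos - neg
    return scores


# Toy corpus: each inner list is one segmented post (hypothetical data).
toy_docs = [
    ["service", "good", "recommend"],
    ["good", "recommend", "bargain"],
    ["service", "bad", "disappointed"],
    ["bad", "disappointed", "scam"],
]
toy_scores = so_pmi(toy_docs, pos_seeds={"good"}, neg_seeds={"bad"})
```

In this toy corpus, a word that co-occurs only with the positive seed gets a score above zero, a word that co-occurs only with the negative seed gets a score below zero, and a word that appears equally often with both scores zero. A threshold on the score would then turn it into a positive, negative, or neutral label.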
    Many people today communicate with one another by writing articles online, especially the younger generation. People show their emotions when they write, and these articles include comments on social events, issues, and so on. PTT is one of today's representative forum websites in Taiwan. Its features include heavy user traffic, many sub-forums in different categories, a distinctive system architecture, and particular ways in which users interact. As a result, PTT generates many popular articles and internet catchphrases, which are often picked up and amplified by the news media.

    Words in internet articles carry corresponding emotions, which may be categorized as positive, negative, or neutral and are referred to as semantic orientations. So far, manual tagging is the most accurate way to determine semantic orientation in text mining, but it is also the most costly. In this study, we use an adjusted Pointwise Mutual Information (PMI) method to tag semantic orientations automatically. Moreover, we use an unsupervised method for sentiment modeling, so no labeled training data is required. Using only positive and negative words, negation words, adverbs, adjectives, and the like, together with the part-of-speech patterns of sentences, we aim to classify the sentiment of PTT articles.
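As a rough illustration of the unsupervised, lexicon-based classification described above, the sketch below scores a segmented article using a polarity lexicon, a set of negation words, and a table of degree adverbs. It is not the thesis's model: the part-of-speech patterns are omitted, and the function name `classify`, the adverb weights, and the rule that a modifier affects only the next polarity word are assumptions.

```python
def classify(tokens, polarity, negations, intensifiers):
    """Label one segmented article as 'positive', 'negative', or 'neutral'.

    polarity maps a word to a signed score, negations is a set of words
    that flip the sign, and intensifiers maps a degree adverb to a
    multiplier. Modifiers apply to the next polarity word only.
    """
    score = 0.0
    weight = 1.0  # pending multiplier built from preceding modifiers
    for tok in tokens:
        if tok in negations:
            weight *= -1.0
        elif tok in intensifiers:
            weight *= intensifiers[tok]
        elif tok in polarity:
            score += weight * polarity[tok]
            weight = 1.0
        else:
            # Any other word breaks the modifier chain.
            weight = 1.0
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical lexicons, for illustration only.
toy_polarity = {"good": 1.0, "bad": -1.0}
toy_negations = {"not"}
toy_intensifiers = {"very": 2.0}

label = classify(["not", "very", "bad"], toy_polarity, toy_negations, toy_intensifiers)
```

Because no labeled articles are involved, the quality of this kind of classifier depends entirely on the lexicon, which is why automatic polarity tagging matters for the approach.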
    Appears in collections: [Department of Statistics and Graduate Institute (統計學系暨研究所)] Theses and Dissertations

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     146

    All items in the institutional repository are protected by their original copyright.
