    Please use this permanent URL to cite or link to this document: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/87962


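    Title: A Study of Habitual Terms for the Same Aspects on Two Different Forums, MOBILE01 and PTT: The Telecommunications and Broadband Boards as an Example
    Other title: MOBILE01 and PTT two different forums discussion on the same face habit words : a case study of telecommunications, broadband
    Author: Chu, Yi-Cheng (瞿怡正)
    Contributors: Tamkang University, Executive Master's Program, Department of Computer Science and Information Engineering
    蔣璿東
    Keywords: Initial Lexicon (初始詞庫); Manual Tagging (人工標註)
    Date: 2012
    Upload time: 2013-04-13 11:54:15 (UTC+8)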
    Abstract: This thesis compares two different types of forums to examine whether, within the same domain, their users share the same habitual expressions that fall outside the dictionary. It then addresses two questions: (1) whether taking the lexicon trained on one forum, the more specialized one with more posts in the domain, as the initial lexicon for the other forum improves precision and recall; and (2) whether, given such an initial lexicon, the first-stage manual tagging can be skipped to save labor, so that the system's second stage extracts opinion words and exclusion words from the articles directly. The telecommunications and broadband boards of PTT and Mobile01 are used to study these questions. The experimental data show that using the Mobile01 exclusion-word lexicon as the initial lexicon does improve the system's precision and recall. However, for commonly used opinion words outside the lexicon, the habitual expressions of PTT and Mobile01 users still differ slightly, in a small number of cases. Therefore, even when the lexicon trained on Mobile01 serves as the initial lexicon for PTT, skipping the first-stage manual tagging is not recommended if relatively high precision and recall are to be maintained. If saving labor is the priority, the first stage can be skipped and the second stage used directly to extract opinion words and exclusion words from the articles.
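The abstract evaluates lexicon reuse by precision and recall. As a rough illustration only, the sketch below shows how these two measures could be computed for a set of extracted opinion words, with and without an initial exclusion-word lexicon. The function names, the filtering step, and the toy English word lists are all assumptions made for this example; the record does not describe the thesis's actual system or data.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted words against gold labels."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def filter_candidates(candidates, exclusion_lexicon):
    """Drop candidate words listed in a (hypothetical) exclusion lexicon."""
    return [word for word in candidates if word not in exclusion_lexicon]


# Toy data, invented for illustration (not taken from the thesis).
candidates = ["fast", "slow", "router", "stable", "modem"]
gold_opinion_words = {"fast", "slow", "stable", "laggy"}
initial_exclusion_lexicon = {"router", "modem"}  # e.g. trained on another forum

baseline = precision_recall(candidates, gold_opinion_words)
seeded = precision_recall(
    filter_candidates(candidates, initial_exclusion_lexicon),
    gold_opinion_words,
)
print("without initial lexicon:", baseline)
print("with initial lexicon:   ", seeded)
```

In this toy case the exclusion lexicon removes non-opinion candidates, so precision rises while recall is unchanged; how a real system's recall responds depends on how the lexicon feeds later extraction stages, which this sketch does not model.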
    Appears in collections: [Department of Computer Science and Information Engineering and Graduate Institute] Theses

