    Please use this permanent URL to cite or link to this item: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/18325

    Title: Web document classification based on tagged-region progressive analysis
    Authors: Sung, Li-Chun; Chen, Meng-Chang; Kuo, Chin-Hwa
    Contributors: Department of Computer Science and Information Engineering, Tamkang University
    Keywords: Web categorization; Progressive analysis
    Date: 2004-12
    Uploaded: 2009-08-25 14:15:46 (UTC+8)
    Abstract: In this paper, we propose an intelligent web document classification method called TAgged-Region Progressive Analysis (TARPA). Instead of parsing the whole content of a web page to classify it, TARPA parses the document into finer structured tagged regions and extracts only the few most important regions for analysis and classification. If these regions are not sufficient to classify the document, other important regions and linked pages are analyzed progressively to improve classification performance. TARPA has two stages: a learning stage and a classification stage. The learning stage determines the relative importance of tag-pairs, and the classification stage analyzes the document by following that importance order. As a result, TARPA can classify a web document using only part of its content while achieving a higher classification rate and a shorter processing time. Experiments show that 91% of the test web documents are correctly classified when the TARPA classifier is fed only 40% to 50% of the document content.
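    Relation: Proceedings of Taipei and the International Computer Symposium 2004, pp.259-264
    Appears in Collections: [Department and Graduate Institute of Computer Science and Information Engineering] Conference Papers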
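
    The progressive analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag order `TAG_ORDER`, the keyword weights in `CATEGORY_TERMS`, the `margin` stopping rule, and the function names are all hypothetical stand-ins for what the learning stage and the real classifier would provide. Regions are consumed in tag-importance order, and classification stops as soon as one category leads by the chosen margin. Linked pages and regions whose tags are not in `TAG_ORDER` are left out of this sketch.

```python
from collections import Counter

# Hypothetical tag importance order, as if produced by a learning stage.
TAG_ORDER = ["title", "h1", "b", "a", "p"]

# Hypothetical keyword weights standing in for a trained classifier.
CATEGORY_TERMS = {
    "sports": {"football": 2.0, "league": 1.0, "score": 1.0},
    "finance": {"stock": 2.0, "market": 1.0, "bank": 1.0},
}


def score(tokens):
    """Return one score per category for the tokens seen so far."""
    counts = Counter(tokens)
    return {
        cat: sum(weight * counts[term] for term, weight in terms.items())
        for cat, terms in CATEGORY_TERMS.items()
    }


def classify_progressively(regions, margin=2.0):
    """Classify a document given as a list of (tag, text) regions.

    Returns (category or None, fraction of the document's tokens used).
    """
    total = sum(len(text.split()) for _, text in regions) or 1
    seen = []
    for tag in TAG_ORDER:
        # Add every region carrying the next most important tag.
        new_tokens = [
            token
            for region_tag, text in regions
            if region_tag == tag
            for token in text.lower().split()
        ]
        if not new_tokens:
            continue
        seen.extend(new_tokens)
        scores = score(seen)
        ranked = sorted(scores.values(), reverse=True)
        # Stop early once the best category leads the runner-up by `margin`.
        if ranked[0] - ranked[1] >= margin:
            return max(scores, key=scores.get), len(seen) / total
    return None, len(seen) / total


doc = [
    ("title", "Football league results"),
    ("p", "the stock market fell today while the bank stayed open"),
    ("h1", "final score"),
]
category, used = classify_progressively(doc)
print(category, used)
```

    In this toy document the title region alone is decisive, so the paragraph and heading text are never scored. A document with no decisive region returns `None` after all listed tags are exhausted.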




