Tamkang University Institutional Repository: Item 987654321/18325


    Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/18325


    Title: Web document classification based on tagged-region progressive analysis
    Authors: Sung, Li-Chun; Chen, Meng-Chang; Kuo, Chin-Hwa
    Contributors: Department of Computer Science and Information Engineering, Tamkang University
    Keywords: Web categorization; Progressive analysis
    Date: 2004-12
    Upload time: 2009-08-25 14:15:46 (UTC+8)
    Abstract: In this paper, we propose an intelligent web document classification method called TAgged-Region Progressive Analysis (TARPA). Instead of analyzing the whole content of a web page when classifying it, TARPA parses the document into finer-grained, structured tagged regions and extracts only the few most important regions for analysis and classification. If these few important tagged regions are not sufficient for TARPA to classify the document, other important regions and linked pages can be analyzed progressively to improve classification performance. TARPA has two stages: a learning stage and a classification stage. The learning stage determines the relative importance of tag-pairs, and the classification stage analyzes the document by following that importance order. As a result, TARPA can classify a web document from a small portion of its content while achieving a higher classification rate and a shorter processing time. Experiments show that 91% of the test web documents are classified correctly when the TARPA classifier is fed only 40% to 50% of the document contents.
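    Relation: Proceedings of Taipei and the International Computer Symposium 2004, pp. 259-264
    Appears in Collections: [Department and Graduate Institute of Computer Science and Information Engineering] Conference Papers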
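
The progressive analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The tag order `TAG_ORDER`, the keyword table `KEYWORDS`, the regex-based region extraction, and the `min_margin` stopping rule are all assumptions made for illustration. In TARPA the importance order of tag-pairs comes from the learning stage, and the abstract does not say how the classifier decides that it has seen enough. Linked pages are not covered here.

```python
import re
from collections import Counter

# Hypothetical importance order of tag pairs. In TARPA this order would be
# produced by the learning stage; here it is hard-coded for illustration.
TAG_ORDER = ["title", "h1", "b", "p"]

# Hypothetical keyword sets standing in for a trained classifier.
KEYWORDS = {
    "sports": {"football", "league", "match"},
    "finance": {"stock", "market", "bank"},
}


def tagged_regions(html, tag):
    """Return the text inside each <tag>...</tag> pair (naive, no nesting)."""
    pattern = re.compile(r"<{0}\b[^>]*>(.*?)</{0}>".format(tag), re.I | re.S)
    # Strip any inner markup so only the region's text remains.
    return [re.sub(r"<[^>]+>", " ", body) for body in pattern.findall(html)]


def classify_progressively(html, min_margin=2):
    """Analyze regions in importance order and stop as soon as one category
    leads the runner-up by at least min_margin keyword hits (assumed rule)."""
    scores = Counter()
    used = []
    for tag in TAG_ORDER:
        for text in tagged_regions(html, tag):
            for word in re.findall(r"[a-z]+", text.lower()):
                for category, vocab in KEYWORDS.items():
                    if word in vocab:
                        scores[category] += 1
        used.append(tag)
        ranked = scores.most_common(2)
        if ranked:
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            if ranked[0][1] - runner_up >= min_margin:
                return ranked[0][0], used
    # All regions consumed: return the best guess, or None if nothing matched.
    best = scores.most_common(1)
    return (best[0][0] if best else None), used
```

Calling `classify_progressively(page_html)` returns the predicted category together with the list of tags that were examined, which shows how much of the document had to be read before a decision was reached.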

    Files in this item:

    There are no files associated with this item.

