    Please use this identifier to cite or link to this item: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/18325

    Title: Web document classification based on tagged-region progressive analysis
    Authors: Sung, Li-Chun; Chen, Meng-Chang; Kuo, Chin-Hwa
    Contributors: Department of Computer Science and Information Engineering, Tamkang University
    Keywords: Web categorization; Progressive analysis
    Date: 2004-12
    Issue Date: 2009-08-25 14:15:46 (UTC+8)
    Abstract: This paper proposes an intelligent web document classification method called TAgged-Region Progressive Analysis (TARPA). Rather than parsing the whole content of a web page to classify it, TARPA divides the document into finer-grained, structured tagged regions and analyzes only a small number of the most important ones. If these regions are not sufficient to classify the document, TARPA progressively brings in other important regions and linked pages to improve classification performance. TARPA has two stages: a learning stage, which determines the relative importance of tag-pairs, and a classification stage, which analyzes the document in that order of importance. As a result, TARPA can classify a web document from a small part of its content while achieving a higher classification rate and a shorter processing time. Experiments show that 91% of the test web documents are correctly classified when the TARPA classifier is fed only 40% to 50% of the document content.
    Relation: Proceedings of Taipei and the International Computer Symposium 2004, pp.259-264
    Appears in Collections: [Graduate Institute & Department of Computer Science and Information Engineering] Proceeding
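
    The progressive classification stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword-count scoring, the `margin` stopping rule, and the names `RegionCollector`, `classify_progressively`, `tag_order`, and `keywords` are all assumptions made for illustration. The learning stage is not shown; the tag importance order is simply passed in, and the fallback to linked pages is omitted.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Tags that never have a closing partner, so they cannot form a tag-pair.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class RegionCollector(HTMLParser):
    """Collect the words found directly inside each tag-pair."""

    def __init__(self):
        super().__init__()
        self.stack = []                    # currently open tags
        self.regions = defaultdict(list)   # tag name -> list of words

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.stack:
            self.regions[self.stack[-1]].extend(data.lower().split())


def classify_progressively(html, tag_order, keywords, margin=2):
    """Score regions in importance order; stop once one class leads by `margin`.

    `tag_order` stands in for the output of the learning stage (assumed given).
    `keywords` maps each class label to a set of indicative words (assumed).
    Returns the best label and the fraction of words that were analyzed.
    """
    collector = RegionCollector()
    collector.feed(html)
    collector.close()

    total = sum(len(words) for words in collector.regions.values())
    scores = {label: 0 for label in keywords}
    used = 0

    for tag in tag_order:
        words = collector.regions.get(tag, [])
        used += len(words)
        for label, vocab in keywords.items():
            scores[label] += sum(1 for w in words if w in vocab)
        ranked = sorted(scores.values(), reverse=True)
        lead = ranked[0] - (ranked[1] if len(ranked) > 1 else 0)
        if lead >= margin:
            break  # confident enough; skip the less important regions

    best = max(scores, key=scores.get)
    fraction = used / total if total else 0.0
    return best, fraction


page = (
    "<html><head><title>football league scores</title></head>"
    "<body><h1>match report</h1>"
    "<p>the team played a long match and the crowd went home late</p>"
    "</body></html>"
)
label, fraction = classify_progressively(
    page,
    tag_order=["title", "h1", "p"],
    keywords={
        "sports": {"football", "league", "match", "team"},
        "finance": {"stock", "market", "bank"},
    },
)
print(label, round(fraction, 2))  # prints "sports 0.18"
```

    In this toy page the title region alone is enough to reach the margin, so only 3 of the 17 words are analyzed. A real classifier would replace the keyword counts with a trained model and learn the tag order from labeled data.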
