    Please use this identifier to cite or link to this item: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/44517


    Title: 利用關聯法則改善文件分類準確度-類別優先問題之探討
    Other Titles: Improving the Accuracy of Text Categorization by Using Association Rule with Class Priority
    Authors: 熊瀚升;Hsiung, Han-sheng
    Contributors: Tamkang University, Department of Computer Science and Information Engineering, In-service Master's Program
    蔣定安;Chiang, Ding-an
    Keywords: associative classification; rule ranking; rule generation; candidate term items
    Date: 2009
    Issue Date: 2010-03-16 10:36:07 (UTC+8)
    Abstract: In conventional associative classification (AC), rules are ranked [1][2] first by confidence from high to low, then by support from high to low, and finally by rule length from short to long. Shorter rules are more general and usually allow more documents to be classified, so they rank ahead of longer rules.
    This thesis focuses on the rule-ordering problem. In addition to adopting the ordering proposed by the Lazy method [3] as the general ranking principle, it adds the class priority proposed in this thesis and examines its effect on classification performance. TFIDF [4] and a Bayesian classifier [5] are combined to perform an initial classification, from which the accuracy and F1 value are computed. These figures are used to set a single threshold; to avoid disparities between classes, multiple thresholds are also set, one per class. Rules are then applied and classification is carried out in two ways: with static, fixed thresholds and with dynamically adjusted thresholds.
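The conventional ranking described in the abstract (confidence, then support, then rule length) can be sketched as a sort with a three-part key. This is a minimal illustration only, not the thesis's implementation: the `Rule` fields, the sample rules, and the first-match `classify` helper are all assumptions introduced here, and the class-priority and threshold steps are left out because the abstract does not say how they enter the ordering.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record: a term set that implies a class label."""
    terms: FrozenSet[str]   # antecedent: terms that must all occur in the document
    label: str              # consequent: the predicted class
    confidence: float
    support: float


def rank_rules(rules: Iterable[Rule]) -> List[Rule]:
    """Order rules by confidence (high to low), then support (high to low),
    then length (short to long), as in the conventional AC ranking."""
    return sorted(rules, key=lambda r: (-r.confidence, -r.support, len(r.terms)))


def classify(doc_terms: Iterable[str], ranked: List[Rule],
             default: Optional[str] = None) -> Optional[str]:
    """Assumed first-match strategy: return the label of the highest-ranked
    rule whose terms are all present in the document."""
    present = set(doc_terms)
    for rule in ranked:
        if rule.terms <= present:
            return rule.label
    return default


# Made-up rules for illustration only.
rules = [
    Rule(frozenset({"stock", "market"}), "finance", 0.90, 0.10),
    Rule(frozenset({"stock"}), "finance", 0.90, 0.10),
    Rule(frozenset({"match"}), "sports", 0.95, 0.05),
    Rule(frozenset({"market"}), "finance", 0.90, 0.20),
]
ranked = rank_rules(rules)
```

With these sample rules, the `match` rule ranks first on confidence, the `market` rule beats the two `stock` rules on support, and the single-term `stock` rule precedes the two-term rule with the same confidence and support.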
    Appears in Collections: [Department and Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations


    All items in the Institutional Repository are protected by copyright, with all rights reserved.

