English  |  正體中文  |  简体中文  |  Items with full text/Total items : 52047/87178 (60%)
Visitors : 8717935      Online Users : 144
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library & TKU Library IR team.
Scope Tips:
  • please add "double quotation mark" for query phrases to get precise results
  • please goto advance search for comprehansive author search
  • Adv. Search
    HomeLoginUploadHelpAboutAdminister Goto mobile version
    Please use this identifier to cite or link to this item: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/34130

    Title: CHSPM:一個完整的混合循序樣式探勘演算法
    Other Titles: Chspm: a complete hybrid sequential patterns mining algorithm
    Authors: 原孝任;Yuan, Hsiau-ren
    Contributors: 淡江大學資訊管理學系碩士班
    周清江;Jou, Chichang
    Date: 2005
    Issue Date: 2010-01-11 04:57:17 (UTC+8)
    Abstract: 現有循序樣式探勘的研究依照樣式中相連的項目是否必須在交易紀錄中緊密相連可粗略的分為以下三類,第一類為找出連續循序樣式;第二類為找出非連續循序樣式;第三類為找出混合循序樣式。過去混合循序樣式探勘的演算法都以Apriori為基礎,但這些方法探勘出的結果並不完整,所以我們針對混合循序樣式探勘,以樣式成長(pattern-growth)方法為基礎,提出一個新的演算法CHSPM(A Complete Hybrid Sequential Patterns Mining Algorithm),以窮舉法來找出完整之混合循序樣式。
    CHSPM演算法有以下四個步驟,分別為:1.產生增補一階頻繁樣式;2. 縮減資料庫;3. 分割資料庫,建立投影資料庫;4. 探勘投影資料庫,建立子投影資料庫,直到找出所有的混合循序樣式。
    為了驗證CHSPM的探勘結果,我們使用10萬至30萬筆的模擬資料來進行實驗,並與過去探勘混合循序樣式效率最佳的GFP2 演算法比較。實驗結果顯示,雖然CHSPM在效能上不如GFP2,但可以探勘出完整的混合循序樣式。
    Based on whether consecutive items in sequential patterns should also be consecutive in the transactions, existing researches about sequential pattern mining could be classified into the following three categories: The first is to find continuous patterns; the second is to find discontinuous patterns; the third is to find hybrid patterns that combine both continuous patterns and discontinuous patterns. Previous hybrid sequential pattern mining algorithms were all based on the Apriori algorithm, but we discovered that their mining results are incomplete. Thus, based on the pattern-growth method, we propose a new algorithm (CHSPM) to find complete hybrid sequential patterns.
    The four steps of CHSPM are as follows: 1. Build the supplemented frequent-1-sequence item set; 2. Reduce the database by erasing unimportant items from the transactions. 3. Partition the database, and build projected databases. 4. Recursively mine the projected databases and build sub-projected databases until all hybrid sequential patterns are found.
    Finally, we use synthetic databases of 100,000 to 300,000 records to test our algorithm, and to compare our results with those of GFP2, the most efficient algorithm in hybrid sequential pattern mining up to now. The result shows that even though CHSPM is slower than GFP2, it can find out complete hybrid sequential patterns.
    Appears in Collections:[資訊管理學系暨研究所] 學位論文

    Files in This Item:

    File SizeFormat

    All items in 機構典藏 are protected by copyright, with all rights reserved.

    DSpace Software Copyright © 2002-2004  MIT &  Hewlett-Packard  /   Enhanced by   NTU Library & TKU Library IR teams. Copyright ©   - Feedback