Tamkang University Institutional Repository: Item 987654321/34130


    Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/34130


    Title: CHSPM:一個完整的混合循序樣式探勘演算法
    Other Titles: CHSPM: A Complete Hybrid Sequential Patterns Mining Algorithm
    Author: 原孝任; Yuan, Hsiau-ren
    Contributors: Master's Program, Department of Information Management, Tamkang University
    周清江; Jou, Chichang
    Date: 2005
    Upload Time: 2010-01-11 04:57:17 (UTC+8)
    Abstract: Existing research on sequential pattern mining can be roughly divided into three categories, according to whether items that are adjacent in a pattern must also be adjacent in the transaction records: mining continuous sequential patterns, mining discontinuous sequential patterns, and mining hybrid sequential patterns, which combine continuous and discontinuous patterns. Previous hybrid sequential pattern mining algorithms were all based on the Apriori algorithm, but the results they produce are incomplete. We therefore propose CHSPM (A Complete Hybrid Sequential Patterns Mining Algorithm), a new algorithm based on the pattern-growth method that uses exhaustive enumeration to find the complete set of hybrid sequential patterns.
    CHSPM has four steps: 1. Build the supplemented frequent-1-sequence item set. 2. Reduce the database by erasing unimportant items from the transactions. 3. Partition the database and build projected databases. 4. Recursively mine the projected databases, building sub-projected databases, until all hybrid sequential patterns are found.
    To verify CHSPM's mining results, we ran experiments on synthetic databases of 100,000 to 300,000 records and compared the results with those of GFP2, the most efficient previous algorithm for hybrid sequential pattern mining. The results show that although CHSPM is slower than GFP2, it finds the complete set of hybrid sequential patterns.
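    The distinction the abstract draws between continuous, discontinuous, and hybrid patterns can be illustrated with a small support-counting sketch. This is not the CHSPM algorithm itself, only a brute-force matcher under stated assumptions: each transaction is a plain list of single items, a pattern is a list of (link, item) pairs, and a "gap" link allows zero or more intervening items. The names ADJ, GAP, matches, and support are hypothetical and do not come from the thesis.

```python
from typing import List, Sequence, Tuple

# Hypothetical link labels (not from the thesis).
ADJ = "adjacent"  # item must immediately follow the previously matched item
GAP = "gap"       # assumption: zero or more other items may intervene

# A pattern is a list of (link, item) pairs; the first pair's link is ignored.
Pattern = List[Tuple[str, str]]


def matches(pattern: Pattern, transaction: Sequence[str]) -> bool:
    """Return True if the hybrid pattern occurs somewhere in the transaction."""
    if not pattern:
        return True

    def match_from(p_idx: int, t_idx: int) -> bool:
        # t_idx is the transaction position where pattern[p_idx - 1] matched.
        if p_idx == len(pattern):
            return True
        link, item = pattern[p_idx]
        if link == ADJ:
            nxt = t_idx + 1
            return (
                nxt < len(transaction)
                and transaction[nxt] == item
                and match_from(p_idx + 1, nxt)
            )
        # GAP link: try every later position, backtracking on failure.
        return any(
            transaction[j] == item and match_from(p_idx + 1, j)
            for j in range(t_idx + 1, len(transaction))
        )

    first_item = pattern[0][1]
    return any(
        transaction[i] == first_item and match_from(1, i)
        for i in range(len(transaction))
    )


def support(pattern: Pattern, database: Sequence[Sequence[str]]) -> int:
    """Count the transactions that contain the pattern."""
    return sum(1 for t in database if matches(pattern, t))


# Toy database of three transactions.
db = [list("abcd"), list("acbd"), list("abd")]

continuous = [(GAP, "a"), (ADJ, "b"), (ADJ, "d")]     # a b d, all adjacent
discontinuous = [(GAP, "a"), (GAP, "b"), (GAP, "d")]  # a .. b .. d
hybrid = [(GAP, "a"), (ADJ, "b"), (GAP, "d")]         # a b adjacent, then d later
```

    On this toy database the continuous pattern is contained only in the third transaction, the discontinuous pattern in all three, and the hybrid pattern in the first and third. A complete miner has to enumerate every combination of link types, which is the search space the abstract says CHSPM covers exhaustively.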
    Appears in Collections: [Department and Graduate Institute of Information Management] Theses and Dissertations

    Files in This Item:

    File    Size    Format    Views
    (unnamed)    0Kb    Unknown    509

    All items in the institutional repository are protected by their original copyright.
