Algorithms for negative sequential pattern mining and fuzzy correlation rule mining

Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/35085

Title:	Algorithms for negative sequential pattern mining and fuzzy correlation rule mining
Alternative title:	負相序列型樣與模糊相關規則之探勘方法
Author:	陳宏任; Chen, Hung-jen
Contributors:	Doctoral Program, Department of Computer Science and Information Engineering, Tamkang University; 林丕靜; Lin, Nancy Pei-ching
Keywords:	data mining; negative sequential pattern; correlation rule; frequent pattern
Date:	2008
Uploaded:	2010-01-11 05:59:38 (UTC+8)
Abstract:	Rapid developments in information technology and automatic data collection tools have led to large amounts of data being collected and stored in various data repositories. Extracting valuable information from these data has become key to improving business competitiveness. Data mining offers ways to automatically find nontrivial, previously unknown, and potentially useful knowledge in large databases, and frequent pattern mining plays an essential role in it. Many methods have been proposed for discovering various types of frequent patterns, such as frequent itemsets, association rules, correlation rules, and sequential patterns.
In this dissertation, we propose three algorithms, one for each of three new types of frequent patterns: negative sequential patterns, negative fuzzy sequential patterns, and fuzzy correlation rules. Unlike conventional sequential pattern mining, our algorithm for mining negative sequential patterns considers not only the itemsets that occur in the transactions of a database but also those that are absent. Its candidate generation procedure employs the apriori principle to eliminate many redundant candidates during mining. We also define an interestingness measure based on conditional probability in order to find the more interesting negative sequential patterns.
Because most transaction data in real-world applications consist of quantitative values, we propose a second algorithm that combines fuzzy set theory with the negative sequential pattern concept. It fuzzifies quantitative items into fuzzy items and mines negative fuzzy sequential patterns from quantitative databases. Furthermore, we propose a method for mining fuzzy correlation rules, which applies fuzzy correlation analysis to determine whether two fuzzy sub-itemsets of a fuzzy itemset are dependent, and then extracts the more interesting fuzzy correlation rules from quantitative databases.
Experiments show that the three proposed algorithms prune many redundant candidate itemsets and candidate sequences during mining and extract a more compact set of frequent patterns that are actually interesting.
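The apriori principle mentioned in the abstract states that a pattern can be frequent only if all of its sub-patterns are frequent, so any candidate with an infrequent sub-pattern can be discarded before it is counted. The following Python sketch illustrates that pruning step for plain itemsets only. It is not the dissertation's candidate generation procedure, which operates on sequences with negative elements and is not described in this record; the function name `generate_candidates` and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Join frequent k-itemsets into (k+1)-item candidates, then prune.

    Assumes every itemset in `frequent_k` is a frozenset of the same size k.
    A candidate survives only if each of its k-item subsets is frequent
    (the apriori principle).
    """
    frequent_k = set(frequent_k)
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            # Join step: the two itemsets must differ in exactly one item.
            if len(union) != len(a) + 1:
                continue
            # Prune step: every k-item subset of the candidate must be frequent.
            if all(frozenset(s) in frequent_k
                   for s in combinations(union, len(a))):
                candidates.add(union)
    return candidates


# Toy input (hypothetical): the frequent 2-itemsets of some database.
frequent_2 = {frozenset(p) for p in [("a", "b"), ("a", "c"),
                                     ("b", "c"), ("b", "d")]}
candidates_3 = generate_candidates(frequent_2)
print(candidates_3)
```

Here `{a, b, d}` and `{b, c, d}` are joined but then pruned, because `{a, d}` and `{c, d}` are not frequent; only `{a, b, c}` remains to be counted against the database.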
Appears in collections:	[Department of Computer Science and Information Engineering] Theses and Dissertations
