Efficient approaches for mining customer behaviors


Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/87548

Title:	Efficient approaches for mining customer behaviors
Alternative title:	有效率的探勘客戶消費行為方法的研究 (A study of efficient methods for mining customer consumption behaviors)
Author:	王秋光; Wang, Chiu-Kuang
Contributors:	Tamkang University, Department of Management Sciences, Doctoral Program; 顏秀珍; 歐陽良裕
Keywords:	data mining; frequent itemset; frequent closed itemset; data stream; consumption behaviors
Date:	2012
Uploaded:	2013-04-13 11:24:59 (UTC+8)
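Abstract:	With digitization, data-rich databases have become commonplace, and extracting important information from them is the main task of data mining. In business applications, a transaction database can be analyzed to find products that are often purchased together, as well as associations in which customers who buy certain products are likely to buy others; that is, to mine frequent itemsets. In this thesis, we propose a highly efficient algorithm for mining frequent itemsets that outperforms previous work in both execution time and memory usage. However, new transactions are generated continuously and old transactions must be removed, and re-mining the original data wastes time rediscovering known information. An environment in which new data keep arriving and old data keep being removed is called a data stream, and researchers have therefore begun to study how to find all frequent itemsets in a data stream. In addition, the frequent itemsets in transaction data may be so numerous that the volume of information makes decision making impractical, so researchers proposed closed itemsets. An itemset is closed if its support is greater than that of every one of its proper supersets. All frequent itemsets can be derived from the frequent closed itemsets. We also propose an efficient algorithm for the case where data are continuously added or removed: the original database need not be re-read, and the updated closed itemsets are produced by operating only on the added or removed data together with the existing closed itemsets. Consumable products are usually purchased very frequently relative to other products. Although the profit on a single sale may be lower than that of home appliances or electronics, the accumulated profit is substantial, so timing promotions for consumable products correctly can contribute greatly to profit. Frequent itemsets, however, provide no information about promotion timing. We therefore propose a novel data mining method that, for a given consumable product, finds the consumption behaviors of customers with different characteristics. Based on a customer's background attributes and the quantity purchased this time, this consumption behavior can be used to predict accurately when the customer will need the product again, and thus when to market the product to that customer.
Mining frequent itemsets means discovering the groups of items that appear together more often than a user-specified threshold. Many approaches have been proposed that mine frequent itemsets using the FP-tree structure and improve on the efficiency of the FP-Growth algorithm, which needs to construct sub-trees recursively. Although these approaches do not need to construct many sub-trees recursively, they still suffer from a large search space, so their performance degrades when the database is massive or the support threshold is low. To reduce the search space and speed up the mining process, we propose an efficient algorithm for mining frequent itemsets based on the frequent pattern tree. Our algorithm generates a sub-tree for each frequent item and then generates candidates in batches from this sub-tree. Each candidate generation step produces only a small set of candidates, which significantly reduces the search space. However, a transaction database may contain so many frequent itemsets that a decision maker finds it difficult to reach a decision.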
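To make the notion of a frequent itemset concrete, here is a minimal Python sketch. It is not the FP-tree-based algorithm proposed in the thesis, whose details the abstract does not give; it is a naive level-wise enumeration that only illustrates the definition. The function name `frequent_itemsets`, the toy `TRANSACTIONS` data, and the use of an absolute count as the threshold are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy transaction database (not taken from the thesis).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset contained in at least
    `min_support` transactions (absolute count, an assumption here)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, k):
            itemset = frozenset(candidate)
            # Support = number of transactions that contain the whole itemset.
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                result[itemset] = count
                found_any = True
        # If no k-itemset is frequent, no larger itemset can be frequent,
        # because every subset of a frequent itemset is itself frequent.
        if not found_any:
            break
    return result


frequent = frequent_itemsets(TRANSACTIONS, min_support=3)
# {bread, milk} is contained in 3 of the 5 transactions, so it is frequent.
print(frequent[frozenset({"bread", "milk"})])
```

This sketch tests every combination of items at each level, which is exactly the kind of large search space the abstract says the proposed algorithm is designed to avoid.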
Recently, mining frequent closed itemsets has become a major research issue, since the set of frequent closed itemsets is a condensed and complete representation of the frequent itemsets: all frequent itemsets can be derived from the frequent closed itemsets. The transactions in a transaction database can grow rapidly in a short time, and some of them may become outdated. Consequently, the frequent closed itemsets may change when new transactions are added to the database or old transactions are deleted from it. Updating the previous closed itemsets when transactions are added or removed is a challenge. We propose an efficient algorithm for incrementally mining closed itemsets without scanning the original database. Our algorithm updates the closed itemsets by performing operations on the previous closed itemsets and the added or deleted transactions, without searching the previous closed itemsets. Compared with other commodities, consumable products are purchased very frequently. Although the profit from a single sale of a consumable product may be lower than that of appliances or electronic products, the cumulative profit is large. Therefore, choosing suitable timing for sales promotions of consumable products is an important task. Sequential pattern mining considers only the sequential purchasing behaviors of most customers; it cannot predict when a customer will need a product in the future. For consumable products, the time of the next purchase is usually related to the quantity purchased in the current transaction. We propose a novel data mining algorithm to find the consumption behaviors of most customers. From this information, we can predict the next purchase time for an item based on the quantity of that item purchased this time.
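The claim that frequent closed itemsets are a condensed yet complete representation can be illustrated with a short Python sketch. It is a brute-force illustration of the definition only, not the incremental update algorithm proposed in the thesis. The names `closed_frequent_itemsets` and `derive_support` and the toy `TRANSACTIONS` data are hypothetical. An itemset is treated as closed when no proper superset has the same support, and the support of any frequent itemset is recovered as the largest support among the closed itemsets that contain it.

```python
from itertools import combinations

# Hypothetical toy transaction database (not taken from the thesis).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"c", "d"},
]


def closed_frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for the frequent itemsets that are closed,
    i.e. whose every proper superset has strictly smaller support."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                frequent[itemset] = count
    # A superset with the same support as a frequent itemset is itself
    # frequent, so comparing against frequent supersets is sufficient.
    return {
        x: sx
        for x, sx in frequent.items()
        if not any(x < y and sx == sy for y, sy in frequent.items())
    }


def derive_support(itemset, closed):
    """Recover the support of `itemset` from the closed itemsets alone.
    Returns None when no closed frequent itemset contains it, which means
    the itemset is not frequent."""
    return max((s for c, s in closed.items() if itemset <= c), default=None)


closed = closed_frequent_itemsets(TRANSACTIONS, min_support=2)
# {a} is frequent but not closed: {a, b} occurs in exactly the same
# transactions, so only {a, b} is kept and {a} is derived from it.
print(derive_support(frozenset({"a"}), closed))
```

In this toy example five itemsets are frequent but only three are closed, which shows on a small scale why the closed representation is easier for a decision maker to inspect.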
Appears in collections:	[Department of Management Sciences and Graduate Institute] Theses and dissertations

All items in the institutional repository are protected by their original copyright.
