Frequent itemset mining discovers groups of items whose frequency of appearing together in a transaction database exceeds a user-specified threshold. Many approaches apply the FP-tree structure to improve the efficiency of the FP-Growth algorithm, which must recursively construct sub-trees. Although these approaches avoid recursively constructing many sub-trees, they still suffer from a large search space, so their performance degrades when the database is massive or the mining threshold is low. To reduce the search space and speed up the mining process, we propose an efficient algorithm for mining frequent itemsets based on the frequent pattern tree. Our algorithm generates a sub-tree for each frequent item and then generates candidates in batches from this sub-tree. Each candidate generation step produces only a small set of candidates, which significantly reduces the search space.
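To make the notion of a frequent itemset concrete, the following minimal Python sketch enumerates itemsets by brute force and keeps those whose support count reaches a threshold. It is not the FP-tree-based algorithm proposed here; it only illustrates the definition and why the search space grows quickly. The names `support_count`, `frequent_itemsets`, and `min_support`, the sample transactions, and the choice of "at least `min_support`" as the comparison are assumptions made for illustration.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration, level by level (illustrative only).

    Assumption: an itemset is frequent when its support count is
    at least `min_support`.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support_count(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
                found = True
        if not found:
            # If no itemset of this size is frequent, no larger one can be,
            # because support never increases when items are added.
            break
    return result


# Hypothetical toy database of four transactions.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
]
frequent = frequent_itemsets(transactions, min_support=2)
```

The enumeration tests every combination of items at each level, which is the kind of large candidate space that tree-based methods aim to avoid.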
However, a transaction database may contain so many frequent itemsets that a decision maker finds it difficult to act on them. Mining frequent closed itemsets has recently become a major research issue, since the set of frequent closed itemsets is a condensed yet complete representation of the frequent itemsets, and all frequent itemsets can be derived from it. The transactions in a database can grow rapidly in a short time, and some of them may become outdated. Consequently, the frequent closed itemsets may change when new transactions are added to the database or old transactions are deleted from it. Updating the previously mined closed itemsets under such additions and deletions is a challenge. We propose an efficient algorithm for incrementally mining closed itemsets without scanning the original database. Our algorithm updates the closed itemsets by performing operations on the previous closed itemsets and the added or deleted transactions, without searching the previous closed itemsets.
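The claim that closed itemsets are a condensed and complete representation can be illustrated with a short Python sketch. It uses the standard definition (a frequent itemset is closed when no proper superset has the same support) and recovers the support of any frequent itemset as the largest support among its closed supersets. This is not the incremental update algorithm proposed here; the function names and the sample supports are hypothetical.

```python
def closed_itemsets(frequent):
    """Keep the itemsets that have no proper superset with equal support."""
    closed = {}
    for itemset, count in frequent.items():
        has_equal_superset = any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
        if not has_equal_superset:
            closed[itemset] = count
    return closed


def support_from_closed(itemset, closed):
    """Support of a frequent itemset, derived from closed itemsets only.

    Returns None when no closed frequent itemset contains `itemset`,
    which means the itemset is not frequent.
    """
    counts = [count for s, count in closed.items() if itemset <= s]
    return max(counts) if counts else None


# Hypothetical frequent itemsets with their support counts.
frequent = {
    frozenset({"bread"}): 3,
    frozenset({"butter"}): 2,
    frozenset({"milk"}): 3,
    frozenset({"bread", "butter"}): 2,
    frozenset({"bread", "milk"}): 2,
}
closed = closed_itemsets(frequent)
```

In this toy example `{butter}` is dropped because `{bread, butter}` has the same support, yet its support can still be recovered from the closed set, so nothing is lost by storing only the four closed itemsets.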
Compared with other commodities, consumable products are purchased frequently. Although the gain from a single purchase of a consumable product may be lower than that of an appliance or electronic product, the cumulative gain from consumable products is large. Choosing a suitable time to run sales promotions for consumable products is therefore an important task. Sequential pattern mining considers only the sequential purchasing behavior of most customers; it cannot predict when a customer will need a product in the future. For consumable products, the time of the next purchase is usually related to the quantity purchased in the current transaction. We propose a novel data mining algorithm to find the consumption behavior of most customers. From this information, we can predict the next purchase time of an item based on the quantity of that item purchased in the current transaction.
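The link between purchased quantity and the time of the next purchase can be sketched with a deliberately simple model. The Python sketch below assumes a constant consumption rate for one customer and one item, estimates the number of days each unit lasts as the median over past purchases, and projects the next purchase day from the latest quantity. It is not the mining algorithm proposed here; the constant-rate assumption, the function names, and the sample history are all hypothetical.

```python
from statistics import median


def days_per_unit(history):
    """Median number of days one purchased unit lasts.

    `history` is a list of (day, quantity) pairs sorted by day.
    Returns None when fewer than two purchases are available.
    """
    rates = [
        (next_day - day) / quantity
        for (day, quantity), (next_day, _) in zip(history, history[1:])
        if quantity > 0
    ]
    return median(rates) if rates else None


def predict_next_purchase_day(history):
    """Project the next purchase day from the most recent quantity."""
    rate = days_per_unit(history)
    if rate is None:
        return None
    last_day, last_quantity = history[-1]
    return last_day + rate * last_quantity


# Hypothetical purchase history: (day index, quantity bought).
history = [(0, 2), (14, 1), (21, 3), (42, 2)]
predicted_day = predict_next_purchase_day(history)
```

The median is used rather than the mean so that a single unusually long or short gap has less influence on the estimate.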