Many applications of sequential patterns require a guarantee that a particular event will happen within a given period of time. We propose CAI-PrefixSpan, a new data mining algorithm that obtains confident timed sequential patterns from sequential databases. Based on PrefixSpan, it takes advantage of the pattern-growth approach. Given a particular event sequence, it first calculates the confidence level of the eventual occurrence of a particular event. For those events that pass the minimal confidence requirement, it then computes the minimal time interval that satisfies the support requirement. It then generates the corresponding projected databases and applies itself recursively to them. With the timing information, it obtains fewer but more confident sequential patterns. CAI-PrefixSpan is implemented alongside PrefixSpan, and the two are compared in terms of the number of patterns obtained and execution efficiency. Our effectiveness and performance study shows that CAI-PrefixSpan is a valuable and efficient approach to obtaining timed sequential patterns.
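To make the confidence-then-interval step more concrete, the following is a minimal Python sketch of a single extension step. It is not the CAI-PrefixSpan implementation: the abstract does not give formal definitions, so the sketch assumes that confidence is the fraction of prefix-containing sequences in which the event later occurs, and that the minimal time interval is the smallest gap covering at least a given number of sequences. The names `first_match_end`, `timed_extension`, `min_conf`, and `min_sup_count`, as well as the `(timestamp, event)` data layout, are hypothetical. Projected-database generation and the recursion are omitted.

```python
def first_match_end(seq, prefix):
    """Index in seq where the earliest match of prefix (as a subsequence) ends.

    seq is a list of (timestamp, event) pairs; prefix is a non-empty list of
    events. Returns None if the prefix does not occur in seq.
    """
    pos = 0
    for i, (_, ev) in enumerate(seq):
        if ev == prefix[pos]:
            pos += 1
            if pos == len(prefix):
                return i
    return None


def timed_extension(db, prefix, event, min_conf, min_sup_count):
    """Assumed one-step check: confidence first, then minimal time interval.

    Returns (confidence, interval), or None if either requirement fails.
    """
    gaps = []       # time from end of prefix to first later occurrence of event
    n_prefix = 0    # number of sequences that contain the prefix
    for seq in db:
        end = first_match_end(seq, prefix)
        if end is None:
            continue
        n_prefix += 1
        t_end = seq[end][0]
        for t, ev in seq[end + 1:]:
            if ev == event:
                gaps.append(t - t_end)
                break
    if n_prefix == 0:
        return None
    confidence = len(gaps) / n_prefix
    if confidence < min_conf:
        return None              # event does not follow the prefix reliably
    if min_sup_count < 1 or len(gaps) < min_sup_count:
        return None              # no interval can reach the required support
    gaps.sort()
    # Smallest interval within which at least min_sup_count sequences
    # contain the event after the prefix.
    return confidence, gaps[min_sup_count - 1]


# Toy database with illustrative data only.
db = [
    [(0, "a"), (2, "b"), (5, "c")],
    [(0, "a"), (4, "b")],
    [(1, "a"), (2, "c"), (9, "b")],
    [(0, "c"), (3, "a")],
]
result = timed_extension(db, ["a"], "b", min_conf=0.7, min_sup_count=2)
print(result)
```

In this toy database, "b" follows "a" in three of the four sequences that contain "a", with gaps of 2, 4, and 8 time units, so under the assumed definitions the step reports a confidence of 0.75 and an interval of 4 for a support count of 2.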
Relation:
Proceedings of the 15th International Conference on Information Reuse and Integration