Sequential pattern mining is an important subfield of data mining. Recently, applications that use time interval-based event data have prompted considerable effort to discover patterns from events that persist for some duration. Because the relationship between two intervals is intrinsically complex, mining interval-based sequences both effectively and efficiently is a challenging problem. In this paper, we propose two novel representations, the endpoint representation and the endtime representation, to simplify the processing of complex relationships among event intervals. Based on these representations, we define three types of interval-based patterns: temporal patterns, occurrence-probabilistic temporal patterns, and duration-probabilistic temporal patterns. In addition, we develop two novel algorithms, Temporal Pattern Miner (TPMiner) and Probabilistic Temporal Pattern Miner (P-TPMiner), to discover these three types of interval-based sequential patterns. We also propose three pruning techniques to further reduce the search space of the mining process. Experimental studies show that both algorithms find the three types of patterns efficiently. Furthermore, we apply the proposed algorithms to real datasets to demonstrate their effectiveness and to validate the practicability of the proposed patterns.
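As an illustration of why an endpoint-style encoding can simplify interval relationships, the Python sketch below flattens a set of event intervals into an ordered sequence of start and finish endpoints. It is only one plausible reading of the idea, not the paper's definition: the function name `to_endpoint_sequence`, the `+`/`-` suffixes, the grouping of simultaneous endpoints, and the sample intervals are all assumptions made for illustration.

```python
from collections import defaultdict


def to_endpoint_sequence(intervals):
    """Flatten (label, start, end) intervals into time-ordered endpoint groups.

    Hypothetical helper: each interval contributes a start endpoint "label+"
    and a finish endpoint "label-". Endpoints that share a timestamp are
    grouped into one tuple, so the pairwise relation between two intervals
    can be read off the order of their endpoints.
    """
    by_time = defaultdict(list)
    for label, start, end in intervals:
        if start >= end:
            raise ValueError(f"interval {label!r} must satisfy start < end")
        by_time[start].append(label + "+")
        by_time[end].append(label + "-")
    # Sort by timestamp; sort labels within a group for a canonical form.
    return [tuple(sorted(by_time[t])) for t in sorted(by_time)]


# Assumed sample data: A overlaps B, A meets C, and B and C finish together.
events = [("A", 1, 5), ("B", 3, 8), ("C", 5, 8)]
sequence = to_endpoint_sequence(events)
print(sequence)
```

Under these assumptions, a relation such as "A overlaps B" shows up as the endpoint order A+, B+, A-, B-, so an ordinary sequence comparison stands in for a case-by-case test of interval relations.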
IEEE Transactions on Knowledge and Data Engineering 27(12), pp.3318-3331