The experimental results show that the clusters produced in this thesis do gather the pages with high browsing relevance for users into the same cluster. Compared with the results of Association Rule Hypergraph Partition, the clustering produced in this thesis is more accurate and of better data quality.
Analyzing and understanding how users browse a web site is an important issue in web site development; however, this capability is seldom an integral part of the design process when a web site is built. Adding such a capability to an existing, running web server is challenging because of the engineering effort involved in modifying a potentially large number of web pages.
This thesis uses an ISAPI filter to inject cookies into HTTP transactions in order to identify individual users. This method can be applied to an existing system with minor modifications. The main goal of data clustering is to partition a data set into clusters so that the data in each cluster share some common trait. This thesis proposes a method that clusters data items based on the large itemsets produced by association rule analysis, instead of a commonly used distance measure.
Empirical data are collected from an existing web server, and the resulting clusters are analyzed and compared with those of the commonly used “Association Rule Hypergraph Partitioning” method. The experiment shows that the proposed method yields more pertinent results than “Association Rule Hypergraph Partitioning” and, at the same time, prunes infrequent data items.
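
The idea of clustering pages by large itemsets rather than by a distance measure can be illustrated with a minimal sketch. It is not the algorithm of this thesis: the function names, the support threshold, the sample sessions, and the merge rule (pages that appear together in any large itemset are placed in the same cluster) are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def large_itemsets(sessions, min_support, max_size=3):
    """Return page sets of size 2..max_size seen in at least min_support sessions.

    Brute-force counting, adequate only for small illustrative inputs.
    """
    counts = Counter()
    for pages in sessions:
        unique = sorted(set(pages))
        for size in range(2, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {itemset for itemset, n in counts.items() if n >= min_support}


def cluster_by_itemsets(itemsets):
    """Group pages that share a large itemset (assumed merge rule, via union-find)."""
    parent = {}

    def find(page):
        parent.setdefault(page, page)
        while parent[page] != page:
            parent[page] = parent[parent[page]]  # path compression
            page = parent[page]
        return page

    for itemset in itemsets:
        first, *rest = sorted(itemset)
        for other in rest:
            parent[find(other)] = find(first)

    groups = {}
    for page in list(parent):
        groups.setdefault(find(page), set()).add(page)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical per-user sessions, e.g. reconstructed from cookie-tagged requests.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/products", "/cart"],
    ["/about", "/contact"],
    ["/about", "/contact"],
    ["/home", "/rare"],
]

clusters = cluster_by_itemsets(large_itemsets(sessions, min_support=2))
print(clusters)
```

In this sketch, a page such as `/rare` that never appears in a large itemset does not enter any cluster, which mirrors the pruning of infrequent data items mentioned above.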