Data Clustering Based on Results of Association Rules: With Analysis of Web Browsing Patterns Application


Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/35273

Title:	Data clustering based on results of association rules : with analysis of web browsing patterns application (original title: 以關連性法則分析結果為基礎的資料分群法 : 應用在網頁瀏覽紀錄分析)
Other titles:	以關連分析結果為基礎的資料分群法 : 應用在網頁瀏覽紀錄分析 (variant Chinese title)
Author:	周世寧; Chou, Shih-ning
Contributors:	Tamkang University, Master's Program, Department of Computer Science and Information Engineering; 蔡憶佳
Keywords:	association rule; clustering algorithm; hypergraph partitioning; web mining; web usage mining; ISAPI; cookie
Date:	2005
Uploaded:	2010-01-11 06:17:46 (UTC+8)
Abstract:	Analyzing how users browse a web site is important, even necessary, for the site's operators. In practice, however, this need is seldom considered when a site is first built, so adding an analysis system to an existing, running site is difficult: it may mean modifying a potentially large number of web pages. In web log analysis, for example, determining which log records come from the same user is a major challenge. This thesis proposes using an ISAPI filter to inject cookies into HTTP transactions so that individual users can be identified. The approach balances practical feasibility with accuracy and leaves the already developed application code unchanged, so it can be applied to an existing system with only minor modifications.

The main goal of data clustering is to partition a data set into clusters so that the data in each cluster share some common trait. Earlier work applying association rules to web mining has mostly focused on finding associations between pages in order to generate meaningful rules. This thesis instead clusters data items based on the large itemsets produced by association rule analysis, rather than on a commonly used distance measure. Closely associated itemsets are merged, and itemsets whose members are only loosely related are excluded, which yields clusters whose members are highly associated. Weakly associated (infrequent) data items are pruned from the clusters in the process, which keeps data quality consistent. The resulting clusters therefore have different characteristics from those produced by traditional distance-based clustering.

Empirical data were collected from an existing web server, and the resulting clusters were analyzed and compared with those of the commonly used Association Rule Hypergraph Partitioning method. The experiments show that the proposed method does group pages with highly related browsing into the same cluster. Compared with Association Rule Hypergraph Partitioning, its clusters are more accurate (more pertinent) and of better data quality, and the method prunes infrequent data items at the same time.
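The abstract does not give the details of the algorithm, so the following Python sketch only illustrates the general idea under stated assumptions. It groups log records by a cookie-borne visitor ID, finds frequent ("large") itemsets by brute force, and merges itemsets that share a page into clusters. The function names, the `(visitor_id, page)` record layout, the `min_support` threshold, and the "share at least one page" merge rule are all hypothetical choices made for illustration and are not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def sessions_by_cookie(log_records):
    """Group requested pages by the visitor ID carried in the cookie.

    Assumption: each record is a (visitor_id, page) pair.
    """
    sessions = defaultdict(set)
    for visitor_id, page in log_records:
        sessions[visitor_id].add(page)
    return list(sessions.values())


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force search for itemsets of 2..max_size pages that occur
    together in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    found = []
    for size in range(2, max_size + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                found.append(cand)
    return found


def merge_itemsets(itemsets):
    """Merge itemsets that share at least one page (an assumed merge rule).

    Pages that appear in no frequent itemset never enter a cluster,
    which is how infrequent pages are pruned in this sketch.
    """
    clusters = []
    for itemset in itemsets:
        merged = set(itemset)
        remaining = []
        for cluster in clusters:
            if cluster & merged:
                merged |= cluster  # absorb every overlapping cluster
            else:
                remaining.append(cluster)
        remaining.append(merged)
        clusters = remaining
    return clusters


# Hypothetical log: (visitor ID from cookie, requested page)
log = [
    ("u1", "/a"), ("u1", "/b"),
    ("u2", "/a"), ("u2", "/b"), ("u2", "/c"),
    ("u3", "/x"), ("u3", "/y"),
    ("u4", "/x"), ("u4", "/y"),
    ("u5", "/z"), ("u5", "/a"),
]
transactions = sessions_by_cookie(log)
clusters = merge_itemsets(frequent_itemsets(transactions, min_support=2))
print(sorted(sorted(c) for c in clusters))
```

With this toy log, the pages `/a` and `/b` end up in one cluster and `/x` and `/y` in another, while `/c` and `/z` are left out because no itemset containing them reaches the support threshold. A practical implementation would likely replace the brute-force search with Apriori or a similar algorithm and use a stricter merge criterion.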
Appears in collections:	[Department of Computer Science and Information Engineering] Theses and Dissertations


All items in the institutional repository are protected by their original copyright.
