Multiple ensemble clustering based on different similarity measures for gene expression data

Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/74180

Title:	基於不同相似尺度之多元整合式分群法於基因表現資料的群集分析
Other Titles:	Multiple ensemble clustering based on different similarity measures for gene expression data
Authors:	李牧學;Li, Mu-Hsueh
Contributors:	Master's Program, Department of Mathematics, Tamkang University;吳漢銘;Wu, Han-Ming
Keywords:	Clustering;consensus clustering;correlation coefficient;ensemble clustering;gene expression;hierarchical clustering tree;K-means;partitioning around medoids;similarity measures
Date:	2011
Issue Date:	2011-12-28 18:13:39 (UTC+8)
Abstract:	The goal of cluster analysis of microarray data is to find genes whose expression indicates similar function under different experimental conditions. Different similarity measures and different clustering methods can each lead to different clustering results. In this study, we use three correlation coefficients (Pearson, Kendall, and Spearman) as well as the Euclidean distance, and with each of them we run the hierarchical clustering tree (HCT), K-means, partitioning around medoids (PAM), consensus clustering, and ensemble clustering. We integrate these clustering results to obtain the final clustering of the data, with the aim of a more stable result. We illustrate and discuss the proposed method using one simulated data set and one microarray gene expression data set. Unsupervised clustering methods have been widely applied to the analysis of gene expression data to identify biologically relevant groups of genes. Using different clustering algorithms with various similarity measures usually results in quite different gene clusters. To lessen these effects, we propose a new clustering method that integrates various clustering algorithms based on three similarity measures. The proposed method, which we call multiple ensemble clustering, averages the consensus results from hierarchical clustering, K-means, and partitioning around medoids based on the Pearson rho, Kendall tau, and Spearman rank correlations. We use a simulated data set and a real data set to illustrate the proposed method. The validity indices indicate that multiple ensemble clustering provides a much more stable clustering result.
Appears in Collections:	[Department of Applied Mathematics and Data Science] Theses and Dissertations
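
The averaging idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes a single base clusterer (average-linkage hierarchical clustering) in place of the HCT, K-means, and PAM combination, it skips any resampling-based consensus step, and every function name here is hypothetical. The Spearman helper also assumes no tied values. Each correlation measure yields one partition, each partition is turned into a co-association matrix (1 if two genes share a cluster), the matrices are averaged, and the average is clustered once more.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(x):
    """Ranks 1..n (assumption: no tied values, to keep the sketch short)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    n = len(x)
    return s / (n * (n - 1) / 2)


def average_linkage(dist, k):
    """Agglomerative clustering on a distance matrix, cut at k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        # Merge the pair of clusters with the smallest mean pairwise distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: mean(
                dist[i][j] for i in clusters[p[0]] for j in clusters[p[1]]
            ),
        )
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def co_association(labels):
    """Matrix with 1.0 where two items share a cluster, else 0.0."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def multi_measure_ensemble(genes, k):
    """Average co-association matrices over three correlations, then recluster."""
    n = len(genes)
    measures = (pearson, spearman, kendall)
    avg = [[0.0] * n for _ in range(n)]
    for corr in measures:
        # Correlation-based dissimilarity: 1 - r, so r = 1 gives distance 0.
        dist = [[1.0 - corr(genes[i], genes[j]) for j in range(n)]
                for i in range(n)]
        co = co_association(average_linkage(dist, k))
        for i in range(n):
            for j in range(n):
                avg[i][j] += co[i][j] / len(measures)
    final_dist = [[1.0 - avg[i][j] for j in range(n)] for i in range(n)]
    return average_linkage(final_dist, k)


# Toy expression profiles (made up): three rising genes, three falling genes.
genes = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.3, 2.9, 4.2, 5.5],
    [2.0, 4.1, 6.3, 8.2, 9.9],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [5.2, 3.9, 3.1, 1.8, 1.1],
    [9.8, 8.1, 6.2, 4.3, 2.2],
]
labels = multi_measure_ensemble(genes, k=2)
print(labels)
```

On these toy profiles the rising genes and the falling genes end up in separate clusters. Averaging co-association matrices rather than raw labels sidesteps the problem that cluster numbering is arbitrary across runs.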
