Using Association Classification to Represent a Decision Tree (利用關聯演算法重現決策樹分類結果)


Please use this permanent URL to cite or link to this document: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/94455

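Title:	利用關聯演算法重現決策樹分類結果 (Reproducing Decision Tree Classification Results with Association Algorithms)
Other title:	Using association classification to represent a decision tree
Author:	Chen, Chun-Chi (陳俊旗)
Contributors:	Tamkang University, Department of Computer Science and Information Engineering, Doctoral Program; Chiang, Rui-Dong (蔣璿東)
Keywords:	Decision tree; Association Classification; Irrelevant Note (不相關決策條件, irrelevant decision condition); Global rule (全域決策規則, global decision rule)
Date:	2013
Upload time:	2014-01-23 14:39:52 (UTC+8)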
Abstract:	Classification is one of the most common strategies in data mining, and association classification and decision trees are two of its most frequently used methods. The main advantage of association classification is that it provides global decision rules. It cannot handle continuous numeric data directly, however: continuous values must first be discretized, and finding the optimal cut points is an NP-hard problem. In addition, although association classification can find all decision rules, the sheer number of rules makes it hard to build a complete structure for interpreting the knowledge. Decision trees, on the other hand, handle both continuous and non-continuous fields directly and readily yield a clear knowledge structure. Because of the tree structure and the limits of the algorithm, though, the rules converted from a decision tree are local decision rules and may contain irrelevant classification conditions.
This thesis proposes a solution to these problems. The study first exploits the decision tree's speed and its ability to handle continuous attributes to find the knowledge and decision rules hidden in the data set, and it uses the local nature of the decision tree algorithm to quickly obtain the set of all candidate discretization cut points for the continuous attributes. The rules converted from the decision tree and the cut-point set are then reprocessed with association classification, which turns the decision tree rules into association decision rules that are global, free of irrelevant decision conditions, and simpler in their conditions. Finally, experiments on a clinical data set of ovarian endometriosis show that, compared with the original decision rules generated by a CART decision tree, the proposed association decision rules have higher classification accuracy, simpler conditions, and better comprehensibility.
English abstract:	Because the rules derived from decision trees are local, an association classifier has higher accuracy than a decision tree classifier, and decision tree techniques leave many useful rules undiscovered. However, the goal of classification rule mining is to discover a small set of rules in the database, whereas the association rule technique captures all possible rules in the database and generates too many. Medical data always contains numeric (continuous) attributes; the association rule technique cannot deal with numeric data directly, and finding an appropriate way to discretize numeric attributes is not an easy task. To offset the drawbacks of these two mining techniques, and to use current commercial mining tools to analyze the postoperative status of ovarian endometriosis patients and discover rules, we propose combining the advantages of decision tree and association rule techniques to mine the data. Our goal is to investigate the efficacy of transvaginal aspiration and sclerotherapy with 95% ethanol retained in situ for the treatment of recurrent ovarian endometriomas.
Although several researchers have used statistical methods to show that aspiration followed by injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few of them discuss the conditions that lead to a better recovery rate for patients. This study therefore combines statistical methods and data mining techniques to analyze the postoperative status of ovarian endometriosis patients and discover such conditions.
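The pipeline outlined in the abstract (read the cut points off the tree, discretize with them, then re-evaluate the tree's rules against the whole data set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the toy records, the attribute names `a` and `b`, the three tree rules, and the greedy confidence check in `prune_irrelevant` are all hypothetical and chosen only to make the idea concrete.

```python
from bisect import bisect_left

# Hypothetical toy records: two continuous attributes and a class label.
records = [
    {"a": 1.0, "b": 3.0, "label": "pos"},
    {"a": 2.0, "b": 5.0, "label": "pos"},
    {"a": 1.5, "b": 6.0, "label": "pos"},
    {"a": 3.0, "b": 2.0, "label": "neg"},
    {"a": 4.0, "b": 3.0, "label": "neg"},
    {"a": 3.5, "b": 6.0, "label": "pos"},
    {"a": 5.0, "b": 7.0, "label": "pos"},
]

# Hypothetical rules read off a small tree: each rule is a root-to-leaf
# path of (attribute, operator, threshold) tests plus the leaf's class.
tree_rules = [
    ([("b", "<=", 4.0), ("a", "<=", 2.0)], "pos"),
    ([("b", "<=", 4.0), ("a", ">", 2.0)], "neg"),
    ([("b", ">", 4.0)], "pos"),
]


def cut_points(rules):
    """Collect the sorted split thresholds the tree used for each attribute."""
    cuts = {}
    for conditions, _ in rules:
        for attr, _, thr in conditions:
            cuts.setdefault(attr, set()).add(thr)
    return {attr: sorted(points) for attr, points in cuts.items()}


def discretize(record, cuts):
    """Map each value to an interval index; interval i is (cuts[i-1], cuts[i]]."""
    return {attr: bisect_left(points, record[attr]) for attr, points in cuts.items()}


def matches(bins, conditions, cuts):
    """Check a rule body against a discretized record."""
    for attr, op, thr in conditions:
        k = cuts[attr].index(thr)  # interval index that ends at this threshold
        ok = bins[attr] <= k if op == "<=" else bins[attr] > k
        if not ok:
            return False
    return True


def confidence(conditions, label, binned, cuts):
    """Fraction of ALL records covered by the rule body that carry the label."""
    covered = [y for bins, y in binned if matches(bins, conditions, cuts)]
    return sum(y == label for y in covered) / len(covered) if covered else 0.0


def prune_irrelevant(conditions, label, binned, cuts):
    """Greedily drop tests whose removal does not lower global confidence.

    Assumption for illustration only; the thesis may use a different criterion.
    """
    kept = list(conditions)
    changed = True
    while changed:
        changed = False
        for cond in kept:
            trial = [c for c in kept if c != cond]
            if trial and confidence(trial, label, binned, cuts) >= confidence(
                kept, label, binned, cuts
            ):
                kept, changed = trial, True
                break
    return kept


cuts = cut_points(tree_rules)
binned = [(discretize(r, cuts), r["label"]) for r in records]
global_rules = [(prune_irrelevant(c, y, binned, cuts), y) for c, y in tree_rules]

for body, label in global_rules:
    print(body, "->", label)
```

On this toy data the first rule loses its `b <= 4.0` test, because `a <= 2.0` alone already predicts `pos` for every record it covers across the whole data set; the other two rules are unchanged. The tree only tested `a` inside the `b <= 4.0` branch, which is what makes its rule local.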
Appears in collections:	[Department of Computer Science and Information Engineering] Theses and Dissertations


All items in the institutional repository are protected by the original copyright.
