RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library & TKU Library IR team.
    Please use this permanent URL to cite or link to this document: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/35045


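    Title: 決策樹中移除不相關值問題在醫療研究的運用 (Removing irrelevant values from decision trees, applied to medical research)
    Other title: The irrelevant values problem in the decision tree for medical examinations
    Author: 黃南競; Huang, Nan-ching
    Contributors: Tamkang University, Master's Program, Department of Computer Science and Information Engineering
    葛煥昭; Keh, Huan-chao
    Keywords: decision tree; classification; the irrelevant values problem; the missing branches problem; medical examination
    Date: 2007
    Uploaded: 2010-01-11 05:56:46 (UTC+8)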
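    Abstract: With the widespread adoption of medical information systems, the volume of data held in databases has grown substantially. If we can analyze existing medical records to find how the various symptoms of a particular disease are correlated, and from that infer the dependencies among them, we can assist physicians in diagnosis and thereby improve the quality of medical care.
    As technology has advanced, handwritten medical records have been replaced by computer storage. In recent years, progress in software and hardware has allowed records that were once purely textual to be combined with multimedia data such as images and digital signals, forming multimedia medical databases. From stored records to the information contained in medical images and physiological signals, physicians can now grasp patient data more effectively, which benefits both clinical and basic medical research and helps patients receive better care. For these reasons, Europe, North America, Japan, and other advanced countries have all conducted extensive research on integrated medical information systems, and most medical institutions at home and abroad have set up dedicated database management systems to speed the flow of information among patients, physicians, and hospitals.
    Among data mining techniques, this thesis focuses on the irrelevant values problem in decision trees. When a decision tree is represented by a set of rules, the antecedents of individual rules may contain irrelevant conditions. When these rules are applied to medical examinations, those irrelevant conditions may impose unnecessary burdens on the patient and on society. To avoid generating rules with irrelevant conditions, we propose a new algorithm that uses the information in the decision tree to remove irrelevant conditions while the tree is being converted to rules. Our algorithm handles continuous values as well as discrete values.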
    The decision tree is one of the key data mining techniques and has been applied in medicine. A decision tree is built by selecting the best test attribute as the root. The same procedure is then applied to each branch to induce the remaining levels of the tree, until all examples in a leaf belong to the same class. However, because the algorithm creates a branch for each value of the test attribute that appears in the training data, without considering whether that value is relevant to the classification, the resulting tree may suffer from over-specialization. Without loss of generality, we consider only ID3-like algorithms in this paper.
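The induction procedure described above can be sketched roughly as follows. This is a minimal ID3-style illustration, not the thesis's code: the attribute names (`age_group`, `marker`), the class labels, and the six training rows are all hypothetical, and information gain is assumed as the criterion for "best test attribute".

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the examples on attr."""
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

def build_tree(rows, labels, attrs):
    """Return a class label (leaf) or {"attr": name, "branches": {value: subtree}}."""
    if len(set(labels)) == 1:          # all examples in this node share one class
        return labels[0]
    if not attrs:                      # nothing left to test: fall back to majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    # One branch per value of `best` that appears in this node's examples.
    for value in sorted({row[best] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx], remaining)
    return {"attr": best, "branches": branches}

# Hypothetical training data (assumption, for illustration only).
ROWS = [
    {"age_group": "adult",  "marker": "positive"},
    {"age_group": "adult",  "marker": "negative"},
    {"age_group": "senior", "marker": "positive"},
    {"age_group": "senior", "marker": "negative"},
    {"age_group": "young",  "marker": "negative"},
    {"age_group": "young",  "marker": "negative"},
]
LABELS = ["treat", "observe", "treat", "treat", "observe", "observe"]

tree = build_tree(ROWS, LABELS, ["age_group", "marker"])
```

On this toy data `age_group` has the higher information gain and becomes the root, and a branch is created for every `age_group` value seen in training, whether or not that value matters for the class.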
    As pointed out by J. Cheng, the irrelevant values problem and the missing branches problem are two causes of over-specialization in decision trees. The missing branches problem arises because some of the reduced subsets at non-leaf nodes do not necessarily contain examples of every possible value of the branching attribute. Consequently, the decision tree may fail to classify some instances. The irrelevant values problem arises because some values of the branching attribute may not be relevant to the classification, so the rules derived from the decision tree may contain irrelevant conditions, which demand that extra information be supplied. In a medical setting, extra information means extra examinations for the patient, and extra examinations mean more expense and a greater burden on the patient and on society. When the decision tree is applied in medicine, we therefore have to deal with irrelevant conditions in order to save medical resources and avoid unnecessary examinations.
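A small sketch may make the missing branches problem concrete. The tree below is hand-written and hypothetical (the attribute names and labels are assumptions, not data from the thesis): the `senior` subtree has no `negative` branch because, in this imagined training set, no such example reached that node. The helper also records which attributes were tested, since each test corresponds to an examination the patient would need.

```python
# Hypothetical tree: {"attr": name, "branches": {value: subtree or class label}}.
TREE = {
    "attr": "age_group",
    "branches": {
        "adult":  {"attr": "marker",
                   "branches": {"positive": "treat", "negative": "observe"}},
        # No training example with age_group=senior and marker=negative,
        # so that branch was never created.
        "senior": {"attr": "marker",
                   "branches": {"positive": "treat"}},
    },
}

def classify(tree, instance):
    """Walk the tree for one instance.

    Returns (label, tests_used). label is None when the instance's value
    has no branch at some node, i.e. the tree cannot classify it.
    """
    tests_used = []
    node = tree
    while isinstance(node, dict):
        attr = node["attr"]
        tests_used.append(attr)          # each test is one examination
        value = instance[attr]
        if value not in node["branches"]:
            return None, tests_used      # missing branch
        node = node["branches"][value]
    return node, tests_used

ok = classify(TREE, {"age_group": "adult", "marker": "positive"})
stuck = classify(TREE, {"age_group": "senior", "marker": "negative"})
```

Here `stuck` carries no label: both examinations were performed, yet the tree has no branch to follow.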
    When a decision tree is represented by a collection of rules, the antecedents of individual rules may contain irrelevant conditions. When these rules are applied to medical examinations, the irrelevant conditions may place an unnecessary burden on the patient and on society. To avoid generating rules with irrelevant conditions, we propose a new algorithm that removes them while converting the decision tree to rules, using information in the decision tree. Our algorithm can handle not only discrete values but also continuous values.
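The abstract does not spell out the proposed algorithm, so the sketch below is only a stand-in that illustrates what an irrelevant condition looks like. It converts a tree to rules and then greedily drops any condition whose removal leaves the rule consistent with the training data. That data-driven check is an assumption made for illustration; the thesis's algorithm works from information in the tree itself and also covers continuous values, which this sketch does not. The tree, attribute names, and data are hypothetical.

```python
# Hypothetical training data as (attributes, class) pairs.
DATA = [
    ({"age_group": "adult",  "marker": "positive"}, "treat"),
    ({"age_group": "adult",  "marker": "negative"}, "observe"),
    ({"age_group": "senior", "marker": "positive"}, "treat"),
    ({"age_group": "senior", "marker": "negative"}, "treat"),
    ({"age_group": "young",  "marker": "negative"}, "observe"),
    ({"age_group": "young",  "marker": "negative"}, "observe"),
]

# Hypothetical decision tree consistent with DATA.
TREE = {
    "attr": "age_group",
    "branches": {
        "adult":  {"attr": "marker",
                   "branches": {"positive": "treat", "negative": "observe"}},
        "senior": "treat",
        "young":  "observe",
    },
}

def tree_to_rules(node, conditions=()):
    """One rule per root-to-leaf path: (antecedent dict, class label)."""
    if not isinstance(node, dict):
        return [(dict(conditions), node)]
    rules = []
    for value, child in node["branches"].items():
        rules.extend(tree_to_rules(child, conditions + ((node["attr"], value),)))
    return rules

def matches(conditions, row):
    """True if the row satisfies every condition in the antecedent."""
    return all(row[attr] == value for attr, value in conditions.items())

def drop_irrelevant(conditions, label, data):
    """Greedily remove conditions that are not needed to keep the rule
    consistent with the training data (illustrative stand-in only)."""
    kept = dict(conditions)
    for attr in list(kept):
        trial = {a: v for a, v in kept.items() if a != attr}
        if all(lab == label for row, lab in data if matches(trial, row)):
            kept = trial
    return kept

rules = tree_to_rules(TREE)
simplified = [(drop_irrelevant(cond, label, DATA), label) for cond, label in rules]
```

On this data the rule `age_group=adult AND marker=positive -> treat` shrinks to `marker=positive -> treat`, so the `age_group` condition is irrelevant for that rule; the other three rules are left unchanged.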
    Appears in collections: [Department and Graduate Institute of Computer Science and Information Engineering] Theses


    All items in the institutional repository are protected by the original copyright.

