Tamkang University Institutional Repository: Item 987654321/34959
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library & TKU Library IR team.

    Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/34959


    Title: Text multi-categorization method based on fuzzy correlation analysis
    Other title: 以模糊相關分析為基礎之文件多重分類方法
    Author: 闕豪恩;Chueh, Hao-en
    Contributors: Tamkang University, Department of Computer Science and Information Engineering, Doctoral Program
    林丕靜;Lin, Nancy Pei-ching
    Keywords: 文件多重分類;模糊簡單相關分析;模糊半淨相關分析;Text Multi-Categorization;Fuzzy Simple Correlation Analysis;Fuzzy Semi-Partial Correlation Analysis
    Date: 2007
    Upload time: 2010-01-11 05:49:33 (UTC+8)
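    Abstract: Text multi-categorization is the process of assigning an unclassified document, based on its content, to one or more predefined categories. Because the content of an unclassified document may touch on several different topics, multi-categorization is a reasonable approach.
    To assign unclassified documents to multiple categories in a principled way, this dissertation proposes a text multi-categorization method based on fuzzy correlation analysis. Fuzzy simple correlation analysis is an effective tool for determining whether a linear relationship exists between two fuzzy attributes, and the proposed method uses it to assess how strongly an unclassified document is related to each predefined category. In most text categorization tasks, however, the predefined categories may themselves be correlated, so analyzing the relationship between an unclassified document and a given category must account for the possible influence of other categories related to that category. This dissertation therefore derives another important form of fuzzy correlation analysis, called fuzzy semi-partial correlation analysis, and applies it to handle this situation.
    The core of the proposed method is to identify, step by step during categorization, the predefined categories that provide significant explanatory power for the content of the unclassified document. At each step, the category identified is the one with the largest fuzzy semi-partial correlation coefficient with the unclassified document. Based on the properties of the fuzzy simple and fuzzy semi-partial correlation coefficients, together with significance testing, the categories with the most significant positive relationship to the unclassified document can be extracted from all predefined categories, and the document can accordingly be assigned to one or more predefined categories.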
    Text multi-categorization is the procedure by which each unlabeled text document is assigned to one or more appropriate categories according to its content. Because the content of an unlabeled text document may involve several different topics, this kind of text categorization procedure is reasonable.
    To assign an unlabeled text document to one or more appropriate categories, this dissertation proposes a novel text multi-categorization method based on fuzzy correlation analysis. A fuzzy simple correlation analysis shows the strength and direction of the linear relationship between two fuzzy attributes, which is useful for analyzing the relationships between unlabeled text documents and the predefined categories. In a text categorization procedure, however, the predefined categories may themselves be related, and the effects of other predefined categories may influence the relationship between the observed text document and the target predefined category. Thus, a fuzzy semi-partial correlation analysis, which examines the relationship between two fuzzy attributes when the influences of other fuzzy attributes are removed, is used together with the fuzzy simple correlation analysis to construct the new text multi-categorization method.
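The abstract does not reproduce the thesis's definitions. As a hedged illustration only, assume the fuzzy coefficients mirror the classical sample formulas applied to membership grades (an assumption, not the thesis's stated derivation). Write $\mu_A(x_i)$ for the membership grade of element $x_i$ in fuzzy attribute $A$ and $\bar\mu_A$ for its sample mean. The simple correlation, and the first-order semi-partial correlation between a document $D$ and a candidate category $C_1$ with the influence of an already assigned category $C_2$ removed from $C_1$, would then read:

```latex
r_{A,B} = \frac{\sum_{i=1}^{n} \bigl(\mu_A(x_i)-\bar\mu_A\bigr)\bigl(\mu_B(x_i)-\bar\mu_B\bigr)}
               {\sqrt{\sum_{i=1}^{n} \bigl(\mu_A(x_i)-\bar\mu_A\bigr)^2}\,
                \sqrt{\sum_{i=1}^{n} \bigl(\mu_B(x_i)-\bar\mu_B\bigr)^2}}

r_{D(C_1 \cdot C_2)} = \frac{r_{D,C_1} - r_{D,C_2}\, r_{C_1,C_2}}{\sqrt{1 - r_{C_1,C_2}^{2}}}
```

The symbols $D$, $C_1$, and $C_2$ are labels introduced here for illustration. When $C_1$ and $C_2$ are uncorrelated, the second expression reduces to the simple correlation $r_{D,C_1}$.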
    The main concept of the proposed method is to find, step by step, the predefined categories that can significantly describe (explain) the content of an unlabeled text document. The category chosen at each step is the one with the largest fuzzy semi-partial correlation with the unlabeled text document after removing the influences of the already assigned categories. According to the properties of these fuzzy correlation coefficients, and by using a test of significance, we can find the categories with the most significant positive relationships to an unlabeled text document and assign these categories to the document.
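The stepwise procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: documents and categories are assumed to be represented as membership grades over a shared set of terms, the fuzzy coefficients are assumed to behave like classical correlations of those grades, and the significance test is simplified to a fixed one-sided critical value. The names `select_categories` and `t_crit`, and all data values, are hypothetical.

```python
import math


def _center(values):
    """Subtract the mean so that dot products act as covariances."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def select_categories(doc, categories, t_crit=1.943):
    """Greedy selection by largest positive semi-partial correlation.

    doc        -- membership grades of the document over n terms (assumed form)
    categories -- dict: category name -> membership grades over the same terms
    t_crit     -- fixed one-sided critical value (a simplification; a real
                  test would look it up for the degrees of freedom in use)
    """
    n = len(doc)
    y = _center(doc)
    y_norm = math.sqrt(_dot(y, y))
    if y_norm == 0.0:
        return []  # a constant document vector correlates with nothing
    # Each candidate starts as its centered grades; once a category is
    # chosen, its influence is removed from every remaining candidate.
    residuals = {name: _center(g) for name, g in categories.items()}
    chosen, r2 = [], 0.0
    while residuals:
        best_name, best_sr = None, 0.0
        for name, e in residuals.items():
            e_norm = math.sqrt(_dot(e, e))
            if e_norm < 1e-9:
                continue  # fully explained by categories already chosen
            sr = _dot(y, e) / (y_norm * e_norm)  # semi-partial correlation
            if sr > best_sr:
                best_name, best_sr = name, sr
        df = n - len(chosen) - 2  # n - p - 1 with p = len(chosen) + 1
        if best_name is None or df <= 0:
            break
        # t statistic for the increase in explained variance
        unexplained = max(1.0 - (r2 + best_sr ** 2), 1e-12)
        if best_sr * math.sqrt(df / unexplained) < t_crit:
            break
        chosen.append(best_name)
        r2 += best_sr ** 2
        e = residuals.pop(best_name)
        e_norm = math.sqrt(_dot(e, e))
        q = [a / e_norm for a in e]
        for name in list(residuals):
            proj = _dot(residuals[name], q)
            residuals[name] = [a - proj * b for a, b in zip(residuals[name], q)]
    return chosen


# Hypothetical toy data: eight terms, four predefined categories.
doc = [0.8, 0.9, 0.5, 0.6, 0.4, 0.5, 0.1, 0.2]
cats = {
    "finance":    [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1],
    "banking":    [0.9, 0.7, 0.9, 0.7, 0.1, 0.3, 0.1, 0.3],
    "technology": [0.9, 0.9, 0.1, 0.1, 0.9, 0.9, 0.1, 0.1],
    "sports":     [0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1],
}
print(select_categories(doc, cats))
```

In this toy data, "banking" has a high simple correlation with the document, but almost all of it is shared with "finance". Once "finance" is assigned, the semi-partial correlation of "banking" drops to roughly zero, so it is not selected. This is the effect the abstract attributes to removing the influence of already assigned categories.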
    Appears in collections: [Department of Computer Science and Information Engineering] Theses and Dissertations

    Files in this item:

    File    Size    Format     Views
            0Kb     Unknown    415

    All items in the institutional repository are protected by their original copyright.
