    Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/74585

    Title: 基於中文斷詞技術之新聞網頁分類系統
    Other Titles: Automatic news pages classification system based on Chinese word segmentation
    Authors: 林孟翰;Lin, Meng-Han
    Contributors: Tamkang University, Master's Program, Department of Computer Science and Information Engineering
    蔡憶佳;Tsai, Yih-Jia
    Keywords: Naive Bayes Classifier; recall rate
    Date: 2011
    Issue Date: 2011-12-28 18:57:49 (UTC+8)
    Abstract: With the rapid development of the Internet, the network has become an indispensable part of many people's everyday lives. Because reading news online is convenient and interactive, the number of users who learn about recent events from the Internet is growing rapidly, and a large number of news agencies have in turn made their news available online, so the volume of news information grows very quickly. How to help users find correct, relevant news that matches their interests is therefore an important problem. One approach is to build an automatic news classification system that lets users read news from the categories that interest them.
    In this paper, we build a news page classification system based on Chinese word segmentation. The system automatically downloads news pages and segments their text into words with an n-gram algorithm. After segmentation, we compare the performance of two classification schemes. Of the two, the Naïve Bayes classifier achieves the higher recall, with an average recall of 71%. The experimental results show that the Naïve Bayes classifier combined with n-gram word segmentation performs better than the other scheme.
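
    The pipeline described in the abstract (n-gram segmentation, Naïve Bayes classification, evaluation by recall) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not state the n-gram length, the feature weighting, or the smoothing method, so overlapping character bigrams, a multinomial model, and add-one smoothing are assumptions here, and the names `char_ngrams`, `NaiveBayes`, and `recall` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams, ignoring whitespace.

    Assumption: n=2 (bigrams); the abstract does not give the value of n.
    """
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts.

    Assumption: add-one (Laplace) smoothing; the source does not say
    which smoothing, if any, the thesis uses.
    """

    def fit(self, docs, labels, n=2):
        self.n = n
        self.class_docs = Counter(labels)        # documents per class
        self.term_counts = defaultdict(Counter)  # n-gram counts per class
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(char_ngrams(doc, n))
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for term in char_ngrams(doc, self.n):
                score += math.log((counts[term] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def recall(true_labels, predicted, target):
    """Recall for one class: correct predictions / items truly in the class."""
    relevant = [p for t, p in zip(true_labels, predicted) if t == target]
    if not relevant:
        return 0.0
    return sum(p == target for p in relevant) / len(relevant)


# Toy usage with made-up headlines (not data from the thesis).
docs = ["股市上漲", "股市下跌", "球隊獲勝", "球隊落敗"]
labels = ["finance", "finance", "sports", "sports"]
model = NaiveBayes().fit(docs, labels)
prediction = model.predict("股市大漲")
```

    Averaging `recall` over all categories gives an average recall of the kind the abstract reports; the 71% figure itself comes from the thesis experiments and cannot be reproduced from this toy data.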
    Appears in Collections: [Graduate Institute & Department of Computer Science and Information Engineering] Thesis

