Individualized automatic classification of web documents (個別化網頁文件自動分類)


Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/52356

Title:	個別化網頁文件自動分類
Other Titles:	Individualized automatic classification of web documents
Authors:	陳冠宇;Chen, Kaun-yu
Contributors:	Tamkang University, Department of Computer Science and Information Engineering, In-service Master's Program; 蔡憶佳; Tsai, Yih-jia
Keywords:	自動分類;automatic classification
Date:	2010
Issue Date:	2010-09-23 17:34:21 (UTC+8)
Abstract:	With the arrival of the Internet age, reading news online has become an important way for the public to obtain information. Major websites currently display news under their own classification schemes, which often leaves the categories either too coarse or too fine-grained. This research therefore proposes a content-based news classification method and implements an automatic news classifier whose categories can be set according to individual needs. Once trained, the classifier analyzes the content of news articles and redisplays them under the user's personalized categories. To obtain and analyze the latest news promptly, the research uses the Naïve Bayes classification method, which is comparatively fast to compute, to predict categories, and runs its experiments on the Chinese news corpus provided in 2006 by the Symposium of Search Engine and Web Mining (SEWM). In these experiments, the recall of news classification reached more than 82% and the precision more than 96%.
Appears in Collections:	[Graduate Institute & Department of Computer Science and Information Engineering] Thesis
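
Illustrative sketch (not taken from the thesis): the abstract names Naïve Bayes as the classification method and reports recall and precision, but this record gives no implementation details. The Python below is a minimal multinomial Naïve Bayes with add-one smoothing and a per-category precision/recall helper, under stated assumptions: articles are already segmented into tokens, the category names ("sports", "finance") stand in for user-defined categories, and the class name, function names, and toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (hypothetical sketch)."""

    def __init__(self):
        self.doc_counts = Counter()              # number of training documents per category
        self.word_counts = defaultdict(Counter)  # token counts per category
        self.vocab = set()                       # all tokens seen during training

    def train(self, labeled_docs):
        # labeled_docs: iterable of (tokens, category) pairs; tokens are pre-segmented
        for tokens, category in labeled_docs:
            self.doc_counts[category] += 1
            self.word_counts[category].update(tokens)
            self.vocab.update(tokens)

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


def precision_recall(gold, predicted, category):
    """Return (precision, recall) for one category from parallel label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == category and p == category)
    fp = sum(1 for g, p in pairs if g != category and p == category)
    fn = sum(1 for g, p in pairs if g == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy training data with user-defined categories (assumed, not from the SEWM corpus)
training = [
    (["match", "team", "goal"], "sports"),
    (["team", "coach", "season"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "rate", "market"], "finance"),
]
clf = NaiveBayesClassifier()
clf.train(training)
print(clf.predict(["team", "goal"]))
print(clf.predict(["bank", "rate"]))
```

Because the categories come only from the labels in the training data, a reader can define a personal category by labelling a few example articles with it and retraining; nothing in the classifier is tied to a site's fixed taxonomy.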

Files in This Item:

File	Size	Format
index.html	0Kb	HTML	442	View/Open
