    Please use this identifier to cite or link to this item: http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/37602


    Title: Using the keyword in context segmentation method for a Chinese website
    Authors: Chen, Jui-Fa;Lin, Wei-Chuan;Jian, Chih-Yu;Ho, Tzong-Yuh;Dai, Shi-Yao
    Contributors: Tamkang University, Department of Computer Science and Information Engineering
    Keywords: Chinese Segmentation; Dynamic Website; Weight Calculation System; Context Comparing System
    Date: 2003-04-24
    Issue Date: 2010-01-11 13:46:11 (UTC+8)
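    Abstract (translated from the Chinese): As network technology continues to develop, static websites increasingly fail to meet users' needs, and a website with proactive and interactive mechanisms has become a basic requirement. Various network agent technologies have emerged to provide such mechanisms. Unlike English, Chinese does not separate words with spaces, which makes Chinese word segmentation much harder than English word segmentation. A fast and accurate segmentation method is therefore necessary to help an agent process the sentences a user enters and analyze their meaning. This paper examines word segmentation and part-of-speech definition from the perspective of context. Because the segmentation system makes its main decisions from contextual relationships, it achieves higher semantic correctness than segmentation that relies only on corpora and word-formation rules, and it is less prone to semantic errors. To keep unknown words from causing needless segmentation errors, we use a specialized corpus matched to the needs of each website. The method also achieves fairly high accuracy on simple, casual spoken dialogue. This paper introduces the segmentation method and presents our experimental results, which show that the proposed method achieves higher correctness.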
    As network technology continues to develop, static websites no longer suffice to respond to user requests. A dynamic website, with its interactive mechanisms, has become a basic requirement of today's network. To provide dynamic and interactive mechanisms, a website should apply some form of agent technology. Because Chinese does not use spaces to separate lexical entries, segmenting Chinese for an interactive website is more difficult than segmenting English. This paper proposes a method for segmentation and part-of-speech (POS) definition based on keywords in the context of a Chinese sentence. The method preprocesses input sentences to analyze what the user wants, and the results can serve as the basis for responding to the user with an appropriate message. Because the proposed method uses keyword relationships in the grammar of the input context, it yields more semantically correct results than methods that use only corpora and word-building principles. To avoid making too many segmentation mistakes on unknown lexical entries, the proposed method uses a professional corpus corresponding to each kind of website. The implementation results provide evidence that the proposed method achieves higher correctness for succinct and colloquial conversational language.
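
    The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the problem it describes, not the authors' keyword-in-context method. It uses forward maximum matching, a common dictionary-only baseline of the kind the abstract contrasts itself with, to show how unknown lexical entries degrade segmentation and how a site-specific lexicon helps. The function name, the toy lexicons, and the example sentence are all assumptions made for illustration.

```python
def forward_max_match(sentence, lexicon, max_len=4):
    """Greedy left-to-right segmentation (a baseline, not the paper's method).

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; if none matches, emit a single character.
    """
    tokens = []
    i = 0
    while i < len(sentence):
        longest = min(max_len, len(sentence) - i)
        for size in range(longest, 0, -1):
            candidate = sentence[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicons: a general one, and one extended with
# terms a course-registration website might need.
general_lexicon = {"我", "想", "查詢", "課程"}
domain_lexicon = general_lexicon | {"選課", "系統", "選課系統"}

sentence = "我想查詢選課系統"  # "I want to look up the course-selection system"

general_tokens = forward_max_match(sentence, general_lexicon)
domain_tokens = forward_max_match(sentence, domain_lexicon)

print(general_tokens)  # unknown term falls apart into single characters
print(domain_tokens)   # domain lexicon keeps the term whole
```

    With the general lexicon alone, the unknown term is split into single characters; adding the domain entries keeps it as one lexical entry. A context-based approach such as the one the abstract describes would go further by using surrounding keywords to choose among competing segmentations, which this sketch does not attempt.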
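    Relation: Proceedings of the 11th International Conference on Computer-Assisted Instruction (ICCAI 2003) and the 16th ROC Conference on Computer-Assisted Instruction = Proceedings 2003 International Conference on Computer-Assisted Instruction, 6 pages
    Appears in Collections: [Department and Graduate Institute of Computer Science and Information Engineering] Conference Papers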

    Files in This Item:

    There are no files associated with this item.
