Word sense disambiguation using semantic relatedness measurement

Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/35233

Title:	Word sense disambiguation using semantic relatedness measurement
Other Titles:	A method for computing semantic similarity and its application to word sense disambiguation (計算語意相似度之方法及其應用於字義排歧)
Authors:	楊哲宇;Yang, Che-yu
Contributors:	Tamkang University, Department of Computer Science and Information Engineering, Doctoral Program; Shih, Timothy K. (施國琛)
Keywords:	concept representation; semantic relatedness; word sense disambiguation; WordNet; natural language processing; information retrieval
Date:	2006
Issue Date:	2010-01-11 06:14:30 (UTC+8)
Abstract:	The problem of formalizing and quantifying the intuitive notion of similarity or relatedness has a long history in philosophy, psychology, and artificial intelligence, and many different perspectives have been suggested. The need to determine the degree of semantic similarity, or more generally relatedness, between two lexically expressed concepts arises in many applications. All human languages have words that can mean different things in different contexts; such words with multiple meanings are potentially "ambiguous". For almost all applications of language technology, word sense ambiguity is a potential source of error. Word sense disambiguation (WSD) is the process of deciding which of a word's several meanings is intended in a given context. Polysemy (a single word form having more than one meaning) and synonymy (multiple words having the same meaning) are both important issues in natural language processing and related areas of artificial intelligence. In information retrieval, polysemy decreases precision by producing false matches, while synonymy decreases recall by missing true conceptual matches. In this thesis, we explore measures of semantic relatedness between word senses based on a novel hybrid approach, and we apply the resulting measure to the WSD task. In addition, we investigate how WSD can benefit information retrieval by addressing the problems of polysemy and synonymy. The research not only takes a theoretical perspective on concept representation, concept distribution, and semantic relatedness, but also considers possible applications of the proposed theory to word sense disambiguation and information retrieval.
Appears in Collections:	[Graduate Institute & Department of Computer Science and Information Engineering] Thesis
