ROBERTO PIRRONE

Semantic sense extraction from Wikipedia pages

Abstract

This paper presents a technique for extracting structured information from unstructured Wikipedia content related to a particular topic and arranging it semantically within an ontology. The general framework is the design of an artificial agent able to deliberate when increasing its domain knowledge. In particular, this cognitive agent acts as the dialogue manager in an Intelligent Tutoring System (ITS) already presented by the authors. Our approach is based on the definition of patterns that extract and identify novel concepts and relations to be added to the knowledge base. We propose a method that uses information from the wiki page's structure, and we define different strategies for obtaining new concepts and relations according to the different parts of the page. The text of each section of the page is also processed. Structure analysis allows the system to extract concepts and their general relations, while text analysis is used to determine the type of each relation to be incorporated into the domain ontology.
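
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the relation labels (`relatedTo`, `subClassOf`, `partOf`, `usedFor`) and the lexical patterns are all hypothetical, chosen only for illustration. The structure step splits raw wikitext into sections and treats every internal link as a candidate concept generically related to the page topic. The text step then inspects the sentence containing the link to guess a more specific relation type.

```python
import re

# Hypothetical lexical patterns: each maps a phrase to an assumed relation type.
RELATION_PATTERNS = [
    (re.compile(r"\bis an? (?:kind|type) of\b", re.I), "subClassOf"),
    (re.compile(r"\bis (?:a )?part of\b", re.I), "partOf"),
    (re.compile(r"\bis used (?:in|for|to)\b", re.I), "usedFor"),
]

# Wikitext section heading, e.g. "== Uses ==".
HEADING = re.compile(r"^==+\s*(.*?)\s*==+\s*$")
# Internal link, e.g. "[[Target]]", "[[Target|label]]" or "[[Target#Section]]".
LINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def split_sections(wikitext):
    """Structure step: return (heading, text) pairs; the lead has heading None."""
    sections = [(None, [])]
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append((match.group(1), []))
        else:
            sections[-1][1].append(line)
    return [(heading, "\n".join(body).strip()) for heading, body in sections]


def extract_triples(topic, wikitext):
    """Return (topic, relation, concept) triples for every linked concept."""
    triples = []
    for _heading, text in split_sections(wikitext):
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            for target in LINK.findall(sentence):
                # Default: a generic relation derived from page structure alone.
                relation = "relatedTo"
                # Text step: refine the type if a known pattern occurs.
                for pattern, name in RELATION_PATTERNS:
                    if pattern.search(sentence):
                        relation = name
                        break
                triples.append((topic, relation, target.strip()))
    return triples


if __name__ == "__main__":
    sample = (
        "A '''stack''' is a kind of [[abstract data type]].\n"
        "== Uses ==\n"
        "A stack is used in [[Depth-first search|DFS]]. "
        "See also [[Queue (abstract data type)#History]]."
    )
    for triple in extract_triples("Stack", sample):
        print(triple)
```

In this sketch a link whose sentence matches no pattern keeps the generic `relatedTo` label, mirroring the split between general relations obtained from structure and typed relations obtained from text. A real system would need proper sentence segmentation and a far richer pattern set.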