Document Indexing Using Named Entities

Studies in Informatics and Control

Vol. 10, No. 1, 2001

Rada Mihalcea, Dan I. Moldovan

Abstract

Current text indexing and retrieval techniques have their roots in the field of Information Retrieval, where the task is to extract documents that best match a query. With an ever-increasing number of documents available through easy access over the Internet, the challenge is to provide users with concise and relevant information. We propose a novel yet simple approach that indexes the named entities in the documents in order to improve the relevance of the documents retrieved. Experiments in finding information related to a set of 75 input questions, from a large collection of 125,000 documents, show that this new technique reduces the number of retrieved documents by a factor of 2, while still retrieving the relevant documents.

Keywords

information retrieval, semantic indexing, question answering

CITE THIS PAPER AS:

Rada Mihalcea, Dan I. Moldovan, "Document Indexing Using Named Entities", Studies in Informatics and Control, ISSN 1220-1766, vol. 10(1), pp. 21-28, 2001.
