Studies in Informatics and Control
Vol. 10, No. 1, 2001

Document Indexing Using Named Entities

Rada Mihalcea, Dan I. Moldovan

Current text indexing and retrieval techniques have their roots in the field of Information Retrieval, where the task is to extract documents that best match a query. With an ever-increasing number of documents easily accessible through the Internet, the challenge is to provide users with concise and relevant information. We propose a novel yet simple approach that indexes the named entities in documents so as to improve the relevance of the documents retrieved. Experiments in finding information related to a set of 75 input questions, from a large collection of 125,000 documents, show that this technique reduces the number of retrieved documents by a factor of 2 while still retrieving the relevant documents.


information retrieval, semantic indexing, question answering
