Overview

NTCIR Workshop


The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including Japanese and Asian language text retrieval, cross-lingual information retrieval, automatic text summarization, extraction, question answering, etc.

The aims are:

to encourage research in information access technologies by providing large-scale test collections reusable for experiments and a common evaluation infrastructure allowing cross-system comparisons
to provide a forum for research groups interested in cross-system comparison and exchanging research ideas in an informal atmosphere
to investigate evaluation methods for information access technologies and methods for constructing large-scale test collections reusable for experiments.

An evaluation workshop usually provides a set of data usable for experiments and unified evaluation procedures for experiment results. Each participating group conducts research and experiments with various approaches, using the data provided by the NTCIR organizer. The importance of reusable large-scale standard test collections in IR and text processing research has been widely recognized, and an evaluation workshop is now recognized as a new style of active research project that facilitates research by providing data and a forum for exchanging research ideas and transferring technology.

For the First NTCIR Workshop, the process started in November 1998, and the Workshop meeting was held on August 30 - September 1, 1999, at KKR Hotel, Tokyo. Twenty-eight groups from six countries conducted the tasks and submitted results. For the Second Workshop, the process started in June 2000 and the meeting was held on March 7-9, 2001, at NII, Tokyo. Thirty-six groups from eight countries conducted the tasks and submitted results. For the Third Workshop, the process started in August 2001 and the meeting will be held on October 8-10, 2002, at NII, Tokyo. More than 70 groups from nine countries conducted the tasks and submitted results.

The Third NTCIR Workshop hosts five tasks: (1) Cross Language Information Retrieval of Chinese, Korean, Japanese and English documents (CLIR); (2) Patent Retrieval, including cross-genre retrieval that searches patents using newspaper articles and cross-language retrieval that searches Japanese patents using English, Chinese or Korean topics, based on 2 years of Japanese Patent Documents (ca. 17 GB) and 5 years of paired English and Japanese abstracts that are exact translations of each other (PATENT); (3) Question Answering, which extracts noun phrases that express the answers to a question (QAC); (4) Automatic Text Summarization, including multiple document summarization (TSC); and (5) Web Retrieval using 10 GB or 100 GB document collections taken from a snapshot of the World Wide Web (WEB).

From the beginning of the NTCIR project, we have looked at both traditional laboratory-type IR system testing and the evaluation of more challenging technologies. For the laboratory-type testing, we have placed emphasis on (1) information retrieval (IR) with Japanese or other Asian languages and (2) cross-lingual information retrieval. For the challenging issues, we have placed emphasis on (3) the shift from document retrieval to "information" retrieval and processing, and (4) the investigation of realistic evaluation, especially evaluation methods suited to the retrieval and processing of a particular document genre and to the way the user group of that genre uses it.

The results of the research done by each active participating group will be reported at the meeting on Oct. 8-10, 2002. Please come and join the discussion!