NTCIR (NII Test Collection for IR Systems) Project

The 6th NTCIR Workshop

DATA

NTCIR-6 is over. For information on data see the NTCIR data page.

NTCIR-6 Test Collections: Documents

The following document collections are used for the 6th NTCIR Workshop. They are available free of charge to participating research groups for task participation and system evaluation within the 6th NTCIR Workshop. To obtain the data, submit the signed user agreement forms to the NTCIR Project Office at the NII.

| task | test collection | genre | language | file name | number of documents (size) | year |
|---|---|---|---|---|---|---|
| CLIR | NTCIR-3 CLIR, NTCIR-4 CLIR | news articles | Chinese (traditional) | CIRB020 <*> (United Daily News) | 249,508 | 1998-1999 |
| | | | | CIRB011 (China Times, China Times Express, Commercial Times, China Daily News, Central Daily News) | 132,173 | |
| | | | Korean | Hankookilbo <*> | 149,498 | |
| | | | | Chosunilbo <*> | 104,517 | |
| | | | Japanese | Mainichi | 220,078 | |
| | | | | Yomiuri <*> | 375,980 | |
| | NTCIR-5 CLIR, NTCIR-6 CLIR | news articles | Chinese (traditional) | CIRB040 (United Daily News, United Express, Ming Hseng News, Economic Daily News) | 901,446 | 2000-2001 |
| | | | Korean | Hankookilbo | 85,250 | |
| | | | | Chosunilbo | 135,124 | |
| | | | Japanese | Mainichi | 199,681 | |
| | | | | Yomiuri | 658,719 | |
| CLQA | NTCIR-5 CLQA, NTCIR-6 CLQA | news articles | Chinese (traditional) | CIRB040 | 901,446 | 2000-2001 |
| | | | Japanese | Yomiuri | 658,719 | |
| | | | English | Daily Yomiuri | 17,741 | |
| | NTCIR-6 CLQA | | Korean? | under consideration | | |
| PATENT | NTCIR-3 PATENT <+> (separate user agreement form is needed) | patent full | Japanese | Publication of unexamined patent applications | 697,262 (18,139MB) | 1998-1999 |
| | | patent abstract | Japanese | Patent Abstracts (J-sho) | 1,706,154 (1,883MB) | 1995-1999 |
| | | patent abstract | English | Patent Abstracts of Japan (PAJ) | 1,701,339 (2,711MB) | |
| | NTCIR-4 PATENT <+>, NTCIR-5 PATENT (<+> separate user agreement form is needed) | patent full | Japanese | Publication of unexamined patent applications | 3,496,252 (94.5GB) | 1993-2002 |
| | | patent abstract | English | Patent Abstracts of Japan (PAJ) | 3,496,252 (5,482MB) | |
| | NTCIR-6 | patent full | English | USPTO Patent | | |
| QAC | NTCIR-3 QA, NTCIR-4 QA | news articles | Japanese | Mainichi | 220,078 | 1998-1999 |
| | | | | Yomiuri <*> | 375,980 | |
| | NTCIR-5 QA, NTCIR-6 QA | news articles | Japanese | Mainichi | 199,681 | 2000-2001 |
| | | | | Yomiuri | 658,719 | |

1: For details of the task data (topics and relevance judgments, questions and answers, summaries, etc.), please consult the CFP of each task.

2: In the rows for NTCIR-3 and NTCIR-4, data marked <*> was not included in NTCIR-3 and was used only in NTCIR-4.

3: For data marked <+>, separate user agreement forms for research-purpose use are needed. Please consult NTCIR Data Home.

4: Please note that the document collections shall be used for the purpose of accomplishing the tasks set out in the NTCIR Workshop 6 and for research related to those tasks. The documents cannot be used for "information purposes".