The task's goal is the classification of research papers written in
either Japanese or English in terms of the International Patent
Classification (IPC) system, which is a global standard hierarchical
patent classification system. This test collection is intended to
evaluate the following four different subtasks.
Collection | Task | Genre | Filename | Lang. | Year | # of docs | Size | Topic lang. | Topic # | Relevance judge
NTCIR-7 PATMN | MINING | patent full-text | Publication of unexamined patent applications | J | 1993-2002 | 3,496,252 | 94.5GB | J | Japanese/ | 2
NTCIR-7 PATMN | MINING | sci. abstract | ntc1-je | JE | 1988-1997 | 339,483 | 577MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc1-j | J | 1988-1997 | 332,918 | 312MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc1-e | E | 1988-1997 | 187,080 | 218MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc2-j | J | 1986-1999 | 400,248 | 600MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc2-e | E | 1986-1999 | 134,978 | 200MB | | |
NTCIR-7 PATMN | MINING | patent abstract | Patent Abstracts of Japan (paj) | E | 1993-2002 | 3,496,252 | 5,482MB | E | English/ | 2
NTCIR-7 PATMN | MINING | patent full-text | Patent grant data published from USPTO | E | 1993-2002 | 1,315,470 | 52.6GB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc1-je | JE | 1988-1997 | 339,483 | 577MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc1-j | J | 1988-1997 | 332,918 | 312MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc1-e | E | 1988-1997 | 187,080 | 218MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc2-j | J | 1986-1999 | 400,248 | 600MB | | |
NTCIR-7 PATMN | MINING | sci. abstract | ntc2-e | E | 1986-1999 | 134,978 | 200MB | | |
* The entire collection is provided by NII for research purposes.
Publication of unexamined patent applications | By sending DVD-ROMs (NTCIR-4 PATENT and NTCIR-5 PATENT), or transferring the data files electronically. NTCIR-4 PATENT: unexamined Japanese patent applications published in 1993-1997. NTCIR-5 PATENT: unexamined Japanese patent applications published in 1998-2002.
ntc1-je, ntc1-j, ntc1-e | By sending CD-ROM: NTCIR-1 Test Collection
ntc2-j, ntc2-e | By sending CD-ROM: NTCIR-2 Test Collection
Patent Abstracts of Japan (paj) | By sending DVD-ROM (NTCIR-4/5 PATENT), or transferring the data files electronically. NTCIR-4/5 PATENT: Patent Abstracts of Japan published in 1993-2002.
Patent grant data published from USPTO | By sending DVD-ROMs (NTCIR-6 PATENT), or transferring the data files electronically. NTCIR-6 PATENT: patent grant data published from USPTO in 1993-2002.
ntc1-je, ntc1-j, ntc1-e | By sending CD-ROM: NTCIR-1 Test Collection
ntc2-j, ntc2-e | By sending CD-ROM: NTCIR-2 Test Collection
Patent Abstracts of Japan 1993-2002
The Patent Abstracts of Japan (PAJ) are translations of the JAPIO
Patent Abstracts, which are edited manually on the basis of summaries
in source applications.
USPTO patent grant data 1993-2002
This document set consists of patent grant data published in 1993-2002 by the U.S. Patent & Trademark Office (USPTO).
NTCIR-1 CLIR task test collection 1988-1997
This document set consists of author abstracts of papers presented at academic conferences hosted by any of 65 academic societies in 1988-1997.
NTCIR-2 CLIR task test collection 1986-1999
This document set consists of additional author abstracts from the
academic conference paper database for 1997-1999, and Grant Reports
from 1988-1997.
(1) Japanese Subtask / Cross-lingual Subtask (J2E)
Search Topics
Each search topic is the title and abstract of a research paper written in Japanese,
and the total number of search topics is 978.
Relevance judgment
The 978 topics are divided into two groups: group A, in which highly
relevant IPC codes are assigned to 525 topics, and group B, in which
relevant IPC codes are assigned to 451 topics.
(2) English Subtask / Cross-lingual Subtask (E2J)
Search Topics
Each search topic is the title and abstract of a research paper written in English,
and the total number of search topics is 978.
Relevance judgment
The 978 topics are divided into two groups: group A, in which highly
relevant IPC codes are assigned to 525 topics, and group B, in which
relevant IPC codes are assigned to 451 topics.
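Because the relevance judgments are expressed as IPC codes and the IPC is hierarchical, it can be useful to split a code into its levels (section, class, subclass, main group, subgroup) and to measure how many leading levels two codes share. The sketch below is only an illustration and is not part of the test collection: the names `IPCCode`, `parse_ipc`, and `shared_depth` are hypothetical, and it assumes codes are written in the common form `G06F 17/30`, which may differ from the notation used in the distributed files.

```python
import re
from typing import NamedTuple, Optional


class IPCCode(NamedTuple):
    """One IPC symbol split into its hierarchy levels (hypothetical helper type)."""
    section: str               # one letter, A-H
    ipc_class: str             # two digits
    subclass: str              # one letter
    main_group: Optional[str]  # digits before the slash, if present
    subgroup: Optional[str]    # digits after the slash, if present


# Assumed notation: "G06F 17/30"; the group part is optional ("G06F").
_IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])(?:\s*(\d{1,4})(?:/(\d{1,6}))?)?$")


def parse_ipc(code: str) -> IPCCode:
    """Split an IPC symbol into its levels; raise ValueError if it does not match."""
    match = _IPC_RE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised IPC symbol: {code!r}")
    return IPCCode(*match.groups())


def shared_depth(a: str, b: str) -> int:
    """Count how many leading hierarchy levels two IPC symbols have in common."""
    depth = 0
    for x, y in zip(parse_ipc(a), parse_ipc(b)):
        if x is None or x != y:
            break
        depth += 1
    return depth
```

For example, `shared_depth("G06F 17/30", "G06F 17/27")` compares two codes that agree on section, class, subclass, and main group, while codes from different sections share no levels at all.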
The following are the procedures for obtaining the test collection. The test collection and data available from NII are free of charge.
NTCIR Project (Rm.1309)
National Institute of Informatics
2-1-2 Hitotsubashi Chiyoda-ku, Tokyo
102-8430, JAPAN
PHONE: +81-3-4212-2750
FAX: +81-3-4212-2751
Email: ntc-secretariat
The test collection was constructed and used for NTCIR. It may be used
for research purposes only.
The document collections included in the test collection were provided
to NII for use in NTCIR, either free of charge or for a fee. The providers
of the document data kindly understand the importance of the test collection
to research on information access technologies and have granted use of
the data for research purposes. Please remember that the document
data in the NTCIR test collection is copyrighted and has commercial value
as data. To maintain a reliable and good relationship with the data
producers and providers, it is important that we researchers behave as
reliable partners, use the data only for research purposes under the
user agreement, and handle it carefully so as not to violate any rights
in it.