This document conversion script converts the documents included in the
provided Xinhua English News Article Data into the NTCIR standard record
format.
1 To obtain Xinhua English News Article Data
For non-participants, the Xinhua English News Article Data (1998-2001)
for the NTCIR Test Collection is available for research use from the Linguistic Data Consortium (LDC).
Linguistic Data Consortium (LDC): http://www.ldc.upenn.edu/
2 To convert the documents into the NTCIR standard record format
The document records in the purchased corpus must be converted into the
NTCIR standard document format with the scripts xie2ntc.pl and xie2ntc2.pl.
Use the script that covers the years of your data.
Script: xie2ntc2.pl (For Xinhua English 98-99, 00-01)
http://research.nii.ac.jp/ntcir/tools/xie2ntc2.pl_txt
Script: xie2ntc.pl (For Xinhua English 98-99)
http://research.nii.ac.jp/ntcir/permission/ntcir-4/script/xie2ntc.pl_txt
README
http://research.nii.ac.jp/ntcir/permission/ntcir-4/script/READMEforXinhuaScript.txt
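
The script URLs above end in .pl_txt rather than .pl. As a minimal sketch, assuming the files are ordinary Perl sources that only need the suffix restored, the following Python helper downloads both scripts and saves them under .pl names. The function names local_name and fetch_scripts are hypothetical and not part of the NTCIR tools; how to invoke the scripts themselves is described in the README, not here.

```python
from pathlib import Path, PurePosixPath
from typing import List
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URLs copied from the list above.
SCRIPT_URLS = [
    "http://research.nii.ac.jp/ntcir/tools/xie2ntc2.pl_txt",
    "http://research.nii.ac.jp/ntcir/permission/ntcir-4/script/xie2ntc.pl_txt",
]


def local_name(url: str) -> str:
    """Return the last path component of the URL, dropping a trailing '_txt'.

    Assumption: the '_txt' suffix is only there so that the web server
    delivers the script as plain text.
    """
    name = PurePosixPath(urlparse(url).path).name
    if name.endswith("_txt"):
        return name[: -len("_txt")]
    return name


def fetch_scripts(dest_dir: str = ".") -> List[Path]:
    """Download every script in SCRIPT_URLS into dest_dir (hypothetical helper)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in SCRIPT_URLS:
        target = dest / local_name(url)
        urlretrieve(url, str(target))  # requires network access
        saved.append(target)
    return saved
```

Calling fetch_scripts("ntcir_scripts") would place xie2ntc2.pl and xie2ntc.pl in that directory, provided the URLs are still reachable. Read the README before running either script on the corpus.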