


xie2ntc.pl

This document conversion script converts the documents in the provided Xinhua English News Article Data into the NTCIR standard record format.

1 To obtain Xinhua English News Article Data

Non-participants can obtain the Xinhua English News Article Data (1998-2001) used in the NTCIR Test Collection, for research purposes, from the Linguistic Data Consortium (LDC).

Linguistic Data Consortium (LDC): http://www.ldc.upenn.edu/

2 To convert the documents into the NTCIR standard record format

The document records in the purchased corpus must be converted into the NTCIR standard document format using xie2ntc.pl or xie2ntc2.pl, listed below with the data years each script covers.


Script: xie2ntc2.pl (For Xinhua English 98-99, 00-01)
http://research.nii.ac.jp/ntcir/tools/xie2ntc2.pl_txt
Script: xie2ntc.pl (For Xinhua English 98-99)
http://research.nii.ac.jp/ntcir/permission/ntcir-4/script/xie2ntc.pl_txt

README
http://research.nii.ac.jp/ntcir/permission/ntcir-4/script/READMEforXinhuaScript.txt
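
As a rough illustration of the kind of record conversion described in section 2, the sketch below maps one record to a tagged output record. It is not the logic of xie2ntc.pl or xie2ntc2.pl. The input layout (a `<DOC id="...">` element with `<HEADLINE>`, `<TEXT>` and `<P>` tags), the output tag names (`<DOCNO>`, `<LANG>`, `<HEADLINE>`, `<DATE>`, `<TEXT>`), the date format, and the helper names `field` and `convert` are all assumptions made for this example. The README and the scripts themselves remain the reference for the actual formats.

```python
import re

# Hypothetical input layout: an SGML-like record whose id carries the date.
# The real corpus markup may differ; check the README and the Perl scripts.
SAMPLE = """<DOC id="XIE19980101.0001">
<HEADLINE>
Sample headline
</HEADLINE>
<TEXT>
<P>
First paragraph.
</P>
<P>
Second paragraph.
</P>
</TEXT>
</DOC>"""


def field(record, tag):
    """Return the stripped content of <tag>...</tag>, or '' if absent."""
    m = re.search(r"<%s>(.*?)</%s>" % (tag, tag), record, re.DOTALL)
    return m.group(1).strip() if m else ""


def convert(record):
    """Map one assumed source record to an assumed NTCIR-style record."""
    m = re.search(r'<DOC id="([A-Z]+)(\d{4})(\d{2})(\d{2})\.(\d+)"', record)
    if m is None:
        raise ValueError("record has no recognizable DOC id")
    source, year, month, day, serial = m.groups()
    # Drop paragraph tags and collapse whitespace into single spaces.
    body = re.sub(r"</?P>", " ", field(record, "TEXT"))
    body = " ".join(body.split())
    lines = [
        "<DOC>",
        "<DOCNO>%s%s%s%s.%s</DOCNO>" % (source, year, month, day, serial),
        "<LANG>EN</LANG>",  # assumed language tag and value
        "<HEADLINE>%s</HEADLINE>" % field(record, "HEADLINE"),
        "<DATE>%s-%s-%s</DATE>" % (year, month, day),  # assumed date format
        "<TEXT>%s</TEXT>" % body,
        "</DOC>",
    ]
    return "\n".join(lines)


print(convert(SAMPLE))
```

For actual conversion of the purchased corpus, use the Perl scripts listed above; this sketch only shows the general shape of a field-by-field mapping.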