


xin2ntc.pl

This document conversion script converts the documents in the provided Xinhua Chinese News Article Data into the NTCIR standard document format.

1 To obtain Xinhua Chinese News Article Data

Non-participants can obtain the Xinhua Chinese News Article Data (1998-2001) for the NTCIR Test Collection, for research purposes, from the Linguistic Data Consortium (LDC).

Linguistic Data Consortium (LDC): http://www.ldc.upenn.edu/

Chinese Gigaword (includes the Xinhua Chinese News Article Data 1998-2001 for NTCIR-7 ACLIA and MOAT):

2 To convert the documents into the NTCIR standard document format

Convert the documents in the obtained corpus into the NTCIR standard document format with the script xin2ntc.pl.
Script and README:
http://aclia.lti.cs.cmu.edu/wiki/TaskDefinition?action=AttachFile&do=view&target=xin2ntc.pl
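
To give a rough idea of the kind of transformation a conversion like this performs, here is a minimal Python sketch. It is not xin2ntc.pl and does not reproduce its behavior. Everything specific in it is an assumption made for illustration: the input layout (a DOC element with an id attribute, a HEADLINE, and P paragraphs inside TEXT), the output tags (DOCNO, DATE, HEADLINE, TEXT), the sample document id, and the function names convert_doc and convert_text. The README linked above is the authority on the actual input and output formats.

```python
import re

# Hypothetical sample input; the layout and the id are assumptions, not real data.
SAMPLE = """<DOC id="XIN_CMN_19980101.0001" type="story">
<HEADLINE>
Sample headline
</HEADLINE>
<TEXT>
<P>
First paragraph.
</P>
<P>
Second paragraph.
</P>
</TEXT>
</DOC>
"""

DOC_RE = re.compile(r'<DOC id="([^"]+)"[^>]*>(.*?)</DOC>', re.S)
HEADLINE_RE = re.compile(r"<HEADLINE>(.*?)</HEADLINE>", re.S)
PARA_RE = re.compile(r"<P>(.*?)</P>", re.S)
DATE_RE = re.compile(r"(\d{4})(\d{2})(\d{2})")


def convert_doc(doc_id, body):
    """Rewrite one document using the assumed output tags."""
    headline = HEADLINE_RE.search(body)
    title = headline.group(1).strip() if headline else ""
    # Assumption: the first 8-digit run in the id encodes the date as YYYYMMDD.
    date = DATE_RE.search(doc_id)
    date_str = "-".join(date.groups()) if date else ""
    lines = [
        "<DOC>",
        f"<DOCNO>{doc_id}</DOCNO>",
        f"<DATE>{date_str}</DATE>",
        f"<HEADLINE>{title}</HEADLINE>",
        "<TEXT>",
    ]
    lines += [f"<P>{p.strip()}</P>" for p in PARA_RE.findall(body)]
    lines += ["</TEXT>", "</DOC>"]
    return "\n".join(lines)


def convert_text(text):
    """Convert every DOC element found in the input text."""
    return "\n".join(convert_doc(i, b) for i, b in DOC_RE.findall(text))


if __name__ == "__main__":
    print(convert_text(SAMPLE))
```

The sketch relies on regular expressions rather than an XML parser because corpus files of this kind are often not well-formed XML; that choice is also an assumption about the data, not a statement about how xin2ntc.pl works.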