# This page is available in English only.
1. INTRODUCTION
"Xinhua English Text (1998-2001)" (English Dataset for Formal Run), also known as "LDC2006E106: NTCIR Opinion Annotation Pilot Task (Xinhua Text)", is available to participants of the NTCIR-9 VisEx Task solely for
the purposes of the NTCIR Workshop.
The LDC will provide the document data for download from an internet server.
Xinhua English Text (1998-2001) is also included in each of the following
LDC corpora:
- LDC2003T05: English Gigaword First Edition, released on June 28, 2003.
- LDC2005T12: English Gigaword Second Edition, released on July 15, 2005.
- LDC2007T07: English Gigaword Third Edition, released on May 17, 2007.
- LDC2009T13: English Gigaword Fourth Edition, released on May 22, 2009.
If you already have one of these four corpora, you do not need to obtain the
corpus again.
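If you work from an existing Gigaword copy, only the Xinhua files dated 1998 to 2001 are relevant. The following is a minimal sketch of how such files might be picked out by name. It assumes that file names start with a Xinhua prefix ("xin" or "xie") and contain a six-digit YYYYMM stamp, as in "xin_eng_199801.gz"; both the prefixes and the stamp layout are assumptions, not something stated by NTCIR or the LDC, so check them against the edition you hold. The function names are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumption: the first six-digit run in a file name is a YYYYMM stamp.
YEAR_MONTH = re.compile(r"(\d{4})(\d{2})")

# Assumption: Xinhua files start with one of these prefixes.
XINHUA_PREFIXES = ("xin", "xie")


def in_task_range(filename, first_year=1998, last_year=2001):
    """Return True if the name looks like a Xinhua file dated 1998-2001."""
    name = PurePosixPath(filename).name.lower()
    if not name.startswith(XINHUA_PREFIXES):
        return False
    match = YEAR_MONTH.search(name)
    if match is None:
        return False
    year, month = int(match.group(1)), int(match.group(2))
    return first_year <= year <= last_year and 1 <= month <= 12


def select_files(filenames):
    """Keep only the file names that fall inside the task's date range."""
    return sorted(name for name in filenames if in_task_range(name))
```

For example, `select_files(["xin_eng_199712.gz", "xin_eng_199801.gz"])` keeps only the second name, because the first is stamped December 1997.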
This is only a portion of the data for the NTCIR-9 VisEx Task. The rest
of the data can be obtained directly from NTCIR after filling out and sending
the two forms below.
2. HOW TO OBTAIN THE DATA
- (1) Register to participate in the VisEx task at NTCIR-9
- The LDC will grant the license to the registered participants.
- (2) Download the LDC's "NTCIR-9 VisEX Evaluation Agreement"
- (3) Complete and sign the agreement.
- (4) Fax, or scan and email, the signed agreement to the Linguistic Data Consortium (LDC):
- Fax:+1(215)573-2175
- Email: ldc
- ATTN: Ms Ilya Ahtaridis, Membership Coordinator
- (5) The document data will be provided to you by the LDC.
- Contacting LDC:
- Linguistic Data Consortium
- 3600 Market Street
- Suite 810
- Philadelphia, PA, 19104-2653, USA
- General Office Telephone:+1(215)898-0464
- Membership Office Telephone: +1(215) 573-1275
- Fax:+1(215)573-2175
- Email: ldc
- ATTN: Ms Ilya Ahtaridis, Membership Coordinator
3. SCOPE OF THE LICENSE
This license is valid only until September 30, 2012, at which time User
agrees to delete the Data and any files and software derived from it from
any computer or media onto which it has been copied, and to return all media
to the LDC. User may keep the Data by agreeing to pay the LDC the non-member fee and
signing the generic LDC non-member user agreement.
For the detailed conditions, please consult the agreement.
4. CONVERSION OF LDC DOCUMENT DATA INTO NTCIR FORMAT
The documents in the obtained corpus must be converted into the NTCIR
standard document format using the script xie2ntc.pl.