# This page is available in English only.
1. INTRODUCTION
"New York Times Text (2002-2005)" and "Xinhua English Text (1998-2001)" (the English dataset for the Formal Run), or, by their LDC catalog names, "LDC2009E74 NTCIR-9 New York Times Data 2002-2005" and "LDC2006E106: NTCIR Opinion Annotation Pilot Task (Xinhua Text)", are available to participants of the NTCIR-9 GeoTime Task solely for the purposes of the NTCIR Workshop.
The LDC will send the document data to you on a DVD-ROM. If you obtain
these documents from the LDC, you will be asked to pay US$50 to cover
part of the cost of preparing and shipping the data.
New York Times Text (2002-2005) is also included in the following LDC corpus:
LDC2007T07: English Gigaword Third Edition, which was released on May 17, 2007.
Xinhua English Text (1998-2001) is also included in each of the following LDC corpora:
LDC2003T05: English Gigaword First Edition, which was released on June 28, 2003.
LDC2005T12: English Gigaword Second Edition, which was released on July 15, 2005.
LDC2007T07: English Gigaword Third Edition, which was released on May 17, 2007.
LDC2009T13: English Gigaword Fourth Edition, which was released on May 22, 2009.
If you already have one of the above, you do not need to obtain the corpus again.
This is only a portion of the data for the NTCIR-9 GeoTime Task. The rest
of the data can be obtained directly from NTCIR after filling out and sending
the two forms below.
2. HOW TO OBTAIN THE DATA
A. Sign the License Agreement:
(1) Register to participate in the GeoTime task at NTCIR-9.
The LDC will grant the license to registered participants.
(2) Download the LDC's "NTCIR-9 GeoTime Evaluation Agreement"
(3) Complete and sign the agreement.
(4) Fax, or scan and email, the signed agreement to the Linguistic Data Consortium
(LDC).
- Fax:+1(215)573-2175
- Email: ldc
- ATTN: Ms Ilya Ahtaridis, Membership Coordinator
B. Making payment
Payment can be made in one of three ways:
1. with a check from a bank with branches in the United States
For credit to The Trustees of the University of Pennsylvania.
2. with a wire to:
- Wachovia Bank NA
123 South Broad Street
Philadelphia, PA 19109
ABA NO. 031201467
Account No. 2000018692644
SWIFT CODE: PNBPUS33PHL
- For credit to The Trustees of the University of Pennsylvania
- Attn: Ms. Ilya Ahtaridis, +1(215) 573-1275
3. with Visa or MasterCard
Please provide the following:
- 1. Type of credit card
- 2. Credit card number
- 3. Expiration date
- 4. Credit card billing address
Please mail checks to the LDC, but note that they should be made out for
credit to 'The Trustees of the University of Pennsylvania'.
For security purposes, please do not provide credit card details by email.
We recommend that you call the LDC at +1(215) 573-1275 or use the LDC's
VISA/MasterCard Information Form.
C. The document data will be provided to you by the LDC.
- Contacting LDC:
- Linguistic Data Consortium
- 3600 Market Street
- Suite 810
- Philadelphia, PA, 19104-2653, USA
- General Office Telephone:+1(215)898-0464
- Membership Office Telephone: +1(215) 573-1275
- Fax:+1(215)573-2175
- Email: ldc
- ATTN: Ms Ilya Ahtaridis, Membership Coordinator
3. SCOPE OF THE LICENSE
This license is valid only until September 30, 2012, at which time the User
agrees to delete the Data and any files and software derived from it from
any computer or media onto which it has been copied, and to return all media
to the LDC. The User may keep the data by agreeing to pay the LDC the non-member fee and
signing the generic LDC non-member user agreement.
For the detailed conditions, please consult the agreement.
4. CONVERSION OF LDC DOCUMENT DATA INTO NTCIR FORMAT
The documents in the obtained corpus must be converted into the NTCIR
standard document format using the scripts nyt2ntc.pl and xie2ntc.pl.