NTCIR (NII Test Collection for IR Systems) Project

The 8th NTCIR Workshop
NTCIR-8 MOAT Evaluation Agreement Forms - New York Times Text and Xinhua Chinese Text

[NTCIR-8 User Agreement Forms]


# This page is available in English only.

1. INTRODUCTION

The NTCIR-8 MOAT Task Participant Test Collection consists of A. Document Data and B. Task Data.

A. Document Data
a.1 Chinese (simplified) Dataset
"Xinhua Chinese Text (2002-2005)" (Simplified Chinese Dataset for Formal Run), also known as "LDC2009E75 NTCIR-8 Xinhua Chinese Data 2002-2005", is available to participants of the NTCIR-8 MOAT Task only for the purposes of the NTCIR Workshop.
Xinhua Chinese Text (2002-2005) is also included in the following LDC corpus:
LDC2007T38: Chinese Gigaword Third Edition, which was released on Aug 17, 2007.
If you already have this corpus, you do not need to obtain the data again.
a.2 English Dataset
"New York Times Text (2002-2005)" (English Dataset for Formal Run), also known as "LDC2009E74 NTCIR-8 New York Times Data 2002-2005", is available to participants of the NTCIR-8 MOAT Task only for the purposes of the NTCIR Workshop.
New York Times Text (2002-2005) is also included in the following LDC corpus:
LDC2007T07: English Gigaword Third Edition, which was released on May 17, 2007.
If you already have this corpus, you do not need to obtain the data again.

The LDC will send the document data to you on DVD-ROM. If you obtain these documents from the LDC, you will be asked to pay US$50 to cover part of the cost of preparing and shipping the data.

B. Task Data

For system training, the following annotated Xinhua texts are available from the LDC to participants of the NTCIR-8 MOAT Task.

  • "LDC2009E76 Xinhua English Tagged Data 1998-2001"
  • "LDC2009E77 Xinhua Chinese Tagged Data 1998-2001"
  • "LDC2006E108 NTCIR Opinion Annotation Pilot Task (Xinhua English Annotated Data 1998-2001)"
This is only a portion of the data for the NTCIR-8 MOAT Task. The rest of the data can be obtained directly from NTCIR after filling out and sending the two forms below.

2. HOW TO OBTAIN THE DATA

A. Sign the License Agreement:
(1) Register to participate in the MOAT task at NTCIR-8
The LDC will grant the license to the registered participants.

(2) Download the LDC's "NTCIR-8 MOAT Evaluation Agreement"

(3) Complete and sign the agreement.

(4) Fax the signed agreement, or scan and email it, to the Linguistic Data Consortium (LDC).

Fax: +1(215) 573-2175
Email: ldc@ldc.upenn.edu
ATTN: Ms Ilya Ahtaridis, Membership Coordinator


B. Making payment

Payment of Corpora Fees can be made in one of three ways:

1. with a check from a bank with branches in the United States
For credit to The Trustees of the University of Pennsylvania.

2. with a wire to:

Wachovia Bank NA
123 South Broad Street
Philadelphia, PA 19109

ABA NO. 031201467

Account No. 2000018692644

SWIFT CODE: PNBPUS33PHL
For credit to The Trustees of the University of Pennsylvania
Attn: Ms. Ilya Ahtaridis, +1(215) 573-1275

3. with Visa or MasterCard
Please provide the following:

1. Type of credit card
2. Credit card number
3. Expiration date
4. Credit card billing address

Please mail checks to the LDC, but note that they should be made payable to 'The Trustees of the University of Pennsylvania'.

For security purposes, please do not provide credit card details by email. It is recommended that you call the LDC at +1(215) 573-1275 or use the LDC's VISA/MasterCard Information Form.

C. The document data will be provided to you by the LDC.
Contacting LDC:
Linguistic Data Consortium
3600 Market Street
Suite 810
Philadelphia, PA, 19104-2653, USA
General Office Telephone: +1(215) 898-0464
Membership Office Telephone: +1(215) 573-1275
Fax: +1(215) 573-2175
Email: ldc@ldc.upenn.edu
ATTN: Ms Ilya Ahtaridis, Membership Coordinator

3. SCOPE OF THE LICENSE

After User's participation in the NTCIR-8 Multilingual Opinion Analysis Task has ended, User agrees to delete the Data from any computer or media onto which it has been copied and to return all discs to the LDC. As an exception, User may continue to use LDC2006E108 NTCIR Opinion Annotation Pilot Task (Xinhua English Annotated Data 1998-2001) for Opinion Analysis research after the task has ended.
User may also keep the data by agreeing to pay the LDC the non-member fee and signing the generic LDC non-member user agreement.

 

For the detailed conditions, please consult the agreement.


4. CONVERSION OF LDC DOCUMENT DATA INTO NTCIR FORMAT

The documents in the obtained corpora shall be converted into the NTCIR standard document format using the scripts nyt2ntc.pl (for New York Times Text) and xin2ntc-new.pl (for Xinhua Chinese Text).



Contact: ntc-admin
2009-07-14