NTCIR-13 QA Lab-3

Phase-1 formal run in progress

Registration is open until Monday, May 1, 2017 (SST)

Contents

Task

Abstract

The goal of the third QA Lab (Question Answering Lab for Entrance Exam) task at NTCIR-13 is to investigate real-world complex Question Answering (QA) technologies, together with appropriate evaluation metrics and methodologies for them, as a joint effort of the participants.

Based on the lessons learned from NTCIR-11 and -12, the major challenges include

  1. essay questions that require logical summaries along a historical theme
  2. competition with more than 3,500 student examinees from all over Japan (JA only)
  3. questions with context
  4. answering in text, as query-biased summarization with a high compression ratio
  5. advanced entity-focused passage retrieval
  6. enhanced knowledge resources
  7. semantic representation and sophisticated learning
  8. appropriate evaluation measures for essays
  9. research run using the past QA Lab data/systems

The research run investigates how much QA technologies have improved since QA Lab-1.

To tackle them, we propose to

  1. enhance the ontology of question format types as a joint effort,
  2. define enhanced answer types,
  3. evaluate end-to-end runs as well as vertical investigation runs organized by question format type, answer type, and knowledge needed, rather than the horizontal integration planned in NTCIR-11,
  4. collect and share more knowledge resources (e.g. dictionaries, chronological tables of historical events, gazetteers, biographical dictionaries) and a baseline annotated corpus.

Japan’s university entrance examination is selected here, but in theory the framework is applicable to other domains. Participation limited to certain question types or certain modules is possible.

Motivation

The architectures of advanced information access (IA) systems, including QA, are complex. In NTCIR-11, we proposed a horizontal module-based pipeline consisting of a “question format type classifier”, an “answer type classifier”, “retrieval”, “information extraction”, and “answer generation”. To enhance module-based collaboration, we provided two open-source UIMA module-based end-to-end QA systems for Japanese and English, and one open-source passage retrieval system. However, participants in NTCIR-11 used many different types of pipelines depending on the question format types, answer types, and knowledge sources used. Simple horizontal component-based integration worked well in past NTCIR QA tasks with a fixed format, but not for highly complex real-world questions such as exams. Questions often consist of multiple sentences and come with text explaining the context. Retrieving passages that contain candidate answers is also challenging: simple bag-of-words retrieval often fails if it ignores entity types (especially time and place) and low-frequency but critical entities. Answering in short text is too demanding for existing complex QA systems and needs a breakthrough using advanced summarization, argument structure analysis, and/or text generation. We would like to organize a research forum to tackle these problems using the same infrastructure, and to investigate the vertical integration of runs based on the estimated merit for each question type or any other factor.

Methodology

The characteristics of the methodology can be summarized as follows:

  1. Open Advancement: We encourage each participant to work toward their own purpose(s): on an end-to-end system, on particular question types and/or component(s) of either the provided QA platform or their own system, or on building any resources or tools usable to improve QA systems for entrance exams.
  2. Evaluating continuous progress and enhancing the knowledge resources: The organizers periodically run all the components contributed by participants to track progress.
  3. Vertical type evaluation: Evaluation is conducted per question type to see how each module behaves in the end-to-end systems. This is crucial for complex IA systems like QA.
  4. Forum: We place emphasis on building a community by bridging different communities.

Expected results: Improved (semi-)automated evaluation methods for essay questions. A new approach to IA evaluation: rather than only evaluating each system as a whole, participants can take part with a single module and so contribute to open collaboration among participants. Improved question answering technologies and accumulated experience in applying them to specific domains. Technologies developed for essay questions may be applicable to domains other than QA, such as summarization.

How can challenges identified in past rounds be addressed in this round? We found the following problems in the NTCIR-12 QA Lab-2 task. Only 3 teams participated in essay questions, compared with 12 teams for multiple-choice questions. The best score for essay questions was 9/26 in the human experts’ evaluation, a lower rate than the 76/100 for multiple-choice questions. ROUGE and pyramid method scores, which were used as evaluation measures for essay questions, do not always agree with human experts’ marks, although there was a weak to moderate positive correlation.

The causes are:

  1. End-to-End QA for essay questions is too complex and laborious to tackle.
  2. The evaluation measures for essay questions are so obscure that participants cannot improve their systems.

Therefore, we address the problems with essay questions by setting up the following new tasks. The Extraction task and the Summarization task are subtasks obtained by dividing the End-to-End task into two parts, which makes participation easier. The Evaluation method task is intended for discussing appropriate measures for evaluating answers to essay questions.

Task design

We design the following tasks.

For multiple-choice questions

For term questions

For essay questions

In addition to the above-mentioned tasks, we plan a competitive run like the QA Lab-2 Phase-2 run, in collaboration with the Todai Robot Project. The competitive run includes essay and term questions but no multiple-choice questions.

Evaluation

For non-essay questions, we plan to evaluate outputs in the same manner as in QA Lab-2. For essay questions, we will discuss how to evaluate essays via our mailing list, round table meetings, the Evaluation method task, and so on. At present, we are considering the following evaluations:

For Extraction task

Precision and recall of passages that include statements in the gold standard essay

For Summarization task / End-to-End task

Human experts’ marks, ROUGE method and Pyramid method

For Evaluation method task

Agreement with ranking or marks by human experts

Sample Questions

Table 1: Sample Questions
Question                  English    Japanese
Multiple-choice question  HTML, XML  HTML, XML
Term question             HTML, XML  HTML, XML
Essay question            HTML, XML  HTML, XML

Submission

Format

Nugget XML Specification

Filename

file name: initial part of question ID

Node and Attribute Definition

doc: root

exam: question
 ・id: question ID
 ・type: complex essay or simple essay
 ・extra: extraction from question

answer: gold standard
 ・annotator: writer (1,2,3)
 ・value: content of the gold standard

proposition: proposition nugget
 ・id: serial number
 ・value: content of the proposition nugget

evaluation: evaluation
 ・evaluator: evaluator (1,2,3)
 ・value: score of evaluation

Structure

doc
  └ exam
      └ answer
          └ proposition
              ├ evaluation
              ├ evaluation
              └ evaluation

doc(1):exam(+)
exam(1):answer(3)
answer(1):proposition(+)
proposition(1):evaluation(3)

+:1 or more, *:0 or more

Example
<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <exam id="D792W10_[1]" type="complex essay" extra="We are currently living in an era of information revolution,">
    <answer annotator="1" value="In the 19th and 20th centuries a transportation and communication revolution occurred with the development of rail networks, the invention of the steamship, the use of Morse code telegraphs, and the invention of the Marconi wireless sets. These technologies were used in Western advances into Asia and Africa. For example, the opening of the Suez Canal greatly reduced travel time between Europe and Asia, but the British controlled the canal to maintain a route to India. With regard to African policy, Britain supported Cecil Rhodes, who advocated a plan for connecting Cape Town to Cairo via rail and telegraph. Germany implemented a 3B policy, which included the construction of the Baghdad Railway, to advance into the Middle East. The western powers were able to quickly suppress the Boxer Rebellion in the late Qing thanks to rapid information communication. Russia aimed to advance into the far East through, for example, the Siberian Railroad, but the Japanese victory in the Russo-Japanese War was quickly relayed to the world via information networks, creating a rise in ethnic consciousness around the world, influencing, for example, the Young Turks movement and the Iranian Constitutional Revolution. The development of transportation methods also led to an increase in the number of people coming from non-Western countries to study in the West. Gandhi, leader of the Indian independence movement, was one example of this, his career founded on his work as a lawyer in South Africa after studying in the U.K." length="243">
        <proposition id="1" value="During the 19th and 20th centuries, railroad networks developed.">
          <evaluation evaluator="1" value="2"/>
          <evaluation evaluator="2" value="1"/>
          <evaluation evaluator="3" value="1"/>
        </proposition>
        <proposition id="2" value="The steamship was invented in the 19th century.">
          <evaluation evaluator="1" value="3"/>
          <evaluation evaluator="2" value="3"/>
          <evaluation evaluator="3" value="1"/>
        </proposition>
        and so on.
    </answer>
    <answer annotator="2" value="The opening of the Suez Canal dramatically reduced sea travel time to Asia, and the appearance of steam-powered steamships made advances from European countries to Asia even easier, greatly contributing to trade and colonialization. The invention of Morse code and the Marconi wireless set in the 19th century greatly improved the speed with which information could be communicated, making global-scale colony management possible. The construction of railroads was also important not only for its economy but also as a method of colony management. Germany, which sought to expand its power in the Middle East, acquired the rights to lay the Baghdad Railway from the Ottoman Empire. However, the improvement of communications technologies and increased mobility also had a major impact on nationalist movements in Asia and Africa. The Russo-Japanese War was triggered by problems regarding the withdrawal of troops after the Boxer Rebellion, and Japan’s victory was instantly communicated around the world, stimulating nationalist movements in Asia and Africa. Later, in Iran, people who gained information via media sources such as newspapers started the Iranian Constitutional Revolution, and in the Ottoman Empire the Young Turk Revolution occurred. The greater ease of mobility led to more people studying abroad in the West, many of which then participated in their own nationalist movements. Having studied in the U.K." length="223">
        <proposition id="1" value="The Suez Canal was opened in Egypt in the 19th century.">
          <evaluation evaluator="1" value="1"/>
          <evaluation evaluator="2" value="3"/>
          <evaluation evaluator="3" value="3"/>
        </proposition>
        <proposition id="2" value="The opening of the Suez Canal helped shorten Asian sea routes.">
          <evaluation evaluator="1" value="1"/>
          <evaluation evaluator="2" value="2"/>
          <evaluation evaluator="3" value="3"/>
        </proposition>
        and so on.
    </answer>
    <answer annotator="3" value="From the mid-19th century to the early 20th century communications and transportation methods advanced dramatically. Communications advances included the development of Morse code and wireless transmission by Marconi. Transportation advances included the opening of the Suez Canal in 1869, which dramatically reduced the time it took steamships to reach Asia, versus the Cape of Good Hope route, supporting Britain’s 3C policy. The Baghdad Railway, which Germany secured the rights to build in 1899, connected Berlin, Byzantium, and Baghdad, leading to the Persian Gulf and further supporting its 3B policy. These improvements in access between countries and their colonies led to stronger administration by European countries of their colonies in Asia and Africa, but at the same time to greater ethnic consciousness in Asia. For example, the Chinese Boxer Rebellion of 1900, the Iranian Constitutional Revolution of 1905 to 1911, and the passive resistance-based anti-British movement led by Gandhi from 1919 were all anti-imperialist movements. The Russo-Japanese War of 1904 was also welcomed as a victory of people of color over whites." length="171">
        <proposition id="1" value="From the mid-19th century to the early 20th century, communications and transportation methods advanced dramatically.">
          <evaluation evaluator="1" value="2"/>
          <evaluation evaluator="2" value="1"/>
          <evaluation evaluator="3" value="1"/>
        </proposition>
        <proposition id="2" value="Advances in communications methods from the mid-19th century to the early 20th century included the invention of Morse code and Marconi’s invention of wireless transmission.">
          <evaluation evaluator="1" value="3"/>
          <evaluation evaluator="2" value="3"/>
          <evaluation evaluator="3" value="3"/>
        </proposition>
        and so on.
    </answer>
  </exam>
  <exam id="D792W10_[2]_(A)_Question (5)" type="simple essay" extra="The Italian Wars were waged for over half a century during the Renaissance period. Describe, in no more than 30 English words, the political situation in Italy that gave rise to these conflicts.">
    <answer annotator="1" value="Italy at the time was divided into city-states, papal states, and feudal states, and France and the Holy Roman Empire became involved, vying for hegemony over Italy." length="27">
        <proposition id="1" value="Renaissance-era Italy was divided into city-states, papal states, and feudal states.">
          <evaluation evaluator="1" value="3"/>
          <evaluation evaluator="2" value="3"/>
          <evaluation evaluator="3" value="3"/>
        </proposition>
        <proposition id="2" value="France and the Holy Roman Empire vied for hegemony over Italy.">
          <evaluation evaluator="1" value="2"/>
          <evaluation evaluator="2" value="3"/>
          <evaluation evaluator="3" value="3"/>
        </proposition>
        and so on.
    </answer>
  </exam>
  <exam id="D792W10_[2]_(B)_Question (7)" type="simple essay" extra="Confucianism is an idea that emerged in ancient China and has been widely studied in East Asia.">
    <answer annotator="1" value="The Cheng-Zhu school made successful during the Song by Chu Hsi later became orthodox Confucianism." length="15">
        <proposition id="1" value="During the Song era Chu Hsi made the Cheng-Zhu school a success.">
          <evaluation evaluator="1" value="3"/>
          <evaluation evaluator="2" value="3"/>
          <evaluation evaluator="3" value="3"/>
        </proposition>
        <proposition id="2" value="The Cheng-Zhu school later became the orthodox school of Confucianism.">
          <evaluation evaluator="1" value="2"/>
          <evaluation evaluator="2" value="1"/>
          <evaluation evaluator="3" value="3"/>
        </proposition>
    </answer>
  </exam>
</doc>
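
As a rough illustration of how the nugget structure above might be consumed, the following Python sketch parses a nugget file with the standard library and averages the evaluator scores of each proposition. The function name `mean_nugget_scores` and the small inline sample are assumptions made for this illustration; only the element and attribute names come from the specification.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Small inline sample following doc > exam > answer > proposition > evaluation.
# The id and text values here are made up for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <exam id="X_[1]" type="simple essay" extra="sample question text">
    <answer annotator="1" value="sample gold standard" length="3">
      <proposition id="1" value="first nugget">
        <evaluation evaluator="1" value="2"/>
        <evaluation evaluator="2" value="1"/>
        <evaluation evaluator="3" value="3"/>
      </proposition>
    </answer>
  </exam>
</doc>"""


def mean_nugget_scores(xml_bytes):
    """Return {(exam id, annotator, proposition id): mean evaluation score}."""
    root = ET.fromstring(xml_bytes)
    scores = {}
    for exam in root.findall("exam"):
        for answer in exam.findall("answer"):
            for prop in answer.findall("proposition"):
                values = [int(e.get("value")) for e in prop.findall("evaluation")]
                if values:  # skip propositions that have no evaluation yet
                    key = (exam.get("id"), answer.get("annotator"), prop.get("id"))
                    scores[key] = mean(values)
    return scores


nugget_scores = mean_nugget_scores(SAMPLE.encode("utf-8"))
```

Averaging is only one possible way to combine the three evaluators; the specification itself does not say how the scores are aggregated.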

Submission Format Specification

QA Lab-3 organizers would like to receive results and system descriptions through Bitbucket.

Each participating team must:

  1. create a private Bitbucket repository for submission,
  2. grant read permission to the organizer's account (https://bitbucket.org/ntcirqalab/),
  3. upload the team's results and system descriptions to the repository.

The file structure of the submission repository is as follows.

    /-|
      |-phase1/
      |   |-mutiple_choice/
      |   |-term/
      |   |-essay/
      |      |-e2e/
      |      |    |-run1/
      |      |    |    |-[RUN_ID](.xml) (result)
      |      |    |    |-README.md or README.textile (system description)
      |      |    |
      |      |    |-run2/
      |      |    |-run3/
      |      |-extraction/
      |      |-summarization/
      |      |-evaluation_method/
      |
      |-phase2/
      |   |- ...
      |
      |-research_run/
          |- ...

RUN_ID for Essay and Term questions

[Topics' (Questions') File Name without the Extension (.xml)]_[Team ID]_[Run Type]_[Priority].xml

Run Type codes are as follows:

Priority Parameter:

The "Priority" is a two-digit number representing the priority of the run, with 01 as the highest. It is used as a parameter for pooling and determines the priority with which runs are evaluated and analyzed. The number of runs included in the evaluation may vary according to the total number of submissions.

ex1)

Topics' (Questions') File Name:
"qalab3-ja-phase1-answersheet-essay.xml"
TEAM:
QALabOrganizers
LANGUAGE:
Japanese
PHASE:
Phase-1
TASK:
essay question's end-to-end task
# of SYSTEMS:
3
The run IDs of the three systems become:

"qalab3-ja-phase1-answersheet-essay_QALabOrganizers_e2e_01",
"qalab3-ja-phase1-answersheet-essay_QALabOrganizers_e2e_02" and
"qalab3-ja-phase1-answersheet-essay_QALabOrganizers_e2e_03".

ex2)

Topics' (Questions') File Name:
"qalab3-en-phase1-answersheet-essay.xml"
TEAM:
QALabOrganizers
LANGUAGE:
English
PHASE:
Phase-1
TASK:
essay question's evaluation method task
# of SYSTEMS:
2
The run IDs of the two systems become:

"qalab3-en-phase1-answersheet-essay_QALabOrganizers_evaluation-method_01" and
"qalab3-en-phase1-answersheet-essay_QALabOrganizers_evaluation-method_02".

ex3)

Topics' (Questions') File Name:
"qalab3-ja-phase2-answersheet-term.xml"
TEAM:
QALabOrganizers
LANGUAGE:
Japanese
PHASE:
Phase-2
TASK:
term question's end-to-end task
# of SYSTEMS:
3
The run IDs of the three systems become:

"qalab3-ja-phase2-answersheet-term_QALabOrganizers_term_01",
"qalab3-ja-phase2-answersheet-term_QALabOrganizers_term_02" and
"qalab3-ja-phase2-answersheet-term_QALabOrganizers_term_03".
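
The naming rule for essay and term runs can be sketched in a few lines of Python. The helper name `build_run_id` is hypothetical, and the sketch returns the ID without the `.xml` extension, matching the way the IDs are written in ex1 to ex3; the run type strings are taken from those examples.

```python
from pathlib import PurePosixPath


def build_run_id(topic_file, team_id, run_type, priority):
    """Compose [topic file name without .xml]_[Team ID]_[Run Type]_[Priority]."""
    if not 1 <= priority <= 99:
        raise ValueError("priority must fit in two digits (01 is the highest)")
    stem = PurePosixPath(topic_file).stem  # drops the trailing .xml
    return f"{stem}_{team_id}_{run_type}_{priority:02d}"


# Reproduces the first run ID of ex1.
run_id = build_run_id(
    "qalab3-ja-phase1-answersheet-essay.xml", "QALabOrganizers", "e2e", 1
)
```

Append `.xml` to the returned ID to obtain the name of the result file placed in the run directory.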

RUN_ID for Multiple-Choice questions:

[Topics' (Questions') File Name without the Extension (.xml)]_[Team ID]_[Language]_[Priority].xml

Two character language codes are as follows:

ex4)

Topics' (Questions') File Name:
Center-1997--Main-SekaishiB.xml
Center-1999--Main-SekaishiB.xml
Center-2001--Main-SekaishiB.xml
TEAM:
QALabOrganizers
LANGUAGE:
Japanese
PHASE:
Phase-2
TASK:
multiple-choice question's end-to-end task
# of SYSTEMS:
3
The run IDs of the three systems become:

XML Formats for Essay questions

End-to-end Results XML Format

End-to-End QA participants will submit their output (answers) in this format.

XML Schema

"second_stage_exam.xsd" is available from the Download website for each participating team, although the XML file refers to it by URL.

Sample XML Format
<?xml version="1.0" encoding="UTF-8"?>
<answer_sheet ver="0.1">
  <answer_section id="D792W10-1" label="D792W10_[1]">
    <question_id>q01</question_id>
    <answer_style>description_unlimited</answer_style>
    <answer_type>sentence</answer_type>
    <knowledge_type>RT</knowledge_type>
    <instruction><p>Write an essay explaining, in 225 English words or less, how developments in the means of transportation and communication prompted the colonization of Asia and Africa and heightened local nationalism. Use all nine keywords shown below at least once.</p></instruction>
    <reference_set>
      <reference format="data" id="d01" is_directly_referred="0">We are currently living in an era of information revolution, and the pace of globalization is accelerating. Not only people and goods are flowing across oceans and borders with increasing frequency; information is being transmitted across the world almost in real time. Underlying these developments is the rapid progress that has been made in transportation and communication technologies. Looking back over human history, we can find numerous instances where new developments in the means of transportation and communication have played important roles. In particular, from the mid-19th century through the early 20th century, such technological advances as wired and wireless telegraphy, the telephone, the camera, and cinematography came into practical use, resulting in the revolution that was audio-visual media. Furthermore, these technological innovations are noteworthy for the parts they played in Western nations' invasions of Asia and Africa. For example, the Reuters news agency gathered information from around the world to help develop the international presence of the British Empire. But, on the other hand, global information sharing and accelerated migration facilitated by the development of transportation were also stimulating factors in the growth of local nationalism.</reference>
      <reference format="data" id="d02" is_directly_referred="1">Suez Canal, steamship, Baghdad Railway, Morse code, Marconi, the Boxers, Russo-Japanese War, Persian Constitutional Revolution, Gandhi</reference>
    </reference_set>
    <keyword_set>
      <keyword>Suez Canal</keyword>
      <keyword>steamship</keyword>
      <keyword>Baghdad Railway</keyword>
      <keyword>Morse code</keyword>
      <keyword>Marconi</keyword>
      <keyword>the Boxers</keyword>
      <keyword>Russo-Japanese War</keyword>
      <keyword>Persian Constitutional Revolution</keyword>
      <keyword>Gandhi</keyword>
    </keyword_set>
    <answer_set type="singleton" number="1" >
      <answer match_type="broad" order="-1" choices="" format_string="" length_limit="no more than 255 English words">
        <expression_set>
          <expression>(PLEASE WRITE YOUR ANSWER)</expression>
        </expression_set>
      </answer>
    </answer_set>
  </answer_section>
    :
    :
  <answer_section id="M792W10-6" label="M792W10_[2]_Question (3)_(b)">
    <question_id>q09</question_id>
    <answer_style>description_limited</answer_style>
    <answer_type>sentence</answer_type>
    <knowledge_type>KS</knowledge_type>
    <grand_question_set>
      <grand_question id="q02">Historically, nations called 'empires' mostly governed large areas that included multiple ethnic groups, races, and religions. In relation to the rise and fall of such empires, their differences and similarities, and their domestic and international relationships, answer the three questions below. Write your responses in answer section (B). Begin a new line for every question you answer and state the corresponding number (1) to (3) at the beginning.</grand_question>
      <grand_question id="q07">Answer the questions below in relation to the underlined sections (a) and (b), and state the corresponding letter (a) or (b) at the beginning.</grand_question>
    </grand_question_set>
    <instruction><p>Describe the characteristics of U.S. policy toward China after the Spanish-American War, in no more than 45 English words.</p></instruction>
    <reference_set>
    </reference_set>
    <answer_set type="singleton" number="1" >
      <answer match_type="broad" order="-1" choices="" format_string="" length_limit="no more than 45 English words">
        <expression_set>
          <expression>(PLEASE WRITE YOUR ANSWER)</expression>
        </expression_set>
      </answer>
    </answer_set>
  </answer_section>
</answer_sheet>
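
One way a system might write its answers into such a sheet is sketched below in Python. The function name `fill_answers`, the shortened sample sheet, and the choice to key answers by the `id` of each `answer_section` are assumptions made for this illustration, not part of the task definition.

```python
import xml.etree.ElementTree as ET

# Shortened sheet with a single section; real sheets carry more child nodes.
SHEET = """<answer_sheet ver="0.1">
  <answer_section id="D792W10-1" label="D792W10_[1]">
    <question_id>q01</question_id>
    <answer_set type="singleton" number="1">
      <answer match_type="broad" order="-1" choices="" format_string="">
        <expression_set>
          <expression>(PLACEHOLDER)</expression>
        </expression_set>
      </answer>
    </answer_set>
  </answer_section>
</answer_sheet>"""


def fill_answers(sheet_xml, answers):
    """Write each answer into the expression node(s) of its answer_section.

    `answers` maps an answer_section id to the answer text; sections that
    have no entry keep their placeholder text.
    """
    root = ET.fromstring(sheet_xml)
    for section in root.iter("answer_section"):
        text = answers.get(section.get("id"))
        if text is None:
            continue
        for expression in section.iter("expression"):
            expression.text = text
    return ET.tostring(root, encoding="unicode")


filled_sheet = fill_answers(SHEET, {"D792W10-1": "Steamships shortened routes to Asia."})
```

Serializing with ElementTree does not reproduce the XML declaration or the original whitespace exactly, so the output should still be checked against "second_stage_exam.xsd" before submission.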

Extraction XML Format

Sample XML format
<?xml version="1.0" encoding="UTF-8"?>
  <TOPIC ID="D792W10-1">
    <PASSAGE_SET>
      <PASSAGE RANK="1" SOURCE_ID="http://***" SOURCE_ID_TYPE="Web" SCORE="**" NORMALIZED_SCORE="1">
        ####
      </PASSAGE>
      <PASSAGE RANK="2" SOURCE_ID="WA-24" SOURCE_ID_TYPE="QALab3" SCORE="**" NORMALIZED_SCORE="0.5">
        #####
      </PASSAGE>
      <PASSAGE RANK="3" SOURCE_ID="(DOCUMENT_ID)" SOURCE_ID_TYPE="PARTICIPANT" SCORE="**" NORMALIZED_SCORE="0.1">
        #####
      </PASSAGE>
    </PASSAGE_SET>
  </TOPIC>

The ID attribute holds a question ID.

The DOCUMENT node may be up to ten times the length limit of the answer: ten times the number of characters for Japanese, or ten times the number of words for English.

PASSAGE
The SOURCE_ID attribute holds a DOCUMENT_ID or a URL.

The SOURCE_ID_TYPE attribute takes one of the following types:
  • WEB (retrieved by web search)
  • QALAB3 (documents delivered by QA Lab-3)
  • PARTICIPANT (documents prepared by each participating team; please describe them in the system description form)

The SCORE attribute may hold any value.

The NORMALIZED_SCORE attribute holds a value between 0 and 1 inclusive; higher is better.
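
The attributes above can be assembled programmatically. The Python sketch below builds a TOPIC element from a list of passages; the function name `build_extraction_xml`, the tuple layout, and the normalization (each score divided by the top score) are assumptions, since the format only requires NORMALIZED_SCORE to lie between 0 and 1.

```python
import xml.etree.ElementTree as ET


def build_extraction_xml(topic_id, passages):
    """Build a TOPIC element from (source_id, source_id_type, score, text) tuples.

    Passages are ranked by descending score. NORMALIZED_SCORE is the score
    divided by the top score, which assumes non-negative scores.
    """
    ranked = sorted(passages, key=lambda p: p[2], reverse=True)
    top = ranked[0][2] if ranked and ranked[0][2] > 0 else 1.0
    topic = ET.Element("TOPIC", ID=topic_id)
    passage_set = ET.SubElement(topic, "PASSAGE_SET")
    for rank, (source_id, source_type, score, text) in enumerate(ranked, start=1):
        passage = ET.SubElement(
            passage_set,
            "PASSAGE",
            RANK=str(rank),
            SOURCE_ID=source_id,
            SOURCE_ID_TYPE=source_type,
            SCORE=str(score),
            NORMALIZED_SCORE=str(round(score / top, 4)),
        )
        passage.text = text
    return topic


# Hypothetical passages; the URL and the texts are placeholders.
topic = build_extraction_xml(
    "D792W10-1",
    [
        ("WA-24", "QALab3", 4.0, "second passage text"),
        ("http://example.org/a", "Web", 8.0, "first passage text"),
    ],
)
extraction_xml = ET.tostring(topic, encoding="unicode")
```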

Summarization Results XML Format

Sample XML Format
<?xml version="1.0" encoding="UTF-8"?>
<answer_sheet ver="0.1" src="(PLEASE WRITE AN EXTRACTION FILE NAME)">
  <answer_section id="D792W10-1" label="D792W10_[1]">
    <question_id>q01</question_id>
    <answer_style>description_unlimited</answer_style>
    <answer_type>sentence</answer_type>
    <knowledge_type>RT</knowledge_type>
    <instruction><p>Write an essay explaining, in 225 English words or less, how developments in the means of transportation and communication prompted the colonization of Asia and Africa and heightened local nationalism. Use all nine keywords shown below at least once.</p></instruction>
    <reference_set>
      <reference format="data" id="d01" is_directly_referred="0">We are currently living in an era of information revolution, and the pace of globalization is accelerating. Not only people and goods are flowing across oceans and borders with increasing frequency; information is being transmitted across the world almost in real time. Underlying these developments is the rapid progress that has been made in transportation and communication technologies. Looking back over human history, we can find numerous instances where new developments in the means of transportation and communication have played important roles. In particular, from the mid-19th century through the early 20th century, such technological advances as wired and wireless telegraphy, the telephone, the camera, and cinematography came into practical use, resulting in the revolution that was audio-visual media. Furthermore, these technological innovations are noteworthy for the parts they played in Western nations' invasions of Asia and Africa. For example, the Reuters news agency gathered information from around the world to help develop the international presence of the British Empire. But, on the other hand, global information sharing and accelerated migration facilitated by the development of transportation were also stimulating factors in the growth of local nationalism.</reference>
      <reference format="data" id="d02" is_directly_referred="1">Suez Canal, steamship, Baghdad Railway, Morse code, Marconi, the Boxers, Russo-Japanese War, Persian Constitutional Revolution, Gandhi</reference>
    </reference_set>
    <keyword_set>
      <keyword>Suez Canal</keyword>
      <keyword>steamship</keyword>
      <keyword>Baghdad Railway</keyword>
      <keyword>Morse code</keyword>
      <keyword>Marconi</keyword>
      <keyword>the Boxers</keyword>
      <keyword>Russo-Japanese War</keyword>
      <keyword>Persian Constitutional Revolution</keyword>
      <keyword>Gandhi</keyword>
    </keyword_set>
    <answer_set type="singleton" number="1" >
      <answer match_type="broad" order="-1" choices="" format_string="" length_limit="no more than 255 English words">
        <expression_set>
          <expression source_id="(passage_id1),(passage_id2),(passage_id3)">(PLEASE WRITE YOUR ANSWER)</expression>
        </expression_set>
      </answer>
    </answer_set>
  </answer_section>
    :
    :
  <answer_section id="M792W10-6" label="M792W10_[2]_Question (3)_(b)">
    <question_id>q09</question_id>
    <answer_style>description_limited</answer_style>
    <answer_type>sentence</answer_type>
    <knowledge_type>KS</knowledge_type>
    <grand_question_set>
      <grand_question id="q02">Historically, nations called 'empires' mostly governed large areas that included multiple ethnic groups, races, and religions. In relation to the rise and fall of such empires, their differences and similarities, and their domestic and international relationships, answer the three questions below. Write your responses in answer section (B). Begin a new line for every question you answer and state the corresponding number (1) to (3) at the beginning.</grand_question>
      <grand_question id="q07">Answer the questions below in relation to the underlined sections (a) and (b), and state the corresponding letter (a) or (b) at the beginning.</grand_question>
    </grand_question_set>
    <instruction><p>Describe the characteristics of U.S. policy toward China after the Spanish-American War, in no more than 45 English words.</p></instruction>
    <reference_set>
    </reference_set>
    <answer_set type="singleton" number="1" >
      <answer match_type="broad" order="-1" choices="" format_string="" length_limit="no more than 45 English words">
        <expression_set>
          <expression source_id="(passage_id1),(passage_id2),(passage_id3)">(PLEASE WRITE YOUR ANSWER)</expression>
        </expression_set>
      </answer>
    </answer_set>
  </answer_section>
</answer_sheet>

The Summarization Results XML Format is almost the same as the End-to-end Results XML Format.

The src attribute holds the file name of the extraction result that you received.

The source_id attribute holds the SOURCE_ID values of the passages used to make the answer.

Evaluation method Results XML Format

Sample XML Format
<?xml version="1.0" encoding="UTF-8"?>
<TOPIC ID="D792W10-1">
   <ANSWER_SET>
      <ANSWER FILE_NAME="(PLEASE WRITE AN E2E OR SUMMARIZATION RESULT FILE NAME)" RANK="1" SCORE="26">(ANSWER)</ANSWER>
      <ANSWER FILE_NAME="(PLEASE WRITE AN E2E OR SUMMARIZATION RESULT FILE NAME)" RANK="2" SCORE="24">(ANSWER)</ANSWER>
   </ANSWER_SET>
</TOPIC>

The ID attribute holds a question ID.

The RANK attribute holds an integer of 1 or more. Two nodes that share the same parent node must not have the same rank.

The SCORE attribute may hold any value.

The FILE_NAME attribute holds the file name of an e2e result or a summarization result.
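
Before submitting, the RANK constraints can be checked mechanically. The following Python sketch is one possible check; the function name `check_ranks` and the wording of the messages are hypothetical, and the file names in the sample are placeholders.

```python
import xml.etree.ElementTree as ET


def check_ranks(topic_xml):
    """Return a list of problems found in the RANK attributes of a TOPIC."""
    root = ET.fromstring(topic_xml)
    problems = []
    for answer_set in root.iter("ANSWER_SET"):
        seen = set()  # ranks already used under this parent node
        for answer in answer_set.findall("ANSWER"):
            raw = answer.get("RANK", "")
            if not raw.isdecimal() or int(raw) < 1:
                problems.append(f"invalid RANK: {raw!r}")
            elif int(raw) in seen:
                problems.append(f"duplicate RANK: {raw}")
            else:
                seen.add(int(raw))
    return problems


GOOD = """<TOPIC ID="D792W10-1"><ANSWER_SET>
<ANSWER FILE_NAME="run_a.xml" RANK="1" SCORE="26">(ANSWER)</ANSWER>
<ANSWER FILE_NAME="run_b.xml" RANK="2" SCORE="24">(ANSWER)</ANSWER>
</ANSWER_SET></TOPIC>"""

BAD = GOOD.replace('RANK="2"', 'RANK="1"')

good_problems = check_ranks(GOOD)
bad_problems = check_ranks(BAD)
```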

Answer Sheet XML Format for Term questions

End-to-End QA participants will submit their output (answers) in this format. Also use this format for the combination run.

XML Schema
<?xml version="1.0" encoding="UTF-8"?>
<answer_sheet ver="0.1">
<answer_section id="A792W10-2" label="A792W10_[2]_(A)_Question (1)_(a)">
<question_id>q04</question_id>
<answer_style>description_unlimited</answer_style>
<answer_type>term_person</answer_type>
<knowledge_type>KS</knowledge_type>
<grand_question_set>
<grand_question id="q02">Unification and exclusion movements based on ethnicity, religion, and the like are seen throughout history, throughout the world. Answer questions (1) through (10) below regarding these movements. Write your answers in answer space (B). Use a separate line for each question, and write the number of the question ((1) through (10)) before each answer.</grand_question>
<grand_question id="q03">The table below is an excerpt of statistical data regarding minority ethnicities in the People's Republic of China as of 1990. Answer the following questions regarding the table.</grand_question>
</grand_question_set>
<instruction><p>Some of those of ethnicity (1) became independent of China as the result of an independence movement influenced by the Russian Revolution. <label_set><label id="a" focus="1">(a)</label></label_set> Write the name of the central organization of this movement, and <label_set><label id="b" focus="0">(b)</label></label_set> the name of one of its leaders.</p></instruction>
<reference_set>
<reference format="lText" id="l01" is_directly_referred="1">Mongolian</reference>
</reference_set>
<answer_set type="singleton" number="1" >
<answer match_type="exact" order="-1" choices="" format_string="" length_limit="-1">
<expression_set>
<expression>(PLEASE WRITE YOUR ANSWER)</expression>
</expression_set>
</answer>
</answer_set>
</answer_section>
<answer_section id="A792W10-3" label="A792W10_[2]_(A)_Question (1)_(b)">
<question_id>q04</question_id>
<answer_style>description_unlimited</answer_style>
<answer_type>term_person</answer_type>
<knowledge_type>KS</knowledge_type>
<grand_question_set>
<grand_question id="q02">Unification and exclusion movements based on ethnicity, religion, and the like are seen throughout history, throughout the world. Answer questions (1) through (10) below regarding these movements. Write your answers in answer space (B). Use a separate line for each question, and write the number of the question ((1) through (10)) before each answer.</grand_question>
<grand_question id="q03">The table below is an excerpt of statistical data regarding minority ethnicities in the People's Republic of China as of 1990. Answer the following questions regarding the table.</grand_question>
</grand_question_set>
<instruction><p>Some of those of ethnicity (1) became independent of China as the result of an independence movement influenced by the Russian Revolution. <label_set><label id="a" focus="0">(a)</label></label_set> Write the name of the central organization of this movement, and <label_set><label id="b" focus="1">(b)</label></label_set> the name of one of its leaders.</p></instruction>
<reference_set>
<reference format="lText" id="l01" is_directly_referred="1">Mongolian</reference>
</reference_set>
<answer_set type="singleton" number="1" >
<answer match_type="exact" order="-1" choices="" format_string="" length_limit="-1">
<expression_set>
<expression>(PLEASE WRITE YOUR ANSWER)</expression>
</expression_set>
</answer>
</answer_set>
</answer_section>
</answer_sheet>

Note that the answers must be written in the expression node of the answer sheet.

Answer Sheet XML Format for Multiple-Choice questions

Participants in the Multiple-Choice questions task will submit their output (answers) in this format. It is basically the same as the gold standard format.

DTD

"answerTable.dtd" is available from the Download website for each participating team, although the XML file refers to it by URL.

Sample XML Format
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE answerTable SYSTEM "http://21robot.org/answerTable.dtd">
<answerTable filename="Center-2009--Main-SekaishiB">
<data>
<answer>3</answer>
<anscolumn_ID>A1</anscolumn_ID>
</data>
<data>
<answer>1</answer>
<anscolumn_ID>A2</anscolumn_ID>
</data>
<data>
<answer>1</answer>
<anscolumn_ID>A3</anscolumn_ID>
</data>
<data>
<answer>4</answer>
<anscolumn_ID>A4</anscolumn_ID>
</data>
<data>
<answer>1</answer>
<anscolumn_ID>A5</anscolumn_ID>
</data>
<data>
<answer>3</answer>
<anscolumn_ID>A6</anscolumn_ID>
</data>
  :
  :
</answerTable>
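
A table like the one above could be generated from a system's predictions roughly as follows. This is a Python sketch under assumptions: the function name `build_answer_table` and the (anscolumn_ID, answer) pair layout are made up for illustration, and ElementTree does not emit the DOCTYPE line, so that line would have to be written to the file separately.

```python
import xml.etree.ElementTree as ET


def build_answer_table(filename, answers):
    """Build an answerTable element from (anscolumn_ID, answer) pairs."""
    table = ET.Element("answerTable", filename=filename)
    for column_id, choice in answers:
        data = ET.SubElement(table, "data")
        # Child order follows the sample: answer first, then anscolumn_ID.
        ET.SubElement(data, "answer").text = str(choice)
        ET.SubElement(data, "anscolumn_ID").text = column_id
    return table


table = build_answer_table("Center-2009--Main-SekaishiB", [("A1", 3), ("A2", 1)])
answer_table_xml = ET.tostring(table, encoding="unicode")
```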

Results