
[ntcir:198] I: Answer Validation Exercise - Call for Participation





__________________________________________________________________

         Question Answering at Cross-Language Evaluation Forum
__________________________________________________________________

                      ANSWER VALIDATION EXERCISE (AVE)
                                  Call for Participation
__________________________________________________________________
                              http://nlp.uned.es/QA/ave/


KEYWORDS

Answer Validation (AV), Question Answering (QA), Recognising Textual
Entailment (RTE)


INTRODUCTION

The objective of AVE is to promote the development and evaluation of
subsystems aimed at validating the correctness of the answers given by
QA systems. Such automatic Answer Validation would be useful for
improving QA systems' performance, helping humans assess QA system
output, improving QA systems' self-scoring, developing better
criteria for collaborative QA systems, etc.
Systems must emulate human assessment of QA responses and decide whether
an answer is correct or not according to a given snippet. Once a QA
system returns an answer plus a snippet, a hypothesis is built by
turning the question plus the answer into affirmative form. If the
given text (a snippet or a document) semantically entails this
hypothesis, then the answer is expected to be correct. The exercise of
deciding this entailment is what we call here automatic Answer Validation.
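As a toy illustration of the hypothesis-building step described above (the
function and its rewriting rule are hypothetical sketches, not part of the
exercise infrastructure), a question plus a candidate answer can be turned
into an affirmative statement like so:

```python
# Hypothetical sketch: turn a question plus a candidate answer into
# an affirmative hypothesis. Real AVE pairs are built semi-automatically
# by the organisers; this only illustrates the idea.

def build_hypothesis(question: str, answer: str) -> str:
    """Naively rewrite 'Who <predicate>?' + answer as '<answer> <predicate>'."""
    q = question.rstrip("?").strip()
    if q.lower().startswith("who "):
        # "Who wrote Don Quixote" + "Cervantes" -> "Cervantes wrote Don Quixote"
        return answer + " " + q[4:]
    # Crude fallback for other question types.
    return q + " is " + answer

print(build_hypothesis("Who wrote Don Quixote?", "Cervantes"))
# -> Cervantes wrote Don Quixote
```

A validation system then only has to decide whether the supporting snippet
entails this statement.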


EXERCISE (In short)

Participant systems will receive a set of text-hypothesis pairs built
semi-automatically from QA at CLEF 2006 responses. They
must return an entailment value (YES|NO) for each pair. Results will be
evaluated against the QA human assessments. A separate exercise is
possible for each of the following languages: English, Spanish,
Italian, Dutch, French, German, Portuguese and Bulgarian.
Participants can submit up to two runs.
The number of pairs in each exercise depends on the number of
participants in the corresponding Question Answering subtrack.
Guidelines will be available at http://nlp.uned.es/QA/ave/
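To make the expected input/output concrete, here is a minimal sketch of a
trivial validation baseline (entirely hypothetical, and not the official
evaluation or format): it answers YES when enough hypothesis words appear
in the supporting text, NO otherwise.

```python
# Hypothetical baseline sketch: decide YES/NO entailment for one
# text-hypothesis pair by simple lexical overlap. Real systems are
# expected to do far better than this.

def validate(text: str, hypothesis: str, threshold: float = 0.8) -> str:
    """Return 'YES' if most hypothesis words occur in the text."""
    text_words = set(text.lower().split())
    hyp_words = set(hypothesis.lower().split())
    overlap = len(hyp_words & text_words) / len(hyp_words)
    return "YES" if overlap >= threshold else "NO"

pair = {
    "text": "Miguel de Cervantes wrote Don Quixote in the early 17th century.",
    "hypothesis": "Cervantes wrote Don Quixote",
}
print(validate(pair["text"], pair["hypothesis"]))
# -> YES
```

A run would consist of one such YES|NO judgement per pair in the test set.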

IMPORTANT DATES

Registration at CLEF http://www.clef-campaign.org (Registration Page)
 Already Open

Registration for the AVE sub-exercises (languages):
 Already Open

Test set release
 June 26th, 2006

Submission of runs
 July 9th, 2006

Release of individual results
 July 17th, 2006

Submission of papers for Working Notes
 August 15th, 2006

CLEF - ECDL 2006 Workshop (Alicante, Spain)
 September 20th-22nd, 2006


DEVELOPMENT DATA

Spanish
A training corpus named SPARTE [1] has been developed from the Spanish
assessments produced during the 2003, 2004 and 2005 editions of
QA at CLEF. SPARTE contains 2804 text-hypothesis pairs
from 635 different questions. Each pair has a document label and a
TRUE/FALSE value indicating whether the document entails the hypothesis
or not.
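A SPARTE-style training pair might be represented as follows (the field
names and the document identifier here are illustrative assumptions; see
the official guidelines for the real format):

```python
# Hypothetical illustration of one SPARTE-style training pair.
# All field names and values below are made up for illustration.

pair = {
    "question_id": 635,
    "document": "DOC-0001",  # document label (made-up identifier)
    "text": "Miguel de Cervantes wrote Don Quixote.",
    "hypothesis": "Cervantes wrote Don Quixote",
    "entailment": "TRUE",    # TRUE/FALSE human assessment
}
print(pair["entailment"])
# -> TRUE
```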

English
A similar corpus (ENGARTE) has been developed for English.

Both corpora are available at http://nlp.uned.es/QA/ave/
Document collections are available to participants registered at CLEF.


[1] A. Peñas, A. Rodrigo, and F. Verdejo. SPARTE, a Test Suite for
Recognising Textual Entailment in Spanish. In A. Gelbukh, editor,
Computational Linguistics and Intelligent Text Processing. CICLing 2006,
Lecture Notes in Computer Science, LNCS 3878, pp. 275-286,
Springer-Verlag, 2006.


ORGANIZATION

NLP Group at UNED, Spain
QA at CLEF, Cross-Language Evaluation Forum


CONTACT
Anselmo Peñas (UNED)
anselmo@xxxxxxxxxxx