NTCIR Workshop 7 MuST T2N Subtask Evaluation Results
The evaluation of the T2N subtask [MuSTT2Neval (.xls)]
Readme [README (.rtf) ]
- eval SHEET
The evaluation results of the T2N subtask conducted in MuST at NTCIR-7.
The following values are listed for each task/stat.
#answer The number of correct answers; the denominator of the recall
#extracted The number of system outputs
#effective The number of effective data, i.e. outputs that do not overlap
with others and do not violate the task specification; the denominator of the precision
#correct The number of data judged as "ok" (both the date and the value
are correctly extracted); the numerator of the precision and recall
#date-err The number of data judged as "date?" (the value is correct, but the date is not assigned correctly)
#val-err The number of data judged as "ng" (wrong value, such as that of a different statistic)
#others The number of data judged as "maybe" (we could not make a confident judgment)
F-measure calculated from the precision and recall using the common definition
A blank line means that the system gave no output.
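The relation between the counts above and the reported scores can be sketched as follows. This is a minimal illustration, not the organizers' scoring script: it assumes that "the common definition" of the F-measure is the balanced harmonic mean (F1) of precision and recall, the function name `t2n_scores` is hypothetical, and the counts in the usage lines are made up for illustration rather than taken from the sheet.

```python
from typing import Tuple


def t2n_scores(n_answer: int, n_effective: int, n_correct: int) -> Tuple[float, float, float]:
    """Return (precision, recall, f_measure) from the eval-sheet counts.

    precision = #correct / #effective
    recall    = #correct / #answer
    f_measure = harmonic mean of the two (assumed F1 definition)
    """
    # Guard against empty denominators (e.g. a system with no effective output).
    precision = n_correct / n_effective if n_effective else 0.0
    recall = n_correct / n_answer if n_answer else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative counts only (not real results): 50 answers, 40 effective, 30 correct.
precision, recall, f_measure = t2n_scores(n_answer=50, n_effective=40, n_correct=30)
print(f"P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

Note that #date-err, #val-err and #others do not enter the formula directly; under the descriptions above, only data judged "ok" count toward the numerator.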
- ans SHEET
Some information on the correct answers of the T2N subtask conducted
in MuST at NTCIR-7. Each line shows the number of pieces of
information that should be extracted, by task/stat.
#extracted Number of the correct answers that at least one system could extract
#relative Number of the correct answers that are expressed implicitly in the documents
#none Number of the correct answers that no system could extract