The NTCIR-8 community QA pilot task submission instructions (March 29, 2010)

### task definition

You have a set of questions posted to the Yahoo!Chiebukuro site, and for each question you have a set of posted answers. Each question has exactly one "best answer" selected subjectively by the questioner. This is the questioner's *favourite* answer among the posted answers.

Since a favourite answer may not necessarily be of the highest quality, we hired four assessors to judge the quality of every answer in absolute terms. Using a majority vote approach, we created for each question a set of answers with graded relevance (answer quality). We refer to this as the "good answers set". This set is independent of the questioner's "best answer."

Your task is to rank all posted answers by answer quality (as estimated by your system) for every question.

### evaluation methods

We plan to evaluate your runs (your system output files submitted to the organisers) using at least three methods:

(1) Evaluating your ranked list of answers, by comparing it with a "gold-standard" ranked list based on the *good answers set*, using graded relevance evaluation metrics (nDCG and Q-measure).

(2) Evaluating only the top ranked answer in your run, by comparing it with the *good answers set*, using n(D)CG at rank 1 and Hit at rank 1 (i.e. whether the answer is correct or not). Note: at rank 1, nDCG (normalised discounted cumulative gain) equals nCG (normalised cumulative gain).

(3) Evaluating only the top ranked answer in your run, by comparing it with the "best answer" selected by the questioner, using Hit at rank 1.

### run filename

Your run filename is built from a name part and a number part joined by "-".

- The name part should be of the form [A-Za-z]+[0-9]*. Do not use any non-alphanumeric characters.
- The number part should be an integer: [1-5]. Thus, each team is allowed to submit up to five runs.

The two parts joined by "-" will be referred to as your "runID."

### run file content

    Q_ID,A_ID,A_ID,...
    :

e.g.

    23425,92012,91994,91985,91957,91857,91929,91939,91849,91886,91906,91875,91995,91846
    44747,221230,221228,221209,222154
    :

- Field separator: a comma.
- Each line corresponds to one question. The first field is the questionID (Q_ID). The 2nd to N-th fields are the answerIDs (A_ID) *sorted by your system according to the likelihood of being a good answer*, where N-1 is the total number of posted answers for the question. That is, you are expected to rank ALL answers.
- The run file must contain all questionIDs. Hence, the number of lines in your run file should equal the number of questions, namely, 1500.

### submission method:

Send your run files as email attachments to ntcadm-yahoo at nii ac jp

### submission deadline:

April 6, 2010 (GMT).

We wish you good luck and look forward to meeting you at NTCIR-8 in June.

NTCIR-8 community QA pilot task organisers
Daisuke Ishikawa
Noriko Kando
Tetsuya Sakai
ntcadm-yahoo at nii ac jp
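
The run file content rules above (comma-separated fields, one line per question, every posted answer ranked, every questionID present) can be checked before submission. The following is a minimal sketch in Python, not an official tool from the organisers. The function name `validate_run` and the `posted_answers` mapping (questionID to the set of posted answerIDs) are hypothetical and would have to be built from the task data.

```python
def validate_run(lines, posted_answers):
    """Return a list of problems found in a run file.

    lines: iterable of run-file lines, each "Q_ID,A_ID,A_ID,...".
    posted_answers: hypothetical dict mapping Q_ID -> set of posted A_IDs
                    (an assumption; build it from the task data).
    """
    problems = []
    seen = set()
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split(",")  # field separator: a comma
        qid, ranked = fields[0], fields[1:]
        if qid not in posted_answers:
            problems.append(f"line {lineno}: unknown questionID {qid}")
            continue
        if qid in seen:
            problems.append(f"line {lineno}: duplicate questionID {qid}")
            continue
        seen.add(qid)
        if len(ranked) != len(set(ranked)):
            problems.append(f"line {lineno}: repeated answerID for question {qid}")
        # All posted answers must be ranked, and nothing else.
        if set(ranked) != posted_answers[qid]:
            problems.append(
                f"line {lineno}: ranked answers for question {qid} "
                "do not match the posted answers"
            )
    # Every questionID must appear in the run file.
    for qid in sorted(set(posted_answers) - seen):
        problems.append(f"missing questionID {qid}")
    return problems


# Usage with the second example line from the instructions.
posted = {"44747": {"221230", "221228", "221209", "222154"}}
print(validate_run(["44747,221230,221228,221209,222154"], posted))  # prints []
```

An empty list means that no problem was detected by these checks. The sketch does not check the run filename.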