Task Overview and Call for Task Participation

The 17th NTCIR (2022 - 2023)
Evaluation of Information Access Technologies
Conference: December 12-15, 2023, NII, Tokyo, Japan

Call for Participation to the NTCIR-17 Tasks:

Task Participation

Let's participate in a collaborative activity for enhancing Information Access technologies!

For the past 20 years, NTCIR has been building the infrastructure for evaluating Information Access technologies and contributing to their development. As a result, NTCIR has become a major forum where researchers intensively discuss evaluation methodologies for emerging information access technologies.
The 17th NTCIR, NTCIR-17, now calls for task participation from anyone interested in research on information access technologies and their evaluation, such as retrieval from large document collections, question answering, and natural language processing. We welcome students, young researchers, professors who supervise students, researchers working in industry, and anyone else who is interested in informatics.

The registration for NTCIR-17 Task Participation has just started.
Please visit: http://research.nii.ac.jp/ntcir/ntcir-17/howto.html

*"Online presentation" will be available at the NTCIR-17 Conference.

Evaluation Tasks

The seventeenth NTCIR (NTCIR-17) Program Committee has selected the following five Core Tasks and four Pilot Tasks.
For the task introduction slides from the NTCIR-17 Kick-Off Event, please visit: http://research.nii.ac.jp/ntcir/ntcir-17/kickoff.html
For details and the latest information, please see below and visit each task's homepage.

FinArg-1     Lifelog-5     MedNLP-SC     QA Lab-PoliInfo-4     SS-2    
FairWeb-1     Transfer     UFO     ULTRE-2         

CORE TASKS

Fine-grained Argument Understanding in Financial Analysis ("FinArg-1")

"Fine-grained Argument Mining in Financial Narratives"

Abstract:
Although argument mining has been discussed for several years, financial argument mining is still at an early stage. In FinNum-3, we proposed the concept of identifying the arguments in financial narratives. To perform a more fine-grained analysis, we propose an argument-based sentiment analysis task in FinArg-1. The idea is based on the observation that good news may not always lead to a bullish claim. In this task, we separate analyst reports and earnings conference calls into two parts, premise and claim, and further label the sentiment toward the argument. For the premise, the sentiment labels are positive/neutral/negative. For the claim, the sentiment labels are bullish/neutral/bearish. In this way, we can better understand the argumentation structure in professional reports and managers' presentations. The other task aims to identify attack and support argumentative relationships in social media discussion threads. Instead of analyzing a single social media post, we consider the whole discussion thread and attempt to link the posts with attack and support labels. With these labels, we can understand the argumentation structure among opinions. FinArg-1 could also lead our community to discuss more fine-grained information embedded in financial documents and discussions.
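As an illustration of the labeling scheme described in the abstract, the following is a minimal Python sketch of an annotated argument unit. The class name `ArgumentUnit`, its field names, and the example sentences are hypothetical and are not taken from the FinArg-1 data format; only the two label sets come from the abstract.

```python
from dataclasses import dataclass

# Label sets as stated in the abstract.
PREMISE_SENTIMENTS = {"positive", "neutral", "negative"}
CLAIM_SENTIMENTS = {"bullish", "neutral", "bearish"}


@dataclass(frozen=True)
class ArgumentUnit:
    """Hypothetical container for one labeled span (not the official format)."""
    text: str
    role: str        # "premise" or "claim"
    sentiment: str   # must belong to the label set of the given role

    def __post_init__(self):
        allowed = {"premise": PREMISE_SENTIMENTS, "claim": CLAIM_SENTIMENTS}
        if self.role not in allowed:
            raise ValueError(f"unknown role: {self.role!r}")
        if self.sentiment not in allowed[self.role]:
            raise ValueError(
                f"sentiment {self.sentiment!r} is not valid for a {self.role}"
            )


# Made-up example: a positive premise paired with a bearish claim,
# reflecting the idea that good news need not lead to a bullish claim.
premise = ArgumentUnit("Quarterly revenue exceeded guidance.", "premise", "positive")
claim = ArgumentUnit("The stock is already overvalued.", "claim", "bearish")
```

Keeping the two label sets separate makes it an error to attach, for example, "bullish" to a premise, which mirrors the premise/claim distinction in the task description.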

Website: https://sites.google.com/nlg.csie.ntu.edu.tw/finarg-1/
Kickoff slide[English]



Personal Lifelog Organisation & Retrieval ("Lifelog-5")

"Multimedia retrieval for lifelog dataset"

Abstract:
The main objectives of this Lifelog-5 task are to encourage sustained collaborative research into ad-hoc retrieval and motivate research into the new topic of question answering from lifelogs.

Website: http://lifelogsearch.org/ntcir-lifelog/


Medical Natural Language Processing for Social media and Clinical texts ("MedNLP-SC")

"Medical Natural Language Processing for Social media and Clinical texts"

Abstract:
Medical Natural Language Processing for Social media and Clinical texts (MedNLP-SC) aims to promote and support the development of practical medical NLP tools applicable in the hospital. To do so, we propose a series of MedNLP workshops aimed at various tasks and languages. MedNLP-SC combines the characteristics of the previous MedNLP tasks while broadening the range of languages handled. The Social Media Subtask is to identify a set of symptoms caused by a drug, known as adverse drug event (ADE) detection, from social media texts in Japanese, English, French, and German. The Radiology Report Subtasks are two tasks using radiology reports in Japanese: (a) Named Entity Recognition (NER) and (b) TNM staging. MedNLP-SC will be essential for developing the core technologies of practical medical applications.

Website: https://sociocom.naist.jp/mednlp-sc
Kickoff slide[English]



QA Lab for Political Information-4 ("QA Lab-PoliInfo-4")

"Resolving issues related to fake news detection and fact checking on political information"

Abstract:
QA Lab-PoliInfo-4 addresses four subtasks on political issues: Question Answering, Answer Verification, Stance Classification, and Minutes-to-Budget Linking. In the Question Answering subtask, we aim to generate a brief answer to a given question. In the Answer Verification subtask, we build classifiers that check whether answers are fake, and we also create fake answers designed to confuse those classifiers. In the Stance Classification subtask, we estimate a politician's position from their utterances. In the Minutes-to-Budget Linking subtask, we aim to connect a budget item with the related discussion.

Website: https://sites.google.com/view/poliinfo4/
Kickoff slide[English]



Session Search ("SS-2")

"Chinese Session Search"

Abstract:
We propose the NTCIR-17 Session Search (SS) task to support in-depth investigations of session search and task-oriented search. Similar tasks such as the TREC Session Tracks and Dynamic Domain (DD) Tracks were terminated years ago. To fill this gap, we proposed the Session Search (SS) task as a pilot task in NTCIR-16. In this second year of organizing SS, we keep settings that support (1) large-scale practical session datasets for model training and (2) both ad-hoc and session-level evaluation. We will update the test set with data collected in an upcoming field study. In addition to these settings, we will introduce a new subtask in which participants design better metrics for evaluating session-level search effectiveness. We believe this will facilitate progress in the IR community in this area.

Website: http://www.thuir.cn/session-search



PILOT TASKS

FairWeb-1 ("FairWeb-1")

"For each topic with one or two attribute sets, return a SERP that contains relevant information AND is group-fair."

Abstract:
FairWeb-1 is an English web search task that considers not only relevance from the viewpoint of search engine users but also group fairness from the viewpoint of the entities being sought. We consider four entity types: researchers (R), movies (M), Twitter accounts (T), and YouTube content (Y). For each entity type, we have one or two attribute sets (i.e., sets of groups defined for considering group fairness), each with a target distribution. Runs will be evaluated with a suite of evaluation measures called GFR (Group Fairness and Relevance), which combines a relevance-based measure (e.g., ERR) with a group fairness measure. The latter compares the SERP's achieved distribution over groups with the target distribution. FairWeb-1 considers both ordinal groups (e.g., researchers grouped by h-index) and nominal groups (e.g., gender), as well as intersectional group fairness. GFR features divergences appropriate for handling ordinal groups.
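To make the idea of comparing an achieved distribution with a target distribution concrete, here is a minimal Python sketch. It is not the official GFR computation: it assumes each retrieved entity belongs to exactly one nominal group, weights all retrieved items equally, and uses the Jensen-Shannon divergence as the comparison. The function names and group labels are hypothetical, and the ordinal-group divergences mentioned in the abstract are not covered.

```python
import math
from collections import Counter


def achieved_distribution(serp_groups, all_groups):
    """Fraction of retrieved items falling in each group (equal weight per item)."""
    if not serp_groups:
        raise ValueError("the SERP must contain at least one item")
    counts = Counter(serp_groups)
    total = len(serp_groups)
    return [counts[g] / total for g in all_groups]


def jensen_shannon_divergence(p, q):
    """JSD with base-2 logs: 0 for identical distributions, 1 for disjoint ones."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up example: two nominal groups with a uniform target distribution.
groups = ["group_a", "group_b"]
target = [0.5, 0.5]
serp = ["group_a", "group_b", "group_b", "group_b"]  # group of each ranked item

achieved = achieved_distribution(serp, groups)
unfairness = jensen_shannon_divergence(achieved, target)
```

A lower divergence means the SERP's group distribution is closer to the target. A measure in the spirit of GFR would then combine a fairness score of this kind with a relevance-based measure.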

Website: http://sakailab.com/fairweb1/
Kickoff slide[English]



Resource Transfer Based Dense Retrieval ("Transfer")

"Dense retrieval with resource transfer technologies"

Abstract:
The Transfer task aims to develop a suite of technologies for transferring resources that were generated for one purpose to another in the context of dense retrieval.

Website: https://github.com/ntcirtransfer/transfer1/discussions
Kickoff slide[English]



The Transfer task requires task participants to prepare an Agreement of Understanding for the use of the Test Collection. Please refer to the following page for details.
https://research.nii.ac.jp/ntcir/ntcir-17/agrmnt.html



Understanding of non-Financial Objects in Financial Reports ("UFO")

"Information extraction for tables and texts in annual securities reports"

Abstract:
The UFO task aims to develop techniques for extracting structured information from tabular data and documents, focusing especially on annual securities reports (ASRs). We provide a dataset based on ASRs as training and test data and, as a joint effort with the participants, investigate appropriate evaluation metrics and methodologies for information extraction from tabular data and documents.

Website: https://sites.google.com/view/ntcir17-ufo/
Kickoff slide[English]




Unbiased Learning to Rank Evaluation Task 2 ("ULTRE-2")

"Evaluating the effectiveness and robustness of unbiased learning to rank models"

Abstract:
Unbiased learning to rank (ULTR) aims to train an unbiased ranking model from biased user behavior logs. Because large-scale behavior logs from online systems are difficult to collect and share, the evaluation of ULTR models mainly relies on simulation experiments with synthetic click data. However, most existing simulation methods are rather simple, and the synthetic data may not match real-world scenarios. Although many ULTR models have achieved promising results on synthetic data, they still lack guarantees of effectiveness in real-world scenarios. In the ULTRE-2 task, we will evaluate the effectiveness of ULTR models with a new, large-scale user behavior log collected from the commercial Web search engine Baidu. In addition to the real click log, we also provide rich display information (e.g., displayed height and displayed abstract) and other user behavior information (e.g., dwell time and slip count), enabling the development of more advanced ULTR models.
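The kind of simple click simulation referred to above can be sketched as follows. This is a minimal Python illustration, not the ULTRE-2 setup: it assumes a position-based click model in which a result is clicked only if it is examined and relevant, with an examination probability of (1/rank)^eta, and it corrects the position bias with inverse propensity weighting. All function names, the parameter eta, and the relevance values are hypothetical.

```python
import random


def examination_propensity(rank, eta=1.0):
    """Assumed probability that a user examines the result at a 1-based rank."""
    return (1.0 / rank) ** eta


def simulate_clicks(relevance_by_rank, n_sessions, rng, eta=1.0):
    """Position-based model: click = examined AND judged relevant."""
    clicks = [0] * len(relevance_by_rank)
    for _ in range(n_sessions):
        for i, rel in enumerate(relevance_by_rank):
            examined = rng.random() < examination_propensity(i + 1, eta)
            if examined and rng.random() < rel:
                clicks[i] += 1
    return clicks


def naive_ctr(clicks, n_sessions):
    """Raw click-through rate per rank; biased toward top positions."""
    return [c / n_sessions for c in clicks]


def ipw_relevance(clicks, n_sessions, eta=1.0):
    """Inverse-propensity-weighted estimate of relevance per rank."""
    return [
        c / (n_sessions * examination_propensity(i + 1, eta))
        for i, c in enumerate(clicks)
    ]


rng = random.Random(0)                      # fixed seed for reproducibility
true_relevance = [0.8, 0.6, 0.6, 0.2, 0.6]  # made-up click probabilities if examined
n = 20000
clicks = simulate_clicks(true_relevance, n, rng)
biased = naive_ctr(clicks, n)
debiased = ipw_relevance(clicks, n)
```

In this sketch the documents at ranks 2 and 5 are equally relevant, yet the raw click-through rate at rank 5 is far lower because it is examined less often. Dividing by the examination propensity recovers an estimate close to the true relevance, but only when the assumed propensities match the user behavior, which is one reason evaluation on real logs matters.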

Website: http://ntcir17.ultre.online/





Last Modified: 2023-05-15