Task Overview and Call for Participation

The 16th NTCIR (2021 - 2022)
Testbeds and Community for Information Access Research
Conference: June 14 (Tue) - 17 (Fri), 2022, National Center of Sciences, Tokyo


NTCIR-16 Call for Task Participation:

How to Register

Why not join a collaborative effort to advance information access technologies?

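For the 16th NTCIR, NTCIR-16, we are now calling for teams to participate in tasks that conduct research using shared datasets.
Evaluating information access technologies depends on "test collections" built through the collaborative work of researchers. With the cooperation of many researchers, NTCIR has worked for more than 20 years to build this evaluation infrastructure and has contributed to the advancement of the technology. It continues its activities while exploring evaluation methods for the new technologies developed every day.
Students and early-career researchers in information access, faculty members, researchers in industry, anyone interested in informatics, and research groups interested in retrieval, question answering, and natural language processing with large-scale test collections are all welcome. We look forward to your participation.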

For registration, see: http://research.nii.ac.jp/ntcir/ntcir-16/howto-ja.html

*Online presentations are possible at NTCIR-16.

About NTCIR

Evaluation Tasks

The NTCIR-16 Program Committee has selected the following six core tasks and four pilot tasks.
Task introduction slides (kickoff event) are available at: http://research.nii.ac.jp/ntcir/ntcir-16/kickoff-ja.html
For details and the latest information on each task, see the task overviews below and each task's website.


Core Tasks

Data Search 2 ("Data Search 2")

"Retrieval and question answering over statistical data"

Abstract:
The Data Search 2 task focuses on the retrieval of a statistical data collection published by the Japanese government (e-Stat), and one published by the US government (Data.gov). In addition, we organize question answering and search interface subtasks in this round.

Website: https://ntcir.datasearch.jp/
Contact:


Dialogue Evaluation 2 ("DialEval-2")

"Estimating the quality of helpdesk dialogues and understanding the role of dialogue turns in problem solving"

Abstract:
This task is a sequel to the NTCIR-14 Short Text Conversation DQ (Dialogue Quality) and Nugget Detection (ND) subtasks and the NTCIR-15 DialEval-1 task. The DQ subtask requires the participating systems to estimate the distribution of dialogue quality scores for a given dialogue. The ND subtask requires them to estimate the distribution of gold labels over nugget types. A nugget is an utterance by a helpdesk or a customer that helps the customer transition from the initial problem-facing state to the problem-solved state.

Website: http://sakailab.com/dialeval2/
Contact:


Investor’s and Manager’s Fine-grained Claim Detection ("FinNum-3")

"Investor’s and Manager’s Fine-grained Claim Detection"

Abstract:
In FinNum-1 and FinNum-2, we focused on understanding the numerals in financial social media data, exploring the meaning of numerals (FinNum-1) and the numeral attachment issue (FinNum-2). In FinNum-3, we turn to formal documents (professional analysts' reports and earnings conference calls) and propose multilingual datasets (Chinese and English) for participants to explore a new numeral-related task, named fine-grained claim detection.

Website: https://sites.google.com/nlg.csie.ntu.edu.tw/finnum3/
Contact:


Lifelog Access and Retrieval ("Lifelog-4")

"Evaluation methodology for personal multimodal lifelog information retrieval"

Abstract:
This core task aims to advance the state-of-the-art research in lifelogging as an application of information retrieval. The Lifelog Semantic Access Task (LSAT) is a known-item search task that can be undertaken in an interactive or automatic manner.

Website: http://ntcir-lifelog.computing.dcu.ie/
Contact:


Question Answering Lab for Political Information ("QA Lab-PoliInfo-3")

"Question answering, question-answer alignment, fact verification, and budget argument mining"

Abstract:
QA Lab-PoliInfo-3 aims to solve four tasks (Question Answering, QA Alignment, Fact Verification, and Budget Argument Mining) on political issues. In the Question Answering subtask, we aim to generate a brief answer to a given question. In the QA Alignment subtask, we automatically align each question with its appropriate answer in the assembly minutes. In the Fact Verification subtask, we focus on (1) verifying whether a given speech summary that may be fake is true, and (2) finding utterances corresponding to the summary in the assembly minutes if it is true. In the Budget Argument Mining subtask, we aim to connect a budget item and the related discussion.

Website: https://poliinfo3.net
Contact:


We Want Web 4 with CENTRE ("WWW-4")

"Quantifying progress and reproducibility in web search"

Abstract:
This is an ad hoc English web search task that tries to monitor technological advances in web search and to study replicability/reproducibility issues in IR evaluation. CENTRE stands for CLEF/NTCIR/TREC Reproducibility; it was also a track/task at TREC 2018, CLEF 2018 and 2019, as well as NTCIR-14. CENTRE became part of the WWW task at NTCIR-15.

Website: http://sakailab.com/www4/
Contact:



Pilot Tasks

Reading Comprehension for Information Retrieval ("RCIR")

"Text reading signals in Information Retrieval"

Abstract:
The NTCIR-16 RCIR pilot task aims to motivate the development of a first generation of personalised retrieval techniques that integrate reading comprehension measures from biosignals as a source of evidence when ranking text content. Participating researchers will develop and benchmark approaches to integrate multi-modal signals (e.g. eye tracking, EOG, screenshots) into the retrieval process for two sub-tasks: a comprehension-evaluation task (CET) that aims to sort texts in terms of comprehension levels, and a comprehension-based retrieval task (CRT) that aims to rank texts (for a variety of topics) by integrating comprehension evidence into the IR process. Both sub-tasks are exploratory in nature, but designed to facilitate initial experimentation on the topic by the community. A new dataset will be generated by the organisers and will consist of textual data extracted from Wikipedia (the content texts) as well as a range of preprocessed biometric signal data. Runs will be ranked in terms of appropriate evaluation measures.

Website: http://ntcir-rcir.computing.dcu.ie/
Contact:


Real document-based Medical Natural Language Processing ("Real-MedNLP")

"Medical natural language processing using real medical documents"

Abstract:
Recently, more and more medical records are written in electronic format instead of on paper, which increases the importance of information processing techniques in medical fields. However, the amount of privacy-free medical text data is still small in non-English languages, such as Japanese and Chinese. To address this, we previously proposed a series of four medical natural language processing (MedNLP) tasks: MedNLP-1, MedNLP-2, MedNLPDoc, and MedWeb. However, those tasks did not use real data and relied only on dummy data. Specifically, dummy medical reports created by medical doctors were used for MedNLP-1 and MedNLP-2; textual medical records from a medical textbook were used for MedNLPDoc; and dummy Twitter data were created and used for MedWeb. In this pilot task, we redesign the scheme around two core resources for medical AI tasks: (1) a Case-Report dataset and (2) a Radiographic-Report dataset. More importantly, we prepare real data in Japanese and translate the original reports into English, enabling us to develop the first benchmark for multi-language medical NLP. This task will yield promising technologies for developing practical computational systems that support a wide range of medical services.

Website: https://sociocom.naist.jp/real-mednlp/
Contact:


Session Search ("SS")

"Web session search based on real data"

Abstract:
We propose this new task, the NTCIR-16 Session Search (SS) task, to support intensive investigations of session search or task-oriented search. Related tasks such as the TREC Session Tracks and Dynamic Domain (DD) Tracks were discontinued years ago. However, how to optimize and further evaluate whole-session system performance remains challenging. As the Session Tracks and DD Tracks have their limitations, we propose new settings that support (1) large-scale practical session datasets for model training, and (2) both ad-hoc and session-level evaluation. We believe that the new task will facilitate the development of the IR community in the related domain.

Website: http://www.thuir.cn/session-search/
Contact:


Unbiased Learning to Ranking Evaluation Task ("ULTRE")

"Evaluating unbiased learning-to-rank with user simulation"

Abstract:
Unbiased learning to rank (ULTR) with biased user behavior data has received considerable attention in the IR community. However, how to properly evaluate and compare different ULTR approaches has not been systematically investigated, and there is no shared task or benchmark that is specifically developed for ULTR. In this paper, we propose the Unbiased Learning to Ranking Evaluation Task (ULTRE) as a pilot task in NTCIR-16. In ULTRE, we plan to design a user-simulation-based evaluation protocol and implement an online benchmarking service for the training and evaluation of both offline and online ULTR models. We will also investigate questions of ULTR evaluation, particularly whether and how different user simulation models affect the evaluation results.

Website: TBA
Contact:



Last modified: 2021-06-03