Task Overview and Call for Participation

The 17th NTCIR (2022 - 2023)
Testbeds and Community for Information Access Research
Conference: December 12 (Tue) to 15 (Fri), 2023, National Center of Sciences (学術総合センター), Tokyo


NTCIR-17 Call for Task Participation:

Guide to Registration

Why not join a collaborative effort to advance information access technologies?

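NTCIR-17, the 17th round of NTCIR, is now calling for teams to participate in tasks that conduct research using shared datasets.
Evaluating information access technologies requires "test collections", which are created through the collaborative work of researchers. With the cooperation of many researchers, NTCIR has worked for more than 20 years to build this evaluation infrastructure and has contributed to the advancement of the technology. It continues its activities while exploring evaluation methods for the new technologies that are developed every day.
Students and young researchers in the information access field, faculty members, researchers in industry, anyone interested in informatics, and research groups interested in retrieval, question answering, or natural language processing using large-scale test collections are all welcome. Please join us.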

For registration, see: http://research.nii.ac.jp/ntcir/ntcir-17/howto-ja.html

* Online presentations are possible at NTCIR-17.

About NTCIR

Evaluation Tasks

The NTCIR-17 Program Committee has selected the following five core tasks and four pilot tasks.
The task introduction slides (kickoff event) are available at: http://research.nii.ac.jp/ntcir/ntcir-17/kickoff-ja.html
For task details and the latest information, see the task overviews below and each task's website.

FinArg-1     Lifelog-5     MedNLP-SC     QA Lab-PoliInfo-4     SS-2    
FairWeb-1     Transfer     UFO     ULTRE-2    

Core Tasks

Fine-grained Argument Understanding in Financial Analysis ("FinArg-1")

"Fine-grained Argument Mining in Financial Narratives"

Abstract:
Although argument mining has been discussed for several years, financial argument mining is still at an early stage. In FinNum-3, we proposed the concept of identifying the arguments in financial narratives. To perform a more fine-grained analysis, we propose an argument-based sentiment analysis task in FinArg-1. The idea is based on the observation that good news does not always lead to a bullish claim. In this task, we separate analyst reports and earnings conference calls into two parts, premise and claim, and further label the sentiment toward the argument. For the premise, the sentiment labels are positive/neutral/negative. For the claim, the sentiment labels are bullish/neutral/bearish. In this way, we can better understand the argumentation structure in professional reports and managers' presentations. The other task aims to identify attack and support argumentative relationships in social media discussion threads. Instead of analyzing a single social media post, we consider the whole discussion thread and attempt to link the posts with attack and support labels. With these labels, we can understand the argumentation structure among opinions. FinArg-1 could also lead our community to discuss more fine-grained information embedded in financial documents and discussions.

Website: https://sites.google.com/nlg.csie.ntu.edu.tw/finarg-1/
Kickoff slide[English]

Contact:



Personal Lifelog Organisation & Retrieval ("Lifelog-5")

"Multimedia retrieval for lifelog dataset"

Abstract:
The main objectives of this Lifelog-5 task are to encourage sustained collaborative research into ad-hoc retrieval and motivate research into the new topic of question answering from lifelogs.

Website: http://lifelogsearch.org/ntcir-lifelog/
Contact:



Medical Natural Language Processing for Social media and Clinical texts ("MedNLP-SC")

"Medical language processing for social media and medical texts"

Abstract:
Medical Natural Language Processing for Social media and Clinical texts (MedNLP-SC) aims to promote and support the development of practical medical NLP tools applicable in hospitals. To do so, we propose a series of MedNLP workshops aimed at various tasks and languages. MedNLP-SC combines the nature of the previous MedNLP tasks while broadening the scope of languages handled. The Social Media Subtask is to identify a set of symptoms caused by a drug, known as adverse drug event (ADE) detection, from social media texts in Japanese, English, French, and German. The Radiology Report Subtasks are two tasks using radiology reports in Japanese: (a) Named Entity Recognition (NER) and (b) TNM staging. MedNLP-SC will be essential for developing core technologies for practical medical applications.

Website: https://sociocom.naist.jp/mednlp-sc
Kickoff slide[English]

Contact:



QA Lab for Political Information-4 ("QA Lab-PoliInfo-4")

"Fake news detection and fact-checking for political information"

Abstract:
QA Lab-PoliInfo-4 aims to solve four tasks on political issues: Question Answering, Answer Verification, Stance Classification, and Minutes-to-Budget Linking. In the Question Answering subtask, we aim to generate a brief answer to a given question. In the Answer Verification subtask, we build classifiers that check whether answers are fake, and we also create fake answers that confuse the classifiers. In the Stance Classification subtask, we estimate a politician's position from their utterances. In the Minutes-to-Budget Linking subtask, we aim to connect a budget item and the related discussion.

Website: https://sites.google.com/view/poliinfo4/
Kickoff slide[English]

Contact:



Session Search ("SS-2")

"Chinese Session Search"

Abstract:
We propose the NTCIR-17 Session Search (SS) task to support in-depth investigations of session search or task-oriented search. Similar tasks such as the TREC Session Tracks and Dynamic Domain (DD) Tracks have been discontinued for years. To fill this gap, we proposed the Session Search (SS) task as a pilot task in NTCIR-16. In this second year of organizing SS, we retain settings that support (1) large-scale practical session datasets for model training and (2) both ad-hoc and session-level evaluation. We will update the test set by collecting data via an upcoming field study. In addition, we will introduce a new subtask in which participants design better session-level search effectiveness evaluation metrics. We believe that this will facilitate the development of the IR community in the related domain.

Website: http://www.thuir.cn/session-search
Contact:



Pilot Tasks

FairWeb-1 ("FairWeb-1")

"For each topic, which comes with one or two attribute sets, return search results that contain relevant information and are group-fair."

Abstract:
FairWeb-1 is an English web search task that considers not only relevance from the viewpoint of search engine users but also group fairness from the viewpoint of the entities being sought. We consider four entity types: researchers (R), movies (M), Twitter accounts (T), and YouTube content (Y). For each entity type, we have one or two attribute sets (i.e., sets of groups defined for considering group fairness), each with a target distribution. Runs will be evaluated with a suite of evaluation measures called GFR (Group Fairness and Relevance), which combines a relevance-based measure (e.g., ERR) with a group fairness measure. The latter compares the SERP's achieved distribution over groups with the target distribution. FairWeb-1 considers both ordinal groups (e.g., researchers grouped by h-index) and nominal groups (e.g., gender), as well as intersectional group fairness. GFR features divergences appropriate for handling ordinal groups.
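
As a rough illustration of how a relevance measure and a group fairness measure might be combined, the sketch below scores a ranked list by comparing its rank-discounted distribution over groups with a target distribution. It is not the official GFR definition: the 1/log2(rank + 1) discount, the use of Jensen-Shannon divergence (which suits nominal groups only and ignores any ordering of groups), the plain average in `combined_score`, and all function names are assumptions made for this example.

```python
import math


def achieved_distribution(ranked_groups, groups):
    """Rank-discounted share of each group in a non-empty ranked list.

    Assumption: the item at rank r contributes weight 1 / log2(r + 1).
    """
    mass = {g: 0.0 for g in groups}
    for rank, group in enumerate(ranked_groups, start=1):
        mass[group] += 1.0 / math.log2(rank + 1)
    total = sum(mass.values())
    return {g: m / total for g, m in mass.items()}


def jsd(p, q):
    """Jensen-Shannon divergence (log base 2), which lies in [0, 1]."""
    def kl(a, b):
        return sum(a[g] * math.log2(a[g] / b[g]) for g in a if a[g] > 0)

    m = {g: 0.5 * (p[g] + q[g]) for g in p}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def group_fairness(ranked_groups, target):
    """1 minus the divergence between achieved and target distributions."""
    achieved = achieved_distribution(ranked_groups, list(target))
    return 1.0 - jsd(achieved, target)


def combined_score(relevance, fairness):
    """Assumption: a plain average of a relevance score in [0, 1] and fairness."""
    return 0.5 * (relevance + fairness)


# Hypothetical example: two nominal groups with a uniform target distribution.
target = {"A": 0.5, "B": 0.5}
mixed = group_fairness(["A", "B", "A", "B"], target)
one_sided = group_fairness(["A", "A", "A", "A"], target)
```

Under these assumptions, a list that alternates between the two groups scores higher on group fairness than a list drawn entirely from one group, and `combined_score` then trades that value off against a relevance score supplied by the caller.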

Website: http://sakailab.com/fairweb1/
Kickoff slide[English]

Contact:



Resource Transfer Based Dense Retrieval ("Transfer")

"Dense retrieval using resource transfer techniques"

Abstract:
The Transfer task aims to develop a suite of technologies for transferring resources that were generated for one purpose to another in the context of dense retrieval.

Website: https://github.com/ntcirtransfer/transfer1/discussions
Kickoff slide[English]

Contact:


For the Transfer task, participants need to prepare a memorandum on permission to use the test collection for task participants. For details, see the following page:
https://research.nii.ac.jp/ntcir/ntcir-17/agrmnt-ja.html



Understanding of non-Financial Objects in Financial Reports ("UFO")

"Information extraction from tables and text in annual securities reports"

Abstract:
The UFO task aims to develop techniques for extracting structured information from tabular data and documents, focusing especially on annual securities reports (ASRs). We provide a dataset based on ASRs as training and test data and, as a joint effort with the participants, investigate appropriate evaluation metrics and methodologies for information extraction from tabular data and documents.

Website: https://sites.google.com/view/ntcir17-ufo/
Kickoff slide[English]

Contact:





Unbiased Learning to Rank Evaluation Task 2 ("ULTRE-2")

"Evaluating the effectiveness and robustness of unbiased learning to rank models"

Abstract:
Unbiased learning to rank (ULTR) aims to train an unbiased ranking model with biased user behavior logs. Due to the difficulties in collecting and sharing large-scale behavior logs from online systems, the evaluation of ULTR models mainly relies on simulation experiments with synthetic click data. However, most existing simulation methods are rather simple, and the synthetic data may not match real-world scenarios. Although many ULTR models have achieved promising results on synthetic data, they still lack guarantees of effectiveness in real-world scenarios. In the ULTRE-2 task, we will evaluate the effectiveness of ULTR models with a new, large-scale user behavior log collected from the commercial Web search engine Baidu. In addition to the real click log, we also provide rich display information (e.g., displayed height and displayed abstract) and other user behavior information (e.g., dwell time and slip count), enabling the development of more advanced ULTR models.
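
To make the notion of synthetic click data concrete, the sketch below simulates clicks under a position-based click model, one simple simulation scheme, and then applies inverse propensity weighting, a commonly used ULTR correction. Neither is prescribed by the ULTRE-2 task; the examination probabilities, relevance values, and function names are hypothetical and chosen only for illustration.

```python
import random


def simulate_clicks(relevances, exam_probs, n_sessions, rng):
    """Synthetic clicks under a position-based model (an assumption here).

    In each session, the document at rank k is clicked with probability
    exam_probs[k] * relevances[k].
    """
    clicks = [0] * len(relevances)
    for _ in range(n_sessions):
        for k, (rel, exam) in enumerate(zip(relevances, exam_probs)):
            if rng.random() < exam * rel:
                clicks[k] += 1
    return clicks


def naive_ctr(clicks, n_sessions):
    """Raw click-through rate per rank, which is biased by position."""
    return [c / n_sessions for c in clicks]


def ipw_relevance(clicks, exam_probs, n_sessions):
    """Inverse propensity weighting: divide each CTR by its examination probability.

    Assumes the examination probabilities are known, which is rarely true in practice.
    """
    return [c / (n_sessions * e) for c, e in zip(clicks, exam_probs)]


# Hypothetical setup: the most relevant documents sit at lower, less-examined ranks.
rng = random.Random(0)
true_relevance = [0.2, 0.8, 0.5]
exam_probs = [1.0, 0.5, 0.25]
n_sessions = 20000

clicks = simulate_clicks(true_relevance, exam_probs, n_sessions, rng)
biased = naive_ctr(clicks, n_sessions)
debiased = ipw_relevance(clicks, exam_probs, n_sessions)
```

In this setup the raw click-through rate understates the relevance of documents shown at lower ranks, while the reweighted estimate approaches the relevance values used to generate the clicks. Real logs involve further effects that this simple model ignores.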

Website: http://ntcir17.ultre.online/

Contact:





Last modified: 2023-05-10