The Twelfth NTCIR (NTCIR-12) Task Selection Committee has selected the
following six Core Tasks and three Pilot Tasks.
For details and the latest information, please visit each task's homepage.
To contact the organizers of each task, please also see the CONTACT US page.
Core Tasks: IMine, MedNLPDoc, MobileClick, SpokenQuery&Doc, Temporalia, MathIR. Pilot Tasks: Lifelog, QA Lab, STC.
CORE TASKS
The NTCIR-12 IMine-2 Task aims to explore and evaluate technologies for understanding the user intents behind a query and for satisfying those different intents. The scope of IMine-2 is closely related to search result diversification and federated search, both of which are actively studied in the IR community and by commercial search engines. For more information, please visit: http://www.dl.kuis.kyoto-u.ac.jp/imine2/
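Search result diversification, one of the areas IMine-2 touches on, is often illustrated with Maximal Marginal Relevance (MMR), which greedily trades off relevance against redundancy. The sketch below is only an illustration of that general idea; the IMine-2 task does not prescribe this method, and the function names and scores are hypothetical.

```python
# Illustrative sketch of MMR-style diversification (not an official
# IMine-2 baseline). All names and scores here are hypothetical.

def mmr_rank(candidates, relevance, similarity, k=10, lam=0.7):
    """Greedily select k documents, balancing relevance and novelty.

    candidates: list of document ids
    relevance:  dict mapping doc id -> relevance score for the query
    similarity: function (doc_a, doc_b) -> similarity in [0, 1]
    lam:        weight on relevance (1 - lam weights diversity)
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(d):
            # Penalize documents similar to anything already selected.
            redundancy = max((similarity(d, s) for s in selected), default=0.0)
            return lam * relevance[d] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With lam close to 1 the ranking is purely by relevance; lowering it promotes documents that cover different intents.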
Medical records are increasingly written on electronic media
instead of on paper, which increases the importance of information
processing in medical fields. In this core task,
participants are asked to assign a suitable diagnosis and the
corresponding disease code to a clinical case written in Japanese. Since this
task setting can be formalized as labeling a medical
document with disease names using various natural language processing
technologies, we call this task MedNLPDoc. The achievements of this task
can be applied almost directly to practical applications both in daily
clinical service and in clinical research.
Mobile devices have become a popular means of Web search. However,
access to search results is heavily constrained by the small screens of
mobile devices and by the typical situations in which mobile search takes
place. The traditional "ten blue links" paradigm is therefore
considered less appropriate for mobile users than for desktop users. In the
MobileClick task, participants are required to generate, in response to a
given query, a two-layered summary that fits the screen of a mobile device
instead of ten blue links. The task aims to satisfy mobile searchers directly
and immediately, without requiring them to scan a list of search results.
In this round, we have two subtasks: iUnit ranking and iUnit summarization.
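One simple way to think about ranking information units (iUnits) for a small screen is importance density: put the most informative text per character first. The sketch below is a hypothetical baseline for illustration only; the task does not mandate this scoring, and the data format shown is assumed.

```python
# Hypothetical iUnit-ranking baseline: order units by importance per
# character so the most informative text fits a small screen first.
# The (text, importance) input format is an assumption for illustration.

def rank_iunits(iunits):
    """iunits: list of (text, importance) pairs.

    Returns the texts ranked by importance density, i.e. importance
    divided by text length, highest first.
    """
    return [text for text, importance in
            sorted(iunits, key=lambda u: u[1] / len(u[0]), reverse=True)]
```

A short unit with the same importance as a long one ranks higher, reflecting the screen-space constraint the task description emphasizes.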
The NTCIR-12 SpokenQuery&Doc-2 task evaluates spoken document retrieval
from spontaneously spoken queries. Current information retrieval frameworks
face a bottleneck in the human interface for drawing out a user's information
need. The SpokenQuery&Doc task tries to overcome this by making use of
spontaneously spoken queries. One advantage of speech as an input method
for retrieval systems is that it enables users to easily submit long queries,
giving systems rich clues for retrieval: unconstrained speech
is common in daily life and is the most natural and easy way to
express one's thoughts. The target document collection also consists of spoken documents.
For more information, please visit: http://www.nlp.cs.tut.ac.jp/ntcir12/
The objective of this task is to foster research in temporal information access. Given that time plays a crucial role in estimating information relevance and validity, we believe that successful search engines must consider the temporal aspects of information in greater detail. Building on our achievements at NTCIR-11, we have set the technical challenges of Temporalia-2 as the following subtasks: the Temporal Intent Disambiguation (TID) Subtask and the Temporally Diversified Retrieval (TDR) Subtask.
Website: http://ntcirtemporalia.github.io/
Facebook: https://www.facebook.com/ntcirtemporalia
Twitter: https://twitter.com/ntcirtemporalia
The NTCIR-12 MathIR Task aims to develop a test collection for evaluating retrieval with queries composed of keywords and formulae, in order to facilitate and encourage research in mathematical information retrieval (MIR) and related fields.
http://ntcir-math.nii.ac.jp/
PILOT TASKS
This pilot Lifelog task aims to begin the comparative evaluation of information access and retrieval systems operating over personal lifelog data. The task consists of two subtasks, which can be entered together or independently. The two subtasks are:
Lifelog Semantic Access Task (LSAT), a known-item search task that can be undertaken in an interactive or automatic manner. A number of real-world information needs will form the topics for both automatic and interactive runs. In interactive runs, participating groups will be given a time limit per topic. Results are submitted by the due date and evaluated by the organisers.
Lifelog Insight Task (LIT), an exploratory task concerned with knowledge mining from lifelogs. Participants are requested to develop tools and interfaces that discover insights about the lifelog data, support the lifelogger in reflecting upon the data, facilitate filtering, and provide efficient and effective means of visualising the data. The outputs of this task will be presented in a demo or workshop format; no submission of results for evaluation is planned.
A multimodal dataset (training and test) will be gathered and distributed to the participants. It will consist of anonymised lifelog data from a number of individuals over an extended period of time. Accompanying this lifelog data (images from wearable cameras) will be visual concepts extracted from each image, as well as semantically rich metadata such as semantic locations and semantic user activities.
Facebook: https://www.facebook.com/NTCIRLifelog
Twitter: https://twitter.com/NTCIRLifelog
Website: http://ntcir-lifelog.computing.dcu.ie/
The goal is to investigate real-world complex Question Answering (QA) technologies using Japanese university entrance exams, and their English translations, on the subject of "World History" (世界史). The questions were selected from two different stages: the National Center Test for University Admissions (センター試験; multiple-choice questions) and the secondary exams of multiple universities (二次試験; complex questions, including essays). All questions are provided in an XML format.
Some of the highlights are:
1. Solving real-world problems.
2. Many questions require an understanding of the surrounding context.
3. Some questions require inference.
4. Encourages investigation of different question types, including complex essays (long essays covering multiple points), simple essays (short essays), factoid, slot-filling, true-false, etc.
5. A good venue for investigating specific answer types (e.g. person-politician, person-religious), advanced entity-focused passage retrieval, enhanced knowledge resources, semantic representation, and sophisticated learning.
As knowledge resources, four sets of high-school textbooks and Wikipedia will be provided. Participants may use any other resources, but these must be reported. Two open-source baseline QA systems and one passage retrieval system are also provided. Tests will be done in two phases: in the first phase, the question types are explicitly provided, and participants are allowed to work on specific question type(s) only. The evaluation results are analyzed by question type.
・Open Advancement: We encourage each participant to pursue their own goal(s): an end-to-end system, particular question types, and/or component(s) of either the provided QA platform or their own system, or building resources/tools that can improve QA systems for entrance exams.
・Evaluating continuous progress and enhancing the knowledge resources: The organizers periodically run all the components contributed by participants to track progress.
・Forum: We place emphasis on building a community by bridging different communities.
Task Website: http://research.nii.ac.jp/qalab/
Task Overview
Natural language conversation between humans and computers is one of the most challenging AI problems; it involves language understanding, reasoning, and the use of common-sense knowledge. Despite significant research effort over the past decades, progress on the problem has unfortunately been quite limited. One of the major reasons is the lack of a large volume of real conversation data.
In this task, we consider a much simplified version of the problem: one round of conversation formed by two short texts, the former being an initial post from a user and the latter a comment given by the computer. We refer to this as short text conversation (STC). Thanks to the extremely large amount of short text conversation data available on social media such as Twitter and Weibo, we anticipate that significant progress can be made on this problem with the use of big data, much as has happened in machine translation, community question answering, etc.
Task Definition
As the first step, short text conversation (STC) is defined as an IR task, i.e., retrieval-based STC. A repository of post-comment pairs from Sina Weibo is prepared. Each participating team receives the repository in advance.
1. In the training period, they can build their own conversation system based on IR technologies, using the given post-comment pairs as training data.
2. In the test period (one week), each team is given 50-100 test queries (posts) that have been held out from the repository. Each team is asked to provide a ranked list of ten results (comments) for each query; the comments must come from the repository.
3. In the evaluation period, the results from all the participating teams are pooled and labeled. Graded relevance IR measures are used for evaluation.
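To make the evaluation step concrete: a common graded-relevance measure is nDCG, sketched below. This is only an illustration of the family of measures mentioned, not necessarily the official STC metric, and the cutoff of ten matches the ranked-list length in the task definition.

```python
# Illustrative graded-relevance measure (nDCG@10) of the kind used to
# score pooled runs; the official STC metrics may differ.
import math

def ndcg(run_gains, ideal_gains, k=10):
    """run_gains:   graded relevance of returned comments, in rank order.
    ideal_gains: all judged gains for the query (any order).
    Returns DCG of the run normalized by the DCG of an ideal ranking."""
    def dcg(gains):
        # Discount each gain by the log of its rank position.
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = dcg(sorted(ideal_gains, reverse=True))
    return dcg(run_gains) / ideal if ideal > 0 else 0.0
```

A run that returns the judged comments in perfect order scores 1.0; placing relevant comments lower in the list reduces the score logarithmically.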
The original Web texts are in Chinese, and we provide word segmentation results. Furthermore, to help non-Chinese participants, we provide English machine translations of the original texts. Non-native speakers can get a rough idea of the content from the translations and can still participate in the task.
Last Modified: 2015-07-24