Keynote 1
Date: DAY-2, June 11th (Wed), 2025 (Time: 10:15 - 11:15)
Title: Measuring the Generative Information Retrieval Universe
Speaker: Maarten de Rijke (University of Amsterdam)
Generative information retrieval is a promising neural retrieval paradigm that integrates all information in a corpus into a single, consolidated model. It formulates document retrieval as a document identifier (docid) generation task, allowing for end-to-end optimization toward a unified global retrieval objective. The generative information retrieval paradigm comes with a range of interesting evaluation questions. How can we gain insights into the inner workings of such end-to-end learnable pipelines? How do we know whether the retrieval model has "indexed" a corpus correctly? A generative retrieval model's interconnected nature means that even small errors or changes in one component can lead to outsized impacts on overall performance, making the debugging process more complex and time-consuming. How do we establish some level of trustworthiness, in terms of reliability, resilience, and reproducibility? Typically formulated as a sequence-to-sequence learning problem, generative information retrieval lends itself naturally to combinations with a range of long-term optimization goals that go beyond short-term accuracy-based retrieval success. What are meaningful ways of probing and assessing generative information retrieval models that are being trained for long-term beyond-accuracy goals? In the talk I will present a range of evaluation challenges related to generative information retrieval, mostly questions, with some early and partial answers.
Maarten de Rijke is a Distinguished University Professor of Artificial Intelligence and Information Retrieval at the University of Amsterdam. His research is focused on designing and evaluating trustworthy technology to connect people to information, particularly search engines, recommender systems, and conversational assistants. He is also co-founder and the scientific director of the Innovation Center for Artificial Intelligence (ICAI), a national collaboration between academic, industrial, governmental, and societal stakeholders aimed at talent development, research, and impact in AI.
Keynote 2
Date: DAY-4, June 13th (Fri), 2025 (Time: 13:30 - 14:30)
Title: Things We Know That Aren’t (Always) True
Speaker: Douglas W. Oard (University of Maryland)
Those of us who work on Information Retrieval (IR) share several common perspectives that shape our work. First, we assume that the searcher knows what they are looking for. Well, we know that’s not always true, but it’s been a useful assumption. Second, we assume that what we’re searching for has some digital representation, whether it was born digital, digitized, or simply described using digital metadata. The NTCIR-18 SUSHI task this year makes it clear that’s not always true, but that’s not the first such case. Third, we assume that if an IR system can find things, that its job is to provide them. But of course there are cases in which some searchers should not see some things. Fourth, we assume that the searcher can recognize what they are looking for; that’s why we often cast our task as creating a ranked list. But we know that’s not always true, as in (for example) spoken interaction, where ranked lists aren’t too helpful. Fifth, we expect that the searcher will be satisfied once they have found what they are looking for. But we know from our work on Cross-Language IR, for example, that there are cases in which we can find things that the searcher simply can’t make sense of. In this talk, I will argue that there is much to be learned from these kinds of cases where our common perspectives break down, and I will suggest that in our new era of generative large language models it can be useful to think broadly about these kinds of questions.
Douglas W. Oard is a Professor and the incoming Interim Dean at the University of Maryland College of Information. Over his three decades of research in information retrieval he has worked on cross-language IR, speech retrieval, and search among sensitive content, among other “off the beaten path” topics; essentially this is a talk that has been 30 years in the making. And as a Visiting Professor at NII since 2014, he has been pleased to contribute to the development of NTCIR as one of the world’s premiere evaluation venues for information access technologies.
Invited Talks
NTCIR-18 Conference Invited talk 1
Date: DAY-4, June 13th (Fri), 2025 (Time: 14:30 - )
Title: TREC (provisional)
Speaker: Ian Soboroff (NIST, USA)
NTCIR-18 Conference Invited talk 2
Date: DAY-4, June 13th (Fri), 2025 (Time: 14:30 - )
Title: MediaEval 2025 Multimedia Benchmarking Initiative
Speaker: Gareth Jones (Dublin City University)
MediaEval is a multimedia benchmarking initiative which seeks to evaluate new algorithms for multimedia access and retrieval. MediaEval emphasizes the "multi" in multimedia, including tasks combining various facet combinations of speech, audio, visual content, tags, users, and context. MediaEval innovates new tasks and techniques focusing on the human and social aspects of multimedia content in a community driven setting. The initiative provides a platform for researchers to organize benchmark tasks within a planned annual timeline and to report results at an end of campaign workshop. MediaEval 2025 marks the 16th anniversary of the foundation of MediaEval. This presentation will briefly outline the goals and logistics of the MediaEval campaigns and introduce the tasks running at MediaEval 2025.
Gareth Jones conducts research on multiple topics in information retrieval, including multimedia, multilingual, and personal content, across a wide range of application areas. Over the last 20 years he has published hundreds of papers describing this work at multiple venues. Much of his research encompasses the design of tasks for the evaluation of this research, including test collections and evaluation metrics. Since 2002 he has been responsible for the organisation of international benchmarking tasks at venues including CLEF, FIRE, NTCIR, TREC, and TRECVid. In 2010, together with Martha Larson, Radboud University, The Netherlands, he co-founded the MediaEval Multimedia Benchmarking initiative to provide a platform for the development and evaluation of novel tasks in multimedia indexing and search. Gareth has served as co-Programme Chair for ECIR 2011, Information Retrieval Chair for ACM CIKM 2010, and co-Chair of ACM SIGIR 2013, CLEF 2017 (with MediaEval 2017), and Interspeech 2023, all hosted in Dublin.
Gareth is a faculty member of the School of Computing, Dublin City University (DCU), Ireland and a Principal Investigator in the Research Ireland ADAPT Centre. He holds B.Eng. and PhD degrees from the University of Bristol, UK. He has previously held posts at the University of Cambridge and University of Exeter, U.K., and in 1997 was a Toshiba Fellow at the Toshiba Corporation Research and Development Center in Kawasaki, Japan.
Panel
Date: DAY-2, June 11th (Wed), 2025 (Time: 16:30 - 17:30)
Title: LLMs and offline test collections: a dangerous distraction or a vital new tool?
Moderator: Mark Sanderson (RMIT University)
Panelists: Charlie Clarke (University of Waterloo, Canada),
Mark Sanderson (RMIT University, Australia),
Ian Soboroff (NIST, USA) and
Rikiya Takehi (Waseda University, Japan)
The research field of building offline test collections was turned on its head 2 1/2 years ago when researchers started to show that LLMs had the potential to judge the relevance of documents for an offline testing data set. This discovery prompted a flurry of research both utilising LLMs in this new role and investigating the abilities and limits of LLMs. This panel will both reflect on the existing research on using LLMs for relevance judgment and discuss other potential uses of this powerful modelling system for creating or augmenting other parts of offline testing.
The panel will consider the following three topics.
・Using LLMs for relevance
・Using LLMs to generate queries
・Using LLMs for other aspects of offline evaluation, such as simulation of user interactions
Last modified: 2025-06-09