Invited Talks

Speaker: Qingyao Ai, Tsinghua University, China

Title: Asking Clarifying Questions with Large Language Models

Abstract

Traditional search engines often struggle with ambiguity in underspecified or faceted queries. By leveraging large language models (LLMs), conversational search systems can dynamically generate and refine clarifying questions that address user intent more effectively. In this talk, we discuss the challenges and recent advances in generating clarifying questions with LLMs to improve conversational search systems. Key areas of focus include overcoming cold-start problems, mitigating data biases, and optimizing the utility of generated questions. Specifically, we introduce our recent work on constrained generation, knowledge augmentation, and aspect extraction from top retrieved documents to generate clarifying questions with LLMs for conversational search; these methods have demonstrated promising results in enhancing both question quality and retrieval performance. The findings highlight the pivotal role of LLMs in the future development of more intuitive and effective conversational search systems.

Biography

Qingyao Ai is an assistant professor at the Department of Computer Science and Technology, Tsinghua University. His research mainly focuses on solving IR problems with machine learning techniques. He has worked extensively on the construction of intelligent IR systems with deep neural networks, unbiased learning theories, and large language models. Qingyao Ai has served as the general co-chair of SIGIR-AP 2023, the program co-chair of NTCIR-18, and as a senior PC member or area chair for SIGIR, CIKM, WWW, EMNLP, COLING, etc. He has published more than a hundred papers in IR conferences and journals since 2017, and has received multiple awards, including the ACM SIGIR Early Researcher Award, the Google Research Scholar Award, the ACM SIGIR 2024 Best Paper Award, and a SIGIR-AP'23 Best Paper Honorable Mention.

Speaker: Mohammad Aliannejadi, University of Amsterdam, The Netherlands

Title: LLMs and IR Evaluation: Towards Building Reusable Conversational Test Collections

Abstract

The IR community is moving fast towards leveraging LLMs in the evaluation pipeline: from LLM-assisted human annotation to fully LLM-based evaluation. Although LLMs provide countless opportunities in this area, there are reasonable concerns about such a fast-paced transition, mainly regarding their reliability and the risk of transferring their learned biases into the evaluation phase. In this talk, I will review recent advances in LLM-based evaluation and discuss the potential risks and opportunities in this area. I will describe the observations we have made as part of the TREC Interactive Knowledge Assistance Track (iKAT), where we find that LLMs can be used to fill missing judgments in the human-assessed relevance pool. Furthermore, I will give an overview of our findings from the LLMJudge challenge, where we organized a shared task on LLM-based relevance assessment and welcomed a diverse set of contributions from the community. Finally, I will discuss approaches to nugget-based evaluation of generated text aimed at facilitating reusable evaluation of generated content.

Biography

Mohammad is an Assistant Professor at IRLab, University of Amsterdam. His research interests include conversational information access, evaluation, and recommender systems. He is an active member of the IR community, regularly publishing and serving as a (senior) PC member in top venues such as SIGIR, CIKM, WSDM, TheWebConf, ECIR, ACL, EMNLP, NAACL, and others. Mohammad has co-organized various evaluation campaigns, including TREC CAsT, TREC iKAT, LLMJudge, ConvAI3, XMRec, and IGLU, focusing on diverse aspects of user interaction with conversational agents and recommender systems.

Speaker: Johanne Trippas, RMIT University, Australia

Title: Re-evaluating the Command-and-Control Paradigm in Conversational Search Interactions

Biography

Dr. Johanne Trippas is a Vice-Chancellor's Research Fellow at RMIT University, Australia. Her work focuses on developing next-generation capabilities for intelligent systems, including spoken conversational search, digital assistants in cockpits, and artificial intelligence for identifying cardiac arrests. Dr. Trippas is particularly interested in how conversational systems can revolutionise information seeking, especially through generative interactive information retrieval and novel interfaces beyond traditional text search. She has extensive experience analysing human information-seeking behaviour and developing novel approaches to personalised intelligent assistance, consistently adopting user-centric data capture and analysis methods with a focus on modelling and profiling human behaviours. Additionally, she is a member of the NIST TREC program committee and the ACM CHIIR steering committee, and Vice-chair of the SIGIR Artifact Evaluation Committee. Dr. Trippas serves in organizational roles at numerous information retrieval events, including SWRIL'25, as resource & reproducibility chair (ACM SIGIR'25), workshop chair (ACM CHIIR'25), and technical program chair (ACM CHIIR'26).