NTCIR-15 Conference Keynote 1
Date: December 9th (Wed), 2020
(Time: 11:00 - 12:00 (JST), 2:00 - 3:00 (GMT), Dec 8, 21:00 - 22:00 (EST))
Title: From Offline to Online Experimentation: Considerations from Experiences at Spotify
Speaker: Ben Carterette (Spotify, ACM SIGIR chair)
The history of experimenting on information access systems using offline test collections---the Cranfield paradigm---goes back many decades and is a major aspect of scientific progress in search and IA. Its wide-scale adoption has been driven in part by its robustness and ease of use, in part by evaluation workshops like NTCIR, and in part by the emergence of new information access scenarios and problems that can adapt it. Despite that, there is a lot we still don't know about the ability of offline experiments to predict online outcomes with real users in real-world conditions. In this talk I discuss a common framework for thinking about experimentation and connect it to both offline Cranfield experiments and online A/B testing. Using examples from Spotify search and recommendation, I show how offline experiments motivate online development and vice versa. Developing a better understanding of how offline experiments translate into online experiences will be key as approaches from our research continue to be adopted into real-world technology.
Ben Carterette is a Senior Research Manager at Spotify, where he leads a team of research scientists investigating problems such as ML for search and recommendation, offline and online experimentation, user models of consumption and satisfaction, and models of music and podcast content. He was formerly an Associate Professor at the University of Delaware, where he maintains an affiliated position. He has published over a hundred papers in information retrieval and data mining venues and been co-author of five Best Paper Award-winning works. With collaborators, he has co-organized 12 TREC tracks on various topics, and has served as a standing member of the TREC PC and as an NTCIR PC member for several years. Dr. Carterette is currently serving as Chair of the ACM SIGIR Executive Committee.
NTCIR-15 Conference Keynote 2
Date: December 10th (Thu), 2020
(Time: 20:00 - 21:00 (JST), 11:00 - 12:00 (GMT), 6:00 - 7:00 (EST))
Title: Reproducibility, Replicability and Reliability: Reflections of a Statistician and a Data Science Editor
Speaker: Xiao-Li Meng (the Whipple V. N. Jones Professor of Statistics, Harvard University, and the Founding Editor-in-Chief of Harvard Data Science Review)
The terms reproducibility and replicability have been used interchangeably by some scientific communities and by media, and with the opposite meanings by others, causing much confusion. The 2019 report on “Reproducibility and Replicability in Science” issued by the US National Academies of Sciences, Engineering, and Medicine (NASEM) made an important contribution by delineating the two terms, equating reproducibility with computational reproducibility and replicability with scientific replicability. However, neither of them in itself can guarantee reliability. Reliability does not imply absolute truth, but it does require that our findings can be triangulated, can pass reasonable stress tests and fair-minded sensitivity tests, and do not contradict the best available theory and scientific understanding, unless the findings are designed to challenge the existing common wisdom. The quality of data and information plays a far more important role than their quantity in ensuring reliability. This talk reflects on these issues based on my statistical research on quantifying the quality of big data, and on my experience as the founding Editor-in-Chief of Harvard Data Science Review (HDSR), which has provided me with a much broader data science perspective. Along the way, using US election prediction and COVID-19 testing as two recent examples, I will demonstrate how small our big data are when we take into account their quality.
Xiao-Li Meng, the Whipple V. N. Jones Professor of Statistics, and the Founding Editor-in-Chief of Harvard Data Science Review, is well known for his depth and breadth in research, his innovation and passion in pedagogy, his vision and effectiveness in administration, as well as for his engaging and entertaining style as a speaker and writer. Meng was named the best statistician under the age of 40 by COPSS (Committee of Presidents of Statistical Societies) in 2001, and he is the recipient of numerous awards and honors for his more than 150 publications in at least a dozen theoretical and methodological areas, as well as in areas of pedagogy and professional development. In 2020, he was elected to the American Academy of Arts and Sciences. He has delivered more than 400 research presentations and public speeches on these topics, and he is the author of “The XL-Files,” a thought-provoking and entertaining column in the IMS (Institute of Mathematical Statistics) Bulletin. His interests range from the theoretical foundations of statistical inferences (e.g., the interplay among Bayesian, Fiducial, and frequentist perspectives; frameworks for multi-source, multi-phase and multi-resolution inferences) to statistical methods and computation (e.g., posterior predictive p-value; EM algorithm; Markov chain Monte Carlo; bridge and path sampling) to applications in natural, social, and medical sciences and engineering (e.g., complex statistical modeling in astronomy and astrophysics, assessing disparity in mental health services, and quantifying statistical information in genetic studies). Meng received his BS in mathematics from Fudan University in 1982 and his PhD in statistics from Harvard in 1990. He was on the faculty of the University of Chicago from 1991 to 2001 before returning to Harvard, where he served as the Chair of the Department of Statistics (2004-2012) and the Dean of the Graduate School of Arts and Sciences (2012-2017).
Date: December 11th (Fri), 2020
(Time: 17:05 - 17:15 (JST), 8:05 - 8:15 (GMT), 3:05 - 3:15 (EST))
Title: TREC in 2020
Speaker: Ellen Voorhees (NIST, ACM Fellow)
Abstract:
To state the obvious, 2020 was an unusual year. The pandemic that caused so many changes large and small impacted the Text REtrieval Conference (TREC), too: TREC implemented its first extra-curricular track (TREC-COVID); both relevance assessing and the conference itself were required to be remote; and many existing TREC tracks either pivoted to be completely focused on COVID-19 or included some topics regarding it. This talk will describe implementing TREC in 2020 and highlight the outcomes of its eight tracks.
Biography:
Ellen Voorhees is a Senior Research Scientist at the US National Institute of Standards and Technology (NIST). Her primary responsibility at NIST is to manage the Text REtrieval Conference (TREC) project, a project that develops the infrastructure required for large-scale evaluation of search engines and other information access technology. Voorhees' research focuses on developing and validating appropriate evaluation schemes to measure system effectiveness for diverse user tasks.
Voorhees received a B.Sc. in computer science from the Pennsylvania State University, and M.Sc. and Ph.D. degrees in computer science from Cornell University. Prior to joining NIST she was a Senior Member of Technical Staff at Siemens Corporate Research in Princeton, NJ, where her work on intelligent agents applied to information access resulted in three patents. She is a fellow of the ACM, a member of AAAI, and has been elected as a fellow of the Washington Academy of Sciences. She has published numerous articles on information retrieval techniques and evaluation methodologies and serves on the review boards of several journals and conferences.
Date: December 11th (Fri), 2020
(Time: 17:15 - 17:25 (JST), 8:15 - 8:25 (GMT), 3:15 - 3:25 (EST))
Title: What's happening in CLEF and what's the Covid-19 @ MLIA Initiative
Speaker: Nicola Ferro (University of Padua)
Abstract:
The initial part of this talk will discuss the achievements and happenings in CLEF, the European initiative whose main mission is to promote research, innovation, and development of information access systems with an emphasis on multilingual and multimodal information with various levels of structure. We will focus on the just concluded CLEF 2020 edition (https://clef2020.clef-initiative.eu/) and the just started CLEF 2021 edition (http://clef2021.clef-initiative.eu/).
The second part of the talk will present the Covid-19 MLIA @ Eval initiative (http://eval.covid19-mlia.eu/), a voluntary effort supported and promoted by several communities, among which CLEF. Covid-19 MLIA @ Eval is a community effort to boost the development of (language) resources and Multilingual Information Access (MLIA) systems specifically tailored to Covid-19. In particular, we organise evaluation tasks that steer the development of systems and resources in the following areas: information extraction, multilingual semantic search, and machine translation.
Biography:
Nicola Ferro (http://www.dei.unipd.it/~ferro/) is full professor of computer science at the University of Padua, Italy. His research interests include information retrieval, its experimental evaluation, multilingual information access and digital libraries, and he has published more than 350 papers on these topics. He is co-organizer of the Covid-19 MLIA @ Eval initiative and he is the chair of the CLEF evaluation initiative, which involves more than 200 research groups world-wide in large-scale IR evaluation activities. He was the coordinator of the EU 7FP Network of Excellence PROMISE on information retrieval evaluation. He is associate editor of ACM TOIS and was general chair of ECIR 2016, and short papers program co-chair of ECIR 2020.
Date: December 11th (Fri), 2020
(Time: 17:25 - 17:35 (JST), 8:25 - 8:35 (GMT), 3:25 - 3:35 (EST))
Title: MediaEval 2020 Multimedia Benchmarking Initiative
Speaker: Gareth Jones (Dublin City University)
Abstract:
MediaEval is a multimedia benchmarking initiative which seeks to evaluate new algorithms for multimedia access and retrieval. MediaEval emphasizes the "multi" in multimedia, including tasks combining various facet combinations of speech, audio, visual content, tags, users, and context. MediaEval innovates new tasks and techniques focusing on the human and social aspects of multimedia content in a community-driven setting. The initiative provides a platform for researchers to organize benchmark tasks within a planned annual timeline and to report results at an end-of-campaign workshop. This presentation will give an overview of the objectives of the MediaEval campaigns and summarize current activities within MediaEval 2020.
Biography:
Gareth Jones conducts research on multiple topics in information retrieval, including multimedia, multilingual and personal content across a wide range of application areas. Over the last 20 years he has published hundreds of papers describing this work at multiple venues. Much of his research encompasses the design of tasks for the evaluation of this research, including test collections and evaluation metrics. Since 2002 he has been responsible for the organisation of international benchmarking tasks at venues including CLEF, FIRE, NTCIR and TRECVid. In 2010, together with Martha Larson, Radboud University, The Netherlands, he co-founded the MediaEval Multimedia Benchmarking initiative to provide a platform for the development and evaluation of novel tasks in multimedia indexing and search. Gareth has served as co-Programme Chair for ECIR 2011, Information Retrieval Chair for ACM CIKM 2010, and co-Chair of ACM SIGIR 2013 and CLEF 2017 (with MediaEval 2017), both hosted in Dublin.
Gareth is a faculty member of the School of Computing, Dublin City University (DCU), Ireland and a Principal Researcher in the SFI ADAPT Centre. He holds B.Eng. and PhD degrees from the University of Bristol, UK. He has previously held posts at the University of Cambridge and University of Exeter, U.K., and in 1997 was a Toshiba Fellow at the Toshiba Corporation Research and Development Center in Kawasaki, Japan.
Date: December 11th (Fri), 2020
(Time: 17:35 - 17:55 (JST), 8:35 - 8:55 (GMT), 3:35 - 3:55 (EST))
Title: NTCIR-Lifelog - A Journey in Collaborative Benchmarking
Speaker: Cathal Gurrin (Dublin City University)
Abstract:
Technology advances mean that we can now gather detailed multimedia traces that model our life activities; these are called lifelogs. When research began in this domain over fifteen years ago, it was noticeable that there was little understanding of how lifelogs can positively impact the individual or society, nor did our research community understand how lifelogs can be organised and indexed to provide effective retrieval facilities. This talk will provide an overview of the outputs of the NTCIR-Lifelog task. The motivation for proposing the NTCIR-Lifelog task and the progress made by running this task three times at NTCIR will be discussed, along with the wider impact of NTCIR-Lifelog in terms of related benchmarking activities. Finally, future plans for next-generation lifelog-related tasks at NTCIR will be proposed along with a possible roadmap for the future years.
Biography:
Cathal Gurrin is an associate professor and deputy department head at the School of Computing, at Dublin City University (DCU), Ireland and he is an investigator at the Insight Centre for Data Analytics and the Adapt Centre, both at DCU. His research interests are personal analytics and lifelogging, which integrate personal sensing, computer science, cognitive science and data-driven healthcare analytics to realise the next generation of digital records for the individual. He regularly speaks at Quantified Self events and his research has been featured internationally on Discovery Channel, BBC, NHK, as well as in the Economist magazine and the New York Times, among many others. He has been the General Chair of ECIR 2011, MMM 2014, MB2016, MMM2017, CBMI2019 and ICMR2020. He is the author of Lifelogging: Personal Big Data in the FNTIR series.
Last modified: 2020-12-07