Keynote
Date: December 6th (Wed), 2017 (Time: 10:00 a.m. - 11:00 a.m.)
Location: Hitotsubashi-hall, NII, Tokyo, Japan
Title: The practice of crowdsourcing: things to know about using humans and machines for labeling
Speaker: Omar Alonso (Microsoft)
Many data science applications that use machine learning techniques depend on humans to provide the initial data set so that algorithms can process the rest, or to evaluate the performance of such algorithms. Not only can labeled data for training and evaluation be collected faster, cheaper, and easier than ever before, but we now see the emergence of novel infrastructure that combines computations performed by humans and machines. Building these labeling pipelines remains difficult, and these difficulties need to be addressed by practitioners and researchers to advance the state of the art. In this talk, I’ll outline things that work in practice and describe a number of trade-offs when designing and implementing computation systems that use humans and machines.
Omar is a Principal Data Scientist Lead at Microsoft in Silicon Valley, where he works at the intersection of social networks, temporal information, knowledge graphs, and human computation. He has shipped many features for Bing and other Microsoft properties. He is the co-chair of DESIRES, a new IR system-oriented conference.
EVIA 2017 Keynote
Date: December 5th (Tue), 2017 (Time: 13:00 - 13:50)
Location: Hitotsubashi Conference Room 2-4, NII, Tokyo, Japan
Title: Cross-Language Information Retrieval in the MATERIAL Program
Speaker: Douglas W. Oard (University of Maryland, USA)
In this talk I will describe a research program called MAchine Translation for English Retrieval of Information in Any Language (MATERIAL) that includes a substantial focus on Cross Language Information Retrieval (CLIR). Over four years, this program expects to build new CLIR test collections for ten new languages, in each case with English queries. Novel aspects of these test collections will include (1) domain-limited, sense-specific, and morphology-specific queries, and (2) mixed collections including both text and speech. Two novel aspects of the evaluation design are a focus on set-based rather than ranked retrieval, and the use of a linear utility measure for evaluating result set selection. The MATERIAL program also includes an interactive CLIR evaluation in which assessors use system-generated English summaries in an effort to identify the truly relevant documents in the result set. In this talk I will start by walking through these evaluation design issues, and then I will offer our initial thoughts on the consequences of these evaluation choices for our system designs. Additional information on the MATERIAL program is available at https://www.iarpa.gov/index.php/research-programs/material.
Douglas Oard is a Professor at the University of Maryland, College Park (USA), with joint appointments there in the College of Information Studies (Maryland’s iSchool) and the University of Maryland Institute for Advanced Computer Studies (UMIACS). He is also a visiting professor at the National Institute of Informatics (Japan). Dr. Oard’s research interests center around the use of emerging technologies to support information seeking by end users. Additional information is available at http://terpconnect.umd.edu/~oard/.
NTCIR 20th Anniversary Session
NTCIR 20th Anniversary Session: Invited Talk 1
Date: December 6th (Wed), 2017 (Time: 18:00 - 18:30)
Location: Hitotsubashi-hall, NII, Tokyo, Japan
Title: NTCIR from the Beginning: A Personal Research Journey
Speaker: Fredric C Gey (University of California, Berkeley)
My participation in the first NTCIR workshop, starting in 1997, led to a fifteen-year research journey. It was a journey through Kanji, Katakana, Hiragana, segmentation, bigrams, phonetic recognition, decompounding, and parallel corpora alignment for lexicon development. In terms of languages, the travels went through Japanese, Chinese, Korean, European languages, Russian, Arabic, and Hindi, as well as mathematics. It produced three SIGIR workshops (2002, 2006, 2009) on cross-language search and multilingual information access, as well as a special issue of Information Processing and Management on cross-language information retrieval. In later years the research direction changed toward geographic and geo-temporal information retrieval evaluation. This talk will cover highlights of my personal research journey and pay tribute to colleagues and students with whom I have been fortunate enough to collaborate.
After receiving a Master's degree in Mathematics from UC Berkeley in 1964, Fredric Gey worked for 3 1/2 years at Bell Laboratories (later AT&T Labs Research). In 1967 he returned to Berkeley and worked for 21 years as a staff scientist at Lawrence Berkeley Laboratory in the Computer Science Research Department. In 1989 he returned to the Berkeley campus as Data Archivist and Librarian for Social Science and Health Statistics for UC Berkeley, while simultaneously pursuing a PhD in Information Science, which was conferred in 1993. His dissertation, "Probabilistic Dependence and Logistic Inference in Information Retrieval", developed the first of several logistic regression ranking models which have stood the test of time. In 1996 he received a USA National Science Foundation Grant to develop logistic regression search models. In 1998 he turned the research direction of this grant toward cross-language information retrieval, leading to participation in the first NTCIR workshop in 1997-1999, and in 2000 to CLEF, the Cross Language Evaluation Forum for European languages. In addition to multilingual information access, he has also done research and development in nuclear forensics, geographic information retrieval, digital humanities, and social science information systems. He was the General Chair of ACM-SIGIR 1999, the 22nd International Conference on Research and Development in Information Retrieval.
NTCIR 20th Anniversary Session: Invited Talk 2
Date: December 6th (Wed), 2017 (Time: 18:30 - 19:00)
Location: Hitotsubashi-hall, NII, Tokyo, Japan
Title: NTCIR in the World: Two Decades of Impact
Speaker: Douglas W. Oard (University of Maryland, USA)
In the last part of the twentieth century, the idea of shared task evaluation emerged as a significant force shaping information retrieval research. In this talk, I will start by tracing the evolution of that idea as it moved around the world, from its inception at Cambridge, through its incubation at the Text Retrieval Conference, to its present incarnation at NTCIR. I’ll then look back over the history of NTCIR to highlight some of the impactful and innovative evaluation tasks that have been invented there, exploring the impact of each from national, regional, or global perspectives. I’ll wrap up with a few remarks on how the role of shared task evaluation in information retrieval research is evolving today, and what that might suggest for the future impact of NTCIR.
Douglas Oard is a Professor at the University of Maryland, College Park (USA), with joint appointments there in the College of Information Studies (Maryland’s iSchool) and the University of Maryland Institute for Advanced Computer Studies (UMIACS). He is also a visiting professor at the National Institute of Informatics (Japan) and a former general chair of NTCIR. Dr. Oard’s research interests center around the use of emerging technologies to support information seeking by end users. Additional information is available at http://terpconnect.umd.edu/~oard/.
Date: December 8th (Fri), 2017 (Time: 13:50 - 14:20)
Location: Hitotsubashi Conference Room 1 & 2, NII, Tokyo, Japan
Title: New Tracks for TREC 2018
Speaker: Ian Soboroff (National Institute of Standards and Technology, NIST)
This talk will, time permitting, highlight new and interesting evaluation activities hosted at the US National Institute of Standards and Technology (NIST).
Dr. Ian Soboroff is the leader of the Retrieval Group at NIST. His research is in the area of IR test collections and their experimental limits.
Date: December 8th (Fri), 2017 (Time: 13:50 - 14:20)
Location: Hitotsubashi Conference Room 1 & 2, NII, Tokyo, Japan
Title: CLEF 2018: Evaluation Labs, Conference, and Initiative
Speaker: Nicola Ferro (Department of Information Engineering, University of Padua, Italy)
Since 2010, a radical renewal and innovation process has been taking place in the Cross Language Evaluation Forum (CLEF). CLEF became an independent event constituted by (i) Evaluation Labs, i.e. laboratories to conduct evaluation of information access systems and workshops to discuss and pilot innovative evaluation activities; and (ii) a peer-reviewed Conference on a broad range of issues, including investigation continuing the activities of the Evaluation Labs; experiments using multilingual and multimodal data, in particular, but not only, data resulting from CLEF activities; and research in evaluation methodologies and challenges. This process led to the establishment of the CLEF Initiative, with a charter describing its scope and aims and a new organizational structure. This talk will thus discuss the achievements and happenings in CLEF 2018, the activities conducted in the Evaluation Labs, the discussions held during the Conference, and the new organization of CLEF.
Nicola Ferro is an associate professor in computer science at the University of Padua, Italy. His research interests include information retrieval, its experimental evaluation, multilingual information access, and digital libraries. He is the coordinator of the CLEF evaluation initiative, which involves more than 200 research groups world-wide in large-scale IR evaluation activities. He was the coordinator of the EU Seventh Framework Programme Network of Excellence PROMISE on information retrieval evaluation.
Last modified: 2017-11-30