NTCIR (NII Test Collection for IR Systems) Project




NTCIR-8 Meeting
Invited Talks

NTCIR-8 Workshop Meeting
Invited Talk

June 16 Wednesday, 10:30 - 11:30

Title: "Building Watson: A Grand Challenge in Automatic NL Question Answering"
David A. Ferrucci (IBM Research) and Koichi Takeda (IBM Research)

Deep QA project:

Extended Abstract: On April 27, 2009, IBM unveiled the details of a project to build an advanced computing system able to compete with humans at the game of Jeopardy! Computer systems that can directly and accurately answer people's questions over a broad domain of human knowledge have been envisioned by scientists and writers since the advent of computers themselves. Consider, for example, the computer on Star Trek: it understands questions, quickly provides accurate, customized answers, and can engage in a fluent information-seeking dialog with the user. We call this technology open-domain question answering, and it has tremendous promise for impacting society and business. Applications in business intelligence, health care, customer support, enterprise knowledge management, social computing, science and government would all benefit from such technology. Project Watson addresses a grand challenge in Computer Science: illustrating how the integration and advancement of Natural Language Processing (NLP), Information Retrieval (IR), Machine Learning (ML), massively parallel computation and Knowledge Representation and Reasoning (KR&R) can advance open-domain automatic Question Answering to a point where it clearly and consistently rivals the best human performance. An exciting proof point in this challenge is to develop a computer system that can successfully compete against top human players at the well-known Jeopardy! quiz show. Attaining champion-level performance at the game of Jeopardy! requires a computer system to rapidly and accurately answer challenging open-domain questions, and to predict its own performance on any given category or question. The system must deliver high degrees of precision and confidence over a very broad domain with a 3-second response time. It is highly unlikely that any system will be able to justify all of its answers with perfect certainty over such a broad range of natural language questions and content. Computing accurate confidences is therefore an important requirement for determining when to "buzz in" against your competitors and how much to bet. While critical for winning the game, high precision and accurate confidence computations are just as critical for a QA system to provide real value in business settings. The need for speed and for very high precision demands a massively parallel compute platform capable of generating and evaluating thousands of hypotheses and their associated evidence. In this talk we will introduce the audience to the Jeopardy! Challenge and describe our technical approach and progress on this grand-challenge problem.

Biography: Dr. David Ferrucci is a Research Staff Member and Department Group Manager at IBM's T.J. Watson Research Center, where he leads the Semantic Analysis and Integration department. His team's mission is focused on natural language and semantic processing for discovering relevant knowledge in text-based sources and leveraging the results in a wide range of intelligent discovery and information management solutions. Currently, Dr. Ferrucci is the Principal Investigator (PI) for the DeepQA (www.ibm.com/deepQA) project, an exploratory research project focused on advanced Automatic Question Answering and building Watson, the computer system that will challenge top-ranking Jeopardy! players. The advances he and his team of 25 world-class NLP, IR, ML and Knowledge Representation researchers and software engineers have achieved promise to advance a wide range of content analysis and knowledge discovery applications in business intelligence, healthcare, customer support, enterprise knowledge management, social computing, science and government. As chief architect for UIMA, Dr. Ferrucci led the design of a scalable software architecture and framework that provides a foundation for integrating and accelerating advanced text and multi-modal analytics for a broad range of content analysis solutions. The UIMA framework has been contributed to open source (http://incubator.apache.org/uima/index.html), and UIMA has recently become an interoperability standard under his chairmanship (www.oasis-open.org/committees/uima). UIMA's use and influence continue to grow, particularly in medical and government applications. UIMA also provides the technical infrastructure for the DeepQA project. Dr. Ferrucci is also the Co-PI on DARPA's Machine Reading Program, where he is leading a broad team of IBM and university scientists in developing systems capable of extracting complex and rich knowledge representations from reading natural language text documents. Throughout his academic and professional career, Dr. Ferrucci has focused on designing computer systems to help people discover, represent, integrate, manage, and apply real-world knowledge. He has published in the fields of logic and knowledge representation, architectures for natural language engineering, document configuration, automatic question answering and story generation.

Dr. Koichi Takeda is a Senior Technical Staff Member at IBM Research - Tokyo, where he leads text and speech technology groups. Since joining IBM in 1983, he has worked on many projects in natural language processing, information visualization, and text mining. He was a visiting scientist at Carnegie Mellon University (CMU) in 1987-1989, assigned to the CMU-IBM joint project on English-Japanese knowledge-based machine translation. He then proposed a pattern-based machine translation approach for Internet Web pages, which was developed into IBM Japan's first machine translation product in 1996. In 2002, he led the text mining project for analyzing the entire set of MEDLINE citations of biomedical journals, and successfully implemented the system jointly with Celestar Lexico-Sciences, Inc. His current interests include insight discovery from various structured and unstructured information sources, in particular electronic medical records for healthcare analytics, and aggregation of information by applying question answering techniques. He is a member of the global DeepQA project.

EVIA 2010
Invited Talk

June 15 Tuesday, 13:00

Title: "Microsoft's Bing and User Behavior Evaluation"
John Nave (Principal Development Manager, Search Technology Center, Microsoft)

Abstract: Microsoftfs Bing takes a different approach to web search. Why do we call Bing a gdecision engineh and what are the unique benefits this approach provides to users? This talk will describe the vision and design principles of Bing and how this translates into what we build. Behind the code there is a great deal of customer research and a few fundamental insights that motivate our designs. Evaluation of user behaviors suggest that there are many areas were they could be getting much more benefit from a web search engine. The vision for Bing is global in nature, but what about searchers in Japan? We will cover the areas where searchers in here appear similar to searchers worldwide, and also touch on areas where their behaviors are unique.

Biography: John Nave is the Principal Development Manager at Microsoft Search Technology Center Japan, leading Bing development for the Japanese market. For the last 14 years he has been developing and shipping products that apply technology to benefit Microsoft's customers. The very first of these products was a Japanese tokenizer, followed by natural language products for several languages. With extensive experience in Japanese language processing and the Japanese market, John is now focused on understanding the needs of customers in Japan and bringing new and more useful solutions to this market. Prior to Microsoft, John worked in finance and software start-up businesses in Japan and the US. He is a graduate of the University of Washington, and attended Keio University as a Monbusho invited scholar. (Many years ago!)

NTCIR-8 Workshop Meeting
Invited Talk - Report Out from other evaluations

June 17 Thursday, 9:15-10:00

Title: "CLEF, CLEF 2010, and PROMISEs:
Perspectives for the Cross-Language Evaluation Forum"
Nicola Ferro (University of Padua, Italy)

June 17 Thursday, 9:15-9:35

Abstract: After ten years of increasingly successful evaluation campaigns, the Cross-Language Evaluation Forum (CLEF) has come to an appropriate moment to assess what has been achieved in this decade and also to consider future directions and how to renew and complement it.

This talk will provide a brief summary of the most significant results achieved by CLEF in the past ten years. It will describe the new format and organization for CLEF, which is being tried for the first time in CLEF 2010. It will also discuss why scientific data should play a central role in the design and planning of an evaluation campaign, and how large-scale evaluation campaigns could adopt interoperable infrastructures to foster the sharing and re-use of such data.

Biography: Nicola Ferro is an assistant professor in Computer Science at the Department of Information Engineering and at the Faculty of Statistical Sciences of the University of Padua, Italy. He teaches courses on Digital Libraries, Information Retrieval, and Databases. He received a Ph.D. degree in Computer Science from the University of Padua in 2005. He holds a Laurea degree in Telecommunications Engineering from the University of Padua. His main research interests are digital libraries and archives, their architectures, interoperability, and evaluation, as well as multilingual information access and its evaluation. He has been involved in the overall coordination of the CLEF (Cross Language Evaluation Forum) evaluation campaigns since 2005. He is scientific leader of the DL.org working group on quality in digital libraries. He is programme co-chair of the CLEF 2010 Conference on Multilingual and Multimedia Information Access Evaluation. He has participated, and continues to participate, in several national and international projects, among which are Europeana Connect (multilingual information access services for Europeana and their evaluation), Europeana v 1.0 (multilinguality and annotations in the Europeana Data Model), TrebleCLEF (best practices, collaboration, and evaluation for multilingual information access systems), TELplus (enhancement of The European Library portal towards Europeana), SAPIR (search in audio visual content using peer-to-peer information retrieval), and DELOS (the European network of excellence on Digital Libraries). He has published more than 60 papers on digital library architectures, interoperability, and services; multilingual information access and its experimental evaluation; and the management of the scientific data produced during evaluation campaigns. He is a member of ACM and IEEE.

Title: "ClueWeb09 and TREC Diversity"
Charles Clarke (University of Waterloo, Canada)

June 17 Thursday, 9:35-9:55

Abstract: The TREC Web Track explores and evaluates Web retrieval technologies. The TREC 2009 Web Track included both a traditional ad hoc retrieval task and a new diversity task. The goal of this diversity task is to return a ranked list of pages that together provide complete coverage for a query, while avoiding excessive redundancy in the result list. Both tasks will continue at TREC 2010, which will also include a new Web spam task. The track uses the ClueWeb09 dataset as its document collection. This collection consists of roughly 1 billion web pages in multiple languages, comprising approximately 25TB of uncompressed data crawled from the general Web during January and February 2009.

For TREC 2009, topics for the track were created from the logs of a commercial search engine, with the aid of tools developed at Microsoft Research.  Given a target query, these tools extracted and analyzed groups of related queries, using co-clicks and other information, to identify clusters of queries that highlight different aspects and interpretations of the target query.  These clusters were employed by NIST for topic development.  For use by the diversity task, each resulting topic is structured as a representative set of subtopics, each related to a different user need.  Documents were judged with respect to the subtopics, as well as with respect to the topic as a whole.

In 2009, a total of 18 groups submitted runs to the diversity task. To evaluate these runs, the task used two primary effectiveness measures: α-nDCG as defined by Clarke et al. (SIGIR 2008) and an "intent aware" version of precision, based on the work of Agrawal et al. (WSDM 2009). Developing and validating metrics for diversity tasks continues to be a goal of the track. For TREC 2010, we will report a number of additional evaluation measures that have been proposed over the past year, including an intent aware version of the ERR measure described by Chapelle et al. (CIKM 2009).
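As a rough illustration of the two primary measures named above, the following Python sketch computes α-nDCG and an intent-aware precision over subtopic judgments. It is not the track's official evaluation code. The function names, the toy judgments, the default α = 0.5, the uniform weighting of subtopics, and the greedy approximation of the ideal ranking (used because finding the exact ideal ordering is computationally hard) are all assumptions made for this example.

```python
import math


def alpha_dcg(ranking, judgments, alpha=0.5, depth=None):
    """Discounted gain where repeated coverage of a subtopic is penalized.

    ranking:   list of document ids, best first
    judgments: dict mapping document id -> set of subtopics it is relevant to
    """
    depth = len(ranking) if depth is None else depth
    seen = {}  # subtopic -> number of higher-ranked documents covering it
    score = 0.0
    for rank, doc in enumerate(ranking[:depth], start=1):
        subtopics = judgments.get(doc, set())
        # Each earlier document covering a subtopic scales its gain by (1 - alpha).
        gain = sum((1.0 - alpha) ** seen.get(s, 0) for s in subtopics)
        for s in subtopics:
            seen[s] = seen.get(s, 0) + 1
        score += gain / math.log2(rank + 1)
    return score


def greedy_ideal(judgments, alpha=0.5, depth=None):
    """Greedy approximation of the ideal ranking (an assumption of this sketch)."""
    remaining = sorted(judgments)  # sorted for deterministic tie-breaking
    depth = len(remaining) if depth is None else min(depth, len(remaining))
    seen = {}
    ideal = []
    while len(ideal) < depth:
        best = max(
            remaining,
            key=lambda d: sum((1.0 - alpha) ** seen.get(s, 0) for s in judgments[d]),
        )
        ideal.append(best)
        remaining.remove(best)
        for s in judgments[best]:
            seen[s] = seen.get(s, 0) + 1
    return ideal


def alpha_ndcg(ranking, judgments, alpha=0.5, depth=None):
    """alpha-DCG of the ranking divided by alpha-DCG of the greedy ideal ranking."""
    depth = len(ranking) if depth is None else depth
    ideal = greedy_ideal(judgments, alpha, depth)
    ideal_score = alpha_dcg(ideal, judgments, alpha, depth)
    if ideal_score == 0.0:
        return 0.0
    return alpha_dcg(ranking, judgments, alpha, depth) / ideal_score


def precision_ia(ranking, judgments, subtopics, depth):
    """Intent-aware precision: per-subtopic precision at depth, averaged uniformly."""
    top = ranking[:depth]
    per_subtopic = [
        sum(1 for d in top if s in judgments.get(d, set())) / depth
        for s in subtopics
    ]
    return sum(per_subtopic) / len(per_subtopic)


# Toy judgments (hypothetical): two subtopics "a" and "b".
judgments = {"d1": {"a"}, "d2": {"a"}, "d3": {"b"}, "d4": set()}
redundant = ["d1", "d2", "d3"]  # covers "a" twice before reaching "b"
diverse = ["d1", "d3", "d2"]    # covers both subtopics in the top two ranks

print(alpha_ndcg(redundant, judgments), alpha_ndcg(diverse, judgments))
print(precision_ia(redundant, judgments, ["a", "b"], depth=2))
```

In this toy case the diverse ordering receives the higher α-nDCG, because the second document on subtopic "a" earns only a reduced gain once "a" has already been covered.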

Nick Craswell from Microsoft serves as the track co-coordinator.  Ian Soboroff is the NIST contact.  The ClueWeb09 collection was created through the efforts of Jamie Callan and Mark Hoy at the Language Technologies Institute, Carnegie Mellon University.  More information may be found on the track Web page: http://plg.uwaterloo.ca/~trecweb/2010.html.

Biography: Charles Clarke is a professor in the David R. Cheriton School of Computer Science at the University of Waterloo, Canada.  He has published on a wide range of topics within the area of information retrieval, including papers related to evaluation, efficiency, ranking, parallel systems, security, question answering, document structure, and XML. He was a Program Co-Chair of SIGIR 2007 and General Co-Chair of SIGIR 2003.  From 2004 to 2006 he was the coordinator of the TREC Terabyte Retrieval track.  Since 2009 he has been a co-coordinator of the TREC Web Track.  He is a co-author of the book Information Retrieval: Implementing and Evaluating Search Engines (MIT Press, 2010). He has previously held software development positions at a number of computer consulting and engineering firms.  In 2006 he spent a sabbatical at Microsoft, where he was involved in their search engine development effort.

Title: "E-Commerce Data through Rakuten Data Challenge"
Masahiro Sanjo (Rakuten Institute of Technology) and
Satoshi Sekine (Rakuten Institute of Technology, New York/New York University)

June 17 Thursday, 9:55-10:00

Abstract: Rakuten, Japan's largest shopping site, will distribute its data to academia for research purposes. The data includes the following:
1) Market item data and item homepage data
2) Hotel data and its review data
3) Golf course data and its review data
We are planning to hold the Rakuten R&D Symposium in January 2011, where one of the sessions will be dedicated to R&D activities using the data.
The data is planned to be distributed through ALAGIN and NII-IDR in July 2010.

Biography: Masahiro Sanjo graduated from the University of Tokyo in 2003. He is currently a senior technologist at Rakuten Institute of Technology. He is interested in image analysis technology for internet services.

Satoshi Sekine received his Ph.D. from New York University in 1994. He is currently the director of Rakuten Institute of Technology, New York, as well as Associate Research Professor at New York University. He is interested in various fields in NLP, including Information Extraction, Linguistic Knowledge Acquisition, and Language Analysis.

We are looking forward to seeing you soon!

Last updated: June 07, 2010