[ntcir:188] Seminar on 2-Stage Expert Search by Hang Li, MSRA


You are welcome to the seminar on "A Two-Stage Model
for Expert Search" by Dr. Hang Li, MSRA.  Please join!
  Please feel free to circulate the announcement to
interested people.             ---    Noriko Kando.

A Two-Stage Model for Expert Search

Hang Li
Microsoft Research Asia, China, PRC

Monday, March 13, 2006;  14:00-15:00
Lecture Room 1 12th Floor, NII

In this talk, I will introduce our work on expert search, a search task
where the user types a query representing a topic and the search system
returns a ranked list of people who are considered experts on the topic. 
Previous studies employed profile-based methods, where the expert 
ranking is based only on co-occurrence between people and terms in 
documents. We propose an approach capable of employing many types of 
association relationships among query terms, documents and people 
(experts). These include relevance between query terms and
documents, co-occurrence between people and terms in documents,
co-occurrence between people and terms in author and title fields, 
and co-occurrence between people and people. We employ a new 
statistical model, referred to as the two-stage expert search model, 
to combine all the association information in a unified and theoretically 
sound way. The two-stage model consists of two parts: a
relevance model and a co-occurrence model. The relevance model 
characterizes the relevance of documents to queries. The co-occurrence 
model characterizes the co-occurrence of various types between people 
and terms (i.e., queries). The co-occurrence model is further 
decomposed into sub-models, each representing one type of co-occurrence. 
We used data from the TREC 2005 expert search task and data from 
an industrial research lab to verify the effectiveness of our proposal. 
Our experimental results show that the two-stage model can
significantly outperform the profile-based method. The results indicate
that it is helpful to incorporate new types of association 
information, including document relevance and person-to-person 
co-occurrence.
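
As a rough illustration only, and not the speaker's actual formulation, the
two-stage idea described above can be sketched as follows. The sketch assumes
that a person's score is the sum, over documents, of the document's relevance
to the query multiplied by a weighted mix of co-occurrence sub-model scores.
The function name, sub-model names, weights and all numbers are hypothetical.

```python
from collections import defaultdict


def rank_experts(doc_relevance, cooccurrence, weights):
    """Rank people by sum over documents of relevance * co-occurrence.

    doc_relevance: {doc_id: relevance of the document to the query}
    cooccurrence:  {doc_id: {person: {submodel_name: score}}}
    weights:       {submodel_name: mixing weight} (hypothetical values)
    """
    scores = defaultdict(float)
    for doc_id, relevance in doc_relevance.items():
        for person, submodel_scores in cooccurrence.get(doc_id, {}).items():
            # Co-occurrence model: weighted mix of one sub-model per type.
            cooc = sum(weights[name] * s for name, s in submodel_scores.items())
            # Relevance model output scales each document's contribution.
            scores[person] += relevance * cooc
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented for illustration): two documents, two people.
doc_relevance = {"d1": 0.75, "d2": 0.25}
cooccurrence = {
    "d1": {
        "alice": {"body": 0.5, "title_author": 1.0},
        "bob": {"body": 0.5, "title_author": 0.0},
    },
    "d2": {"bob": {"body": 1.0, "title_author": 1.0}},
}
weights = {"body": 0.5, "title_author": 0.5}

ranking = rank_experts(doc_relevance, cooccurrence, weights)
```

In this toy setting "alice" is ranked above "bob", because she co-occurs
strongly with the query in the more relevant document.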

Hang Li is a researcher and project leader at
Microsoft Research Asia. He is also an adjunct professor at Peking University,
Xi'an Jiaotong University and Nankai University. His research interests
include statistical language learning, natural language processing, 
information retrieval, and data mining. He earned a PhD in computer 
science from the University of Tokyo. Hang has many publications in 
international journals and conferences. He is on the editorial boards of 
"Journal of Computer Science and Technology" and "Computational 
Linguistics and Chinese Language Processing". His recent academic 
activities include serving as area chair of ACL'05 and program committee 
member of IJCAI'05. Hang has been working on the development of
several text mining systems and tools. These include NEC TopicScope,
Microsoft internal tool: TextMiner, Microsoft SQL Server Text Mining,
and Office 2006 metadata extraction.