
Associate Professor Nigel Collier, Principles of Informatics Research Division, National Institute of Informatics


Nigel Collier (Ph.D., UMIST, 1996)
Associate Professor, National Institute of Informatics and the University for Advanced Studies

SHORT BIOGRAPHY

I am an associate professor in the Principles of Informatics division at the National Institute of Informatics (NII) in Japan. Before coming to NII I received a B.Sc. in Computer Science from Leeds University (UK) in 1992, an M.Sc. in Machine Translation from UMIST (UK) in 1994 and a Ph.D. in Language Engineering from UMIST (now merged with Manchester University) in 1996. From 1996 to 1998 I was a Toshiba Fellow working in Toshiba's human interface laboratories on machine translation, and from 1998 to 2000 I worked on information extraction in the molecular-biology domain at the Tsujii laboratory of the University of Tokyo as a JSPS research associate. I am a member of various organizations including ACL, IEEE Computer Society, ACM and IPSJ.

RESEARCH


My research interests include natural language processing (NLP), primarily using empirical methods; machine learning of natural-language knowledge from corpora; and artificial intelligence. In the last ten years my work has focussed largely on bridging the gap between unstructured text and actionable data, using intelligent text mining to support faster and better-informed decision making by experts, with a particular application focus on biomedical and health professionals.

Since 2006 I have been leading an international project group of five institutes that is building BioCaster, a high-throughput multilingual bio-surveillance system for detecting rumours about infectious disease outbreaks from very large-scale Web data.

In earlier work: (2004-2006) the ZAISA project looked at rhetorical zone annotation in scientific texts and how it can contribute to text mining; (2000-2004) the PIA project allowed me to explore the interaction between the shallow semantics commonly used in text mining and the deep semantic representations held in ontologies; (1998-2000) I coordinated the GENIA project at the University of Tokyo, which contributed tools and annotated data sets to help life scientists locate experimental results in the vast quantity of published literature appearing on MEDLINE.

Some recent publications are:

  • Nigel Collier, Son Doan, Ai Kawazoe, Reiko Matsuda Goodwin, Mike Conway, Yoshio Tateno, Quoc Hung Ngo, Dinh Dien, Asanee Kawtrakul, Koichi Takeuchi, Mika Shigematsu, Kiyosu Taniguchi (2008), "BioCaster: detecting public health rumors with a Web-based text mining system", Bioinformatics, Oxford University Press, DOI: 10.1093/bioinformatics/btn534. [pdf]
  • Ai Kawazoe, Hutchatai Chanlekha, Mika Shigematsu and Nigel Collier (2008), "Structuring an event ontology for disease outbreak detection", in BMC Bioinformatics, 9 (Suppl 3): S8, DOI: 10.1186/1471-2105-9-S3-S8. [HTML][pdf]
  • John McCrae and Nigel Collier (2008), "Synonym set extraction from the biomedical literature by lexical discovery", in BMC Bioinformatics, 9:159, DOI: 10.1186/1471-2105-9-159. [HTML][pdf]
  • Nigel Collier, Ai Kawazoe, Lihua Jin, Mika Shigematsu, Dinh Dien, Roberto Barrero, Koichi Takeuchi, Asanee Kawtrakul (2007), "A multilingual ontology for infectious disease outbreak surveillance: rationale, design and challenges", Journal of Language Resources and Evaluation, Springer, vol. 40, no. 3-4, pp. 405-413, DOI: 10.1007/s10579-007-9019-7. [pdf]

You can find out more at:

Publications (or see our work on Google Scholar)

For prospective PhD students please look here, and for researchers interested in JSPS Postdoctoral Fellowships see the information here.