
NTCIR-10 Abstracts

  • Preface from NTCIR-10 General Chairs
    Noriko Kando, Tsuneaki Kato, Douglas W. Oard and Mark Sanderson
    [Pdf] [Table of Content]

  • Overview of NTCIR-10
    Hideo Joho and Tetsuya Sakai
    [Pdf] [Table of Content]

    This is an overview of NTCIR-10, the tenth sesquiannual workshop for the evaluation of Information Access technologies. NTCIR-10 marks the tenth cycle of these research activities, attracting the largest and most diverse set of evaluation tasks to date, led by cutting-edge researchers worldwide. This paper presents a brief history of NTCIR and overall statistics of NTCIR-10, followed by an introduction to the eight evaluation tasks. We conclude the paper by discussing the future directions of NTCIR. Readers should refer to the individual task overview papers for their activities and findings.

  • Knowledge in Search
    Shashidhar (Shashi) Thakur
    [Pdf] [Table of Content]

    Search engines are evolving from complex algorithms that match user queries to web documents into systems that also answer user questions directly. Having structured knowledge about things in the world plays an important part in reaching this goal of answering questions. This talk will discuss how Google search has been going through a radical transformation in this direction. We will talk about some of the technologies behind the Google Knowledge Graph and its applications in search.

  • Time-Biased Gain
    Charles Clarke
    [Pdf] [Table of Content]

    Time-biased gain provides a unifying framework for information retrieval evaluation, generalizing many traditional effectiveness measures while accommodating aspects of user behavior not captured by these measures. By using time as a basis for calibration against actual user data, time-biased gain can reflect aspects of the search process that directly impact user experience, including document length, near-duplicate documents, and summaries. Unlike traditional measures, which must be arbitrarily normalized for averaging purposes, time-biased gain is reported in meaningful units, such as the total number of relevant documents seen by the user. In work reported at SIGIR 2012, we proposed and validated a closed-form equation for estimating time-biased gain, explored its properties, and compared it to standard approaches. In work reported at CIKM 2012, we used stochastic simulation to numerically approximate time-biased gain, an approach that provides greater flexibility, allowing us to accommodate different types of user behavior and increasing the realism of the effectiveness measure. In work reported at HCIR 2012, we extended our stochastic simulation to model the variation between users. In this talk, I will provide an overview of time-biased gain, and outline our ongoing and future work, including extensions to evaluate query suggestion, diversity, and whole-page relevance. This is joint work with Mark Smucker.
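
    As a rough illustration of the framework, the sketch below computes time-biased gain as TBG = sum_k g_k * D(T(k)), where D(t) = exp(-t ln2 / h) is the probability that the user is still scanning at time t; the per-document reading times and the half-life below are illustrative assumptions, whereas the paper calibrates them against actual user data.

        import math

        def time_biased_gain(gains, seconds_per_doc, half_life=224.0):
            """Minimal sketch of TBG: gain discounted by an exponential
            survival function of the time needed to reach each rank.
            half_life=224s is the value reported by Smucker and Clarke;
            treat it and the time model as assumptions in this sketch."""
            tbg, elapsed = 0.0, 0.0
            for gain, secs in zip(gains, seconds_per_doc):
                tbg += gain * math.exp(-elapsed * math.log(2) / half_life)
                elapsed += secs  # time advances whether or not the doc paid off
            return tbg  # in units of expected relevant documents seen

        # A relevant doc at ranks 1 and 3; assumed per-document reading times.
        print(time_biased_gain([1, 0, 1], [30.0, 20.0, 45.0]))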

  • Overview of the NTCIR-10 Cross-Lingual Link Discovery Task
    Ling-Xiang Tang, In-Su Kang, Fuminori Kimura, Yi-Hsun Lee, Andrew Trotman, Shlomo Geva and Yue Xu
    [Pdf] [Table of Content]

    This paper presents an overview of the NTCIR-10 Cross-lingual Link Discovery (CrossLink-2) task. For the task, we continued using the evaluation framework developed for the NTCIR-9 CrossLink-1 task. Overall, recommended links were evaluated at two levels (file-to-file and anchor-to-file), and system performance was evaluated with three metrics: LMAP, R-Prec and P@N.
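
    For readers unfamiliar with the metrics, the sketch below gives the standard definitions of P@N, R-Prec and the per-topic core of LMAP over a ranked list of recommended links; the official CrossLink evaluation tool may differ in details such as anchor matching, so treat this as an approximation.

        def precision_at(ranked_links, correct_links, n):
            """P@N: fraction of the top n recommended links that are correct."""
            return sum(1 for link in ranked_links[:n] if link in correct_links) / n

        def r_precision(ranked_links, correct_links):
            """R-Prec: precision at rank R, where R = number of correct links."""
            return precision_at(ranked_links, correct_links, len(correct_links))

        def average_precision(ranked_links, correct_links):
            """Per-topic average precision; LMAP is the mean of this value
            over all topics, with links as the retrieval units."""
            hits, total = 0, 0.0
            for rank, link in enumerate(ranked_links, start=1):
                if link in correct_links:
                    hits += 1
                    total += hits / rank
            return total / len(correct_links) if correct_links else 0.0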

  • Simple Yet Effective Methods for Cross-Lingual Link Discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2
    Petr Knoth and Drahomira Herrmannova
    [Pdf] [Table of Content]

    Cross-Lingual Link Discovery (CLLD) aims to automatically find links between documents written in different languages. In this paper, we first present relatively simple yet effective methods for CLLD in Wiki collections, explaining the findings that motivated their design. Our methods (team KMI) achieved the best overall results in the NTCIR-10 CrossLink-2 evaluation on the English to Chinese, Japanese and Korean (E2CJK) task and were the top performers in the Chinese, Japanese, Korean to English (CJK2E) task [Tang et al., 2013]. Though tested on these language combinations, the methods are language agnostic and can easily be applied to any other language combination with sufficient corpora and available pre-processing tools. In the second part of the paper, we provide an in-depth analysis of the nature of the task, the evaluation metrics and the impact of the system components on the overall CLLD performance. We believe a good understanding of these aspects is the key to improving CLLD systems in the future.

  • Osaka Kyoiku University at NTCIR-10 CrossLink-2: Link Filtering by Title Tag of Corpus as a Dictionary
    Takashi Sato
    [Pdf] [Table of Content]

    Our group (OKSAT) submitted two types of runs, named SMP and REF, for every subtask of NTCIR-10 Cross-lingual Link Discovery (CLLD). Our method uses the titles of Wikipedia pages (the corpus) in the source language as the entries of a dictionary, so no external dictionary is required. For SMP, we aimed to discover the cross-lingual links of the actual Wikipedia; in other words, it targets the Wikipedia ground truth. For REF, on the other hand, we aimed to discover as many meaningful cross-lingual links as possible automatically.

  • KECIR at NTCIR-10 Cross-Lingual Link Discovery Task
    Jianxi Zheng, Yu Bai, Cheng Guo and Dongfeng Cai
    [Pdf] [Table of Content]

    This paper presents the methods of KECIR for the NTCIR-10 Cross-Lingual Link Discovery Task. Two system architectures were designed, both of which consist of three common modules: anchor detection, anchor translation and link discovery. In KECIR_A2F_C2E_03_FSCLIR and KECIR_A2F_C2E_04_FSCLIR, a monolingual link discovery module is also included. To detect anchors, a feature selection method is used. For anchor translation, we use a method that combines existing cross-language links with the Google translation web service. For link discovery, both title and paragraph matching methods are exploited to retrieve the relevant links for each anchor. Four runs were submitted; in the A2F evaluation with manual assessment, KECIR_A2F_C2E_01_FSCLIR achieved the highest LMAP and R-Prec scores in the Chinese to English task. The experiments show that a CrossLink system based on the first architecture can retrieve higher-precision links for an anchor than the second one, and that noisy anchors result in lower metric values in the F2F evaluation.

  • UKP at CrossLink2: CJK-to-English Subtasks
    Jungi Kim and Iryna Gurevych
    [Pdf] [Table of Content]

    This paper describes UKP's participation in the cross-lingual link discovery task at NTCIR-10 (CrossLink2). The task addressed in our work is to find valid anchor texts in a Chinese, Japanese, or Korean (CJK) Wikipedia page and retrieve the corresponding target Wiki pages in the English language. The CrossLink framework was developed from our previous CrossLink system, which worked in the opposite direction of the language pairs, i.e. it discovered anchor texts in English Wikipedia pages and their corresponding targets in CJK languages. The framework consists of anchor selection, anchor ranking, anchor translation, and target discovery sub-modules. Each sub-module in the framework has been shown to work well both in monolingual settings and for English to CJK language pairs. We seek to find out whether the approach that worked very well for English to CJK would still work for CJK to English. We use the same experimental settings that were used in our previous participation, and our experimental runs show that the CJK-to-English CrossLink task is a much harder task when using the same resources as the English-to-CJK one.

  • NTHU at NTCIR-10 CrossLink-2: An Approach toward Semantic Features
    Yu-Lan Liu, Joanne Boisson and Jason S. Chang
    [Pdf] [Table of Content]

    This paper describes the approaches of NTHU in the NTCIR-10 Cross-Lingual Link Discovery task, also named CrossLink-2. In this task, we aim to discover valuable anchors in Chinese, Japanese or Korean (CJK) articles and to link these anchors to related English Wikipedia pages. To achieve this objective, we not only depend on Wikipedia's distinguishing features (e.g. anchor link information and language links) but also developed a method that analyzes the semantic features of anchor texts in the Chinese Wikipedia. In the linking phase, a Latent Dirichlet Allocation (LDA) model is used to compute a text similarity measure among the English Wikipedia articles. This novel approach to addressing the word-to-links ambiguity issue shows encouraging results in the CrossLink-2 evaluation.

  • Cross-lingual Link Discovery Based on CRF Model for NTCIR-10 CrossLink
    Liang-Pu Chen, Yu-Lun Shih, Chien-Ting Chen, Ping-Che Yang, Hung-Sheng Chiu and Ren-Dar Yang
    [Pdf] [Table of Content]

    This paper describes our participation in the NTCIR-10 Cross-lingual Link Discovery Task, Chinese-to-English (C2E). The task focuses on making suitable links between terms in Chinese/Japanese/Korean Wikipedia articles and English Wikipedia articles. We proposed a two-stage method for the Chinese-to-English subtask, dividing the task into 'Anchor Recognition' and 'CrossLink'. In the first stage, we use a conditional random field (CRF), a machine learning method, to recognize every potential anchor that could link to an article in the target language. In the second, we find candidate links for these anchors and then disambiguate them. According to the official results, our system achieved an LMAP score of 0.072 when evaluated against the Wikipedia ground truth, and 0.027 with manual assessment.
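
    The abstract does not detail the CRF features, so the following is only a generic sketch of the first stage: BIO-style sequence labelling of tokens as potential anchors with a linear-chain CRF, using the third-party sklearn-crfsuite package and toy features and data.

        import sklearn_crfsuite  # pip install sklearn-crfsuite

        def token_features(tokens, i):
            """Toy context features; a real system would add POS tags,
            dictionary hits, Wikipedia title matches, etc."""
            return {
                "token": tokens[i],
                "prev": tokens[i - 1] if i > 0 else "<s>",
                "next": tokens[i + 1] if i + 1 < len(tokens) else "</s>",
            }

        # One segmented training sentence with B/I/O anchor tags (toy data).
        sentences = [["质数", "是", "一个", "数学", "概念"]]
        labels = [["B", "O", "O", "B", "I"]]

        X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
        crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
        crf.fit(X, labels)

        # Each maximal B/I span in the prediction is a candidate anchor,
        # passed to the second stage for link finding and disambiguation.
        print(crf.predict(X))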

  • DCU at NTCIR-10 Cross-lingual Link Discovery (CrossLink-2) Task
    Shu Chen, Gareth J. F. Jones and Noel E. O'Connor
    [Pdf] [Table of Content]

    DCU participated in the English to Chinese (E2C) and Chinese to English (C2E) subtasks of the NTCIR-10 CrossLink-2 Cross-lingual Link Discovery (CLLD) task. Our strategy for each query involved extracting potential link anchors as n-gram strings, cleaning the potential anchor strings, and anchor expansion and ranking to select a set of anchors for the query. Potential anchors were translated using Google Translate, and a standard information retrieval technique was used to create links between anchors and the top 5 items to be linked. We submitted a total of four runs for E2C CLLD and C2E CLLD. We describe our method and results for the file-to-file and anchor-to-file evaluation levels.

  • A Single-step Machine Learning Approach to Link Detection in Wikipedia: NTCIR Crosslink-2 Experiments at KSLP
    In-Su Kang and Sin-Jae Kang
    [Pdf] [Table of Content]

    This study describes a link detection method to find relevant cross-lingual links from Korean Wikipedia documents to English ones at the term level. Earlier wikification approaches have used two independent steps for link disambiguation and link determination. This study seeks to merge these two separate steps into a single-step machine learning scheme. Our method showed promising results in the NTCIR-10 Korean-to-English CLLD task.

  • RDLL at CrossLink Anchor Extraction Considering Ambiguity in CLLD
    Fuminori Kimura, Kensuke Horita, Yuuki Konishi, Hisato Harada and Akira Maeda
    [Pdf] [Table of Content]

    This paper describes our work in NTCIR-10 on the Cross-Lingual Link Discovery (CLLD) task. Our proposed method mainly focuses on two aspects of this task: (1) how to find important anchors in the original article for CrossLink and (2) how to find the correct links from the original articles to articles in the target language. The system first collects online data from Japanese Wikipedia articles to build the basic CrossLink database. These data are used to identify anchors and find the relevant corresponding English articles. We carried out this task in three steps. First, we parse the Japanese articles and extract the candidate anchors. Second, we rank the anchors by weights reflecting their importance. Third, we determine the correct English article for each anchor. However, our results were not good.

  • NTCIR-10 CrossLink-2 Task: A Link Mining Strategy
    Ling-Xiang Tang, Andrew Trotman, Shlomo Geva and Yue Xu
    [Pdf] [Table of Content]

    At NTCIR-10 we participated in the cross-lingual link discovery (CrossLink-2) task. In this paper we describe our systems for discovering cross-lingual links between the Chinese, Japanese, and Korean (CJK) Wikipedia and the English Wikipedia. The evaluation results show that our implementation of the cross-lingual linking method achieved promising results.

  • Overview of the NTCIR-10 INTENT-2 Task
    Tetsuya Sakai, Zhicheng Dou, Takehiro Yamamoto, Yiqun Liu, Min Zhang, Ruihua Song
    [Pdf] [Table of Content]

    This paper provides an overview of the NTCIR-10 INTENT-2 task (the second INTENT task), which comprises the Subtopic Mining and the Document Ranking subtasks. INTENT-2 attracted participating teams from China, France, Japan and South Korea -- 12 teams for Subtopic Mining and 4 teams for Document Ranking (including an organisers' team). The Subtopic Mining subtask received 34 English runs, 23 Chinese runs and 14 Japanese runs; the Document Ranking subtask received 12 Chinese runs and 8 Japanese runs. We describe the subtasks, data and evaluation methods, and then report on the official results, as well as the revised results for Subtopic Mining.

  • THUIR at NTCIR-10 INTENT-2 Task
    Yufei Xue, Fei Chen, Aymeric Damien, Cheng Luo, Xin Li, Shuai Huo, Min Zhang, Yiqun Liu and Shaoping Ma
    [Pdf] [Table of Content]

    This paper describes our approaches and results in the NTCIR-10 INTENT-2 task. This year, we participated in the subtasks for both Chinese and English topics. We extract subtopics from multiple resources for these topics, and several subtopic clustering and re-ranking methods are proposed in this work. In the Document Ranking subtask, we redefine the novelty of a document and use the new definition to re-rank the retrieved documents. Building on existing diversification methods, we also try to selectively diversify the search results for the given queries, according to the query types determined by our strategies.

  • Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task
    Junjun Wang, Guoyu Tang, Yunqing Xia, Qiang Zhou, Fang Zheng, Qinan Hu, Sen Na and Yaohai Huang
    [Pdf] [Table of Content]

    Understanding the intent underlying a search query has recently attracted enormous research interest. Two challenging issues are worth noting: First, words within a query are usually ambiguous, while the query in most cases is too short to disambiguate them. Second, in some cases the ambiguity cannot be resolved using only the limited query context. The ambiguity must therefore be resolved/analyzed using context beyond the query itself. This paper presents the intent mining system developed by THCIB and THUIS, which handles English and Chinese queries respectively, using four types of context: the query, a knowledge base, search results and user behavior statistics. The major contributions are summarized as follows: (1) Concepts extracted from the query are used to extend the query; (2) Concepts are used to extract explicit subtopic candidates from Wikipedia; (3) LDA is applied to discover explicit subtopic candidates within search results; (4) Sense-based subtopic clustering and entity analysis are conducted to cluster the subtopic candidates so as to discover mutually exclusive intents; (5) Intents are ranked with a unified intent ranking model. Experimental results indicate that our intent mining method is effective.

  • HULTECH at the NTCIR-10 INTENT-2 Task: Discovering User Intents through Search Results Clustering
    Jose G. Moreno and Gaël Dias
    [Pdf] [Table of Content]

    In this paper, we describe our participation in the Subtopic Mining subtask of the NTCIR-10 INTENT-2 task for the English language. For this subtask, we experiment with a state-of-the-art algorithm for search results clustering, the HISGK-means algorithm, and define users' intents based on the cluster labels following a general framework. From the Web snippets returned for a given query, our framework allows the discovery of users' intents without (1) using the query log databases provided by the organizers or (2) accessing any external knowledge base. Our best run outperforms the other competitors' submissions in terms of D-nDCG@10 and achieves a high position in the general ranking.

  • The KLE's Subtopic Mining System for the NTCIR-10 INTENT-2 Task
    Se-Jong Kim and Jong-Hyeok Lee
    [Pdf] [Table of Content]

    This paper describes our subtopic mining system for the NTCIR-10 INTENT-2 task. We propose a method that mines subtopics using simple patterns and a hierarchical structure of candidate strings, based on clusters of relevant documents built from the provided web documents and official query suggestions. We extracted various candidate strings using simple patterns based on new query types and POS tags. We constructed the hierarchical structure of the candidate strings according to the proposed process, and ranked them as subtopics using document coverage and frequency information for each group of candidate strings, within the part of the hierarchical structure that satisfies the diversity requirement.

  • Microsoft Research Asia at the NTCIR-10 Intent Task
    Kosetsu Tsukuda, Zhicheng Dou and Tetsuya Sakai
    [Pdf] [Table of Content]

    Microsoft Research Asia participated in the Subtopic Mining subtask and Document Ranking subtask of the NTCIR-10 INTENT Task. In the Subtopic Mining subtask, we mine subtopics from query suggestions, clickthrough data and top results of the queries, and rank them based on their importance for the given query. In the Document Ranking subtask, we diversify top search results by estimating the intent types of the mined subtopics and combining multiple search engine results. Experimental results show that our best Japanese subtopic mining run is ranked No. 2 of all 14 runs in terms of D#-nDCG@10. All of our Japanese document ranking runs outperform the baseline ranking without diversification.

  • TUTA1 at the NTCIR-10 Intent Task
    Haitao Yu and Fuji Ren
    [Pdf] [Table of Content]

    In NTCIR-10, we participated in the Subtopic Mining subtask. We classify test topics into two types: role-explicit topics and role-implicit topics. According to the topic type, we devise different approaches to subtopic mining. Specifically, for role-explicit topics, we propose an approach based on modifier graphs. The key idea is that the modifier graph corresponding to a role-explicit topic is decomposable into clusters with strong intra-cluster interaction and relatively weak inter-cluster interaction, and each modifier cluster intuitively reveals a possible subtopic. For role-implicit topics, which generally express a single information need, we directly generate the ranked list using semantic similarities based on lexical ontologies. The evaluation results show that our best Chinese subtopic mining run ranks first among all runs in terms of D#-nDCG. However, our English subtopic mining runs performed poorly, which we plan to improve in future work.

  • LIA at the NTCIR-10 INTENT Task
    Romain Deveaud and Eric SanJuan
    [Pdf] [Table of Content]

    This paper describes the participation of the LIA team in the English Subtopic Mining subtask of the NTCIR-10 INTENT-2 task. The goal of this task was to specialize or disambiguate web search queries by identifying the different subtopics that these queries could refer to. Our motivation was to take a conceptual approach, representing the query by a set of concepts before identifying the related subtopics. However, we seem to have misunderstood the real point of this task, which was in fact focused on generating web query suggestions: the official results therefore do not show support for our initial motivation.

  • KECIR at the NTCIR-10 INTENT Task
    Cheng Guo, Yu Bai, Jianxi Zheng and Dongfeng Cai
    [Pdf] [Table of Content]

    This paper describes the approaches and results of our system for the NTCIR-10 INTENT task. We present methods for the Subtopic Mining subtask and the Document Ranking subtask. In the Subtopic Mining subtask, we employ a voting method to rank candidate subtopics, and the semantic resource HowNet is used to merge candidate subtopics that may harm diversity. In the Document Ranking subtask, we also employ a voting method, based on the mined subtopics. In Chinese subtopic mining, our best values of I-rec@10, D-nDCG@10 and D#-nDCG@10 were 0.3743, 0.3965 and 0.3854, respectively. In the Document Ranking subtask, they were 0.6366, 0.3998 and 0.5182, respectively.
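
    These figures are internally consistent with the INTENT evaluation framework of Sakai and Song, in which D#-nDCG is a linear combination of intent recall and D-nDCG with the mixing weight set to 0.5 by default, i.e. the simple average of the other two numbers:

        def d_sharp_ndcg(i_rec, d_ndcg, gamma=0.5):
            # D#-nDCG = gamma * I-rec + (1 - gamma) * D-nDCG
            return gamma * i_rec + (1 - gamma) * d_ndcg

        print(d_sharp_ndcg(0.3743, 0.3965))  # Subtopic Mining:  0.3854
        print(d_sharp_ndcg(0.6366, 0.3998))  # Document Ranking: 0.5182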

  • SEM12 at the NTCIR-10 INTENT-2 English Subtopic Mining Subtask
    Md. Zia Ullah, Masaki Aono and Md. Hanif Seddiqui
    [Pdf] [Table of Content]

    Users express their information needs as queries in search engines to find relevant documents on the Internet. However, search queries are usually short, ambiguous and/or underspecified. To understand a user's search intent, subtopic mining plays an important role and has attracted attention in recent years. In this paper, we describe our approach to identifying, and then ranking, a user's intents for a query (or topic) from query logs, which is the English Subtopic Mining subtask of the NTCIR-10 INTENT-2 task. We extract subtopics that are semantically and lexically related to the topic, and weight them based on the co-occurrence of a subtopic across search engine query logs and the edit distance between the topic and the subtopic. These weighted subtopic strings are then ranked as candidate subtopics (or intents). In the experiment section, we show the results of our method as evaluated by the organizers. Our best run achieves an I-rec@10 (intent recall) of 0.3780, a D-nDCG@10 of 0.4250, and a D#-nDCG@10 of 0.4014.
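
    The exact weighting formula is not given in the abstract; the sketch below shows one hypothetical way to combine the two signals it names, a subtopic's co-occurrence frequency across query logs and its edit distance to the topic, with the mixing weight alpha as an invented parameter.

        def edit_distance(a, b):
            """Levenshtein distance by dynamic programming."""
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, start=1):
                cur = [i]
                for j, cb in enumerate(b, start=1):
                    cur.append(min(prev[j] + 1,                 # deletion
                                   cur[j - 1] + 1,              # insertion
                                   prev[j - 1] + (ca != cb)))   # substitution
                prev = cur
            return prev[-1]

        def subtopic_weight(topic, subtopic, qlog_freq, alpha=0.5):
            """qlog_freq: co-occurrence frequency across query logs,
            assumed pre-normalized to [0, 1]."""
            sim = 1 - edit_distance(topic, subtopic) / max(len(topic), len(subtopic))
            return alpha * qlog_freq + (1 - alpha) * sim

        print(subtopic_weight("harry potter", "harry potter movies", 0.8))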

  • ICRCS at Intent2: Applying Rough Set and Semantic Relevance for Subtopic Mining
    Xiao-Qiang Zhou, Yong-Shuai Hou, Xiao-Long Wang, Bo Yuan and Yao-Yun Zhang
    [Pdf] [Table of Content]

    The goal of the Subtopic Mining subtask of the NTCIR-10 INTENT-2 task is to return a ranked list of subtopics. To this end, this paper proposes applying rough set theory for redundancy reduction in subtopics mined from webpages. In addition, semantic similarity is used to measure subtopic relevance in the re-ranking process, computed from semantic features extracted with NLP tools and a semantic dictionary. Using the reduction concept of rough sets, we first construct a rough-set-based model (RSBM) for subtopic mining. Next, we combine rough set theory and semantic relevance into a new model (RS&SRM). Evaluation results show the effectiveness of our approach compared with a baseline frequency-term-based model (FTBM). The best performance is achieved by RS&SRM, with an I-rec of 0.4046, a D-nDCG of 0.4413 and a D#-nDCG of 0.4229 on the Chinese subtopic mining subtask.

  • Overview of the NTCIR-10 1CLICK-2 Task
    Makoto P. Kato, Matthew Ekstrand-Abueg, Virgil Pavlu, Tetsuya Sakai, Takehiro Yamamoto and Mayu Iwata
    [Pdf] [Table of Content]

    This is an overview of the NTCIR-10 1CLICK-2 task (the second One Click Access task). Given a search query, 1CLICK aims to satisfy the user with a single textual output instead of a ranked list of URLs. Systems are expected to present important pieces of information first and to minimize the amount of text the user has to read. We designed English and Japanese 1CLICK tasks, in which 10 research teams (including two organizers' teams) participated and submitted 59 runs for Main tasks and a Query Classification subtask. We describe the tasks, test collection, and evaluation methods, and then report official results for NTCIR-10 1CLICK-2.

  • TTOKU Summarization Based Systems at NTCIR-10 1CLICK-2 task
    Hajime Morita, Ryohei Sasano, Hiroya Takamura and Manabu Okumura
    [Pdf] [Table of Content]

    We describe the query-oriented summarization system we implemented for the NTCIR-10 1CLICK-2 task. Our system is purely based on a summarization method, treating the task as a summarization process. The system calculates relevance scores of terms for a given query, then extracts the relevant parts of sentences from the input sources. To calculate the relevance scores for a query, we employed a Query SnowBall based method. The summarization itself is conducted as subtree extraction from the dependency trees of the sentences in the input sources.

  • MSRA at NTCIR-10 1CLICK-2
    Kazuya Narita, Tetsuya Sakai, Zhicheng Dou and Young-In Song
    [Pdf] [Table of Content]

    We describe Microsoft Research Asia's approaches to the NTCIR-10 1CLICK-2 task. We construct the system from heuristic rules, and vary the settings of our approaches to test the effectiveness of each setting.

  • An API-based Search System for One Click Access to Information
    Dan Ionita, Niek Tax and Djoerd Hiemstra
    [Pdf] [Table of Content]

    This paper proposes a prototype One Click access system, based on previous work in the field and the related 1CLICK@NTCIR competition. The proposed solution integrates methods from previous such attempts into a three-tier algorithm: query categorization, information extraction and output generation, and offers suggestions on how each of these can be implemented. Finally, a thorough user-based evaluation concludes that such an information retrieval system outperforms the textual preview collected from Google search results, based on a paired sign test. Based on the validation results, possible suggestions for future improvements are proposed.

  • Hunter Gatherer: UdeM at 1CLICK-2
    Pablo Duboue, Jing He and Jian-Yun Nie
    [Pdf] [Table of Content]

    We describe our hunter-gatherer system for the NTCIR-10 1CLICK-2 task. We draw inspiration from the DeepQA framework, adapting it for the 1CLICK task; several techniques can be integrated naturally in this framework. The hunter component generates candidates based on passage retrieval for the original query, the gatherer component collects evidence for each candidate and scores them based on the collected evidence, and finally a summarization system is used to organize the high-quality candidates. The evaluation results show the effectiveness of this framework in the 1CLICK search task.

  • XML Element Retrieval@1CLICK-2
    Atsushi Keyaki, Jun Miyazaki, Kenji Hatano, Goshiro Yamamoto, Takafumi Taketomi and Hirokazu Kato
    [Pdf] [Table of Content]

    In this paper, we apply an XML element retrieval approach to 1CLICK-2. The aim of XML element retrieval is to identify the key points in structured documents and present them to system users. We believe that this aim is largely the same as that of 1CLICK-2: enabling direct and immediate Information Access (DIIA). We report on the potential usefulness of XML element retrieval for DIIA, and on some difficulties in applying XML element retrieval to Web documents.

  • Information Extraction based Approach for the NTCIR-10 1CLICK-2 Task
    Tomohiro Manabe, Kosetsu Tsukuda, Kazutoshi Umemoto, Yoshiyuki Shoji, Makoto P. Kato, Takehiro Yamamoto, Meng Zhao, Soungwoong Yoon, Hiroaki Ohshima and Katsumi Tanaka
    [Pdf] [Table of Content]

    We describe a framework incorporating several information extraction methods for the NTCIR-10 One Click Access task. Our framework first classifies a given query into pre-defined query classes, and then extracts sentences and key-value pairs that seem relevant to the query. In doing so, our framework adjusts the weights of the extraction methods depending on the judged class of the query. Finally, our framework summarizes the ranked sentences and key-value pairs while considering diversity.

  • Query Classification System Based on Snippet Summary Similarities for NTCIR-10 1CLICK-2 Task
    Tatsuya Tojima and Takashi Yukawa
    [Pdf] [Table of Content]

    A query classification system for NTCIR-10 1CLICK-2 is described in this paper. The system classifies queries in Japanese and English into eight predefined classes using support vector machines (SVMs). Feature vectors are created from snippet similarities instead of snippet word frequencies. These vectors, which have fewer dimensions than those made from raw words, reduce the number of SVM parameters; the system therefore generalizes better and requires fewer computing resources. Two methods for calculating document similarity, cosine similarity and the Jaccard index, were compared. Additionally, two snippet sources, the Bing search results provided by the task organizers and Yahoo! Japan Web search results, were compared. Variants that add query string information to the snippet information in the feature vectors were also compared with the above methods. Our system achieved 0.89 accuracy in the English task using cosine similarity and the Yahoo! Japan Web search results, and 0.86 in the Japanese task using cosine similarity and the Bing search results.
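
    A minimal sketch of the feature construction described above, assuming TF-IDF cosine similarity between a query's snippet text and per-class reference snippets; the class names, toy data and scikit-learn classifier are stand-ins for whatever the authors actually used.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity
        from sklearn.svm import SVC

        # Toy reference snippets per class and labelled training queries.
        class_docs = {"ARTIST": "singer album tour lyrics discography",
                      "FACILITY": "museum opening hours address access",
                      "QA": "how why steps fix answer"}
        train_snippets = ["new album and tour dates for the singer",
                          "opening hours and directions to the museum",
                          "steps explaining how to fix the error"]
        train_labels = ["ARTIST", "FACILITY", "QA"]

        vec = TfidfVectorizer().fit(list(class_docs.values()) + train_snippets)
        refs = vec.transform(class_docs.values())

        def features(snippet_text):
            # One dimension per class: similarity of the query's snippets
            # to that class's reference snippets (instead of raw words).
            return cosine_similarity(vec.transform([snippet_text]), refs)[0]

        clf = SVC(kernel="linear")
        clf.fit([features(s) for s in train_snippets], train_labels)
        print(clf.predict([features("album release date of the singer")]))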

  • Query Classification by Using Named Entity Recognition Systems and Clue Keywords
    Masaharu Yoshioka
    [Pdf] [Table of Content]

    Query classification is a subtask of the 1CLICK task, used to select an appropriate strategy for generating the output text. In this paper, we propose using named entity recognition tools and clue keywords (an occupation name list and a location type name list) to identify query types.

  • Overview of the Patent Machine Translation Task at the NTCIR-10 Workshop
    Isao Goto, Ka Po Chow, Bin Lu, Eiichiro Sumita and Benjamin K. Tsou
    [Pdf] [Table of Content]

    This paper gives an overview of the Patent Machine Translation Task (PatentMT) at NTCIR-10 by describing its evaluation methods, test collections, and evaluation results. We organized three patent machine translation subtasks: Chinese to English, Japanese to English, and English to Japanese. For these subtasks, we provided large-scale test collections, including training data, development data and test data. In total, 21 research groups participated and 144 runs were submitted. We performed four types of evaluations: Intrinsic Evaluation (IE), Patent Examination Evaluation (PEE), Chronological Evaluation (ChE), and Multilingual Evaluation (ME). We conducted human evaluations for IE and PEE.

  • BBN's Systems for the Chinese-English Sub-task of the NTCIR-10 PatentMT Evaluation
    Zhongqiang Huang, Jacob Devlin and Spyros Matsoukas
    [Pdf] [Table of Content]

    This paper describes the systems we developed at Raytheon BBN Technologies for the Chinese-English sub-task of the Patent Machine Translation Task (PatentMT) of the NTCIR-10 workshop. Our systems were originally built for translating newswire articles and were subsequently adapted to address some special problems of patent documents in the NTCIR-9 PatentMT evaluation. We applied some of our recent advancements in translation to the patent domain and investigated a sentence-level language model adaptation approach to take advantage of the characteristics of patent documents. These approaches contributed substantially to the improvement of translation quality, and our systems achieved the best results among all submissions across all of the evaluation types and evaluation metrics.

  • NTT-NII Statistical Machine Translation for NTCIR-10 PatentMT
    Katsuhito Sudoh, Jun Suzuki, Hajime Tsukada, Masaaki Nagata, Sho Hoshino and Yusuke Miyao
    [Pdf] [Table of Content]

    This paper describes the details of the NTT-NII system in the NTCIR-10 PatentMT task. The system extends the NTT-UT system from NTCIR-9 with a new English dependency parser (for the EJ task), syntactic rule-based pre-ordering (for the JE task), and syntax-based post-ordering (for the JE task). Our system ranked first in the EJ subtask in both the automatic and subjective evaluations, and was the best SMT system in the JE subtask.

  • The System Combination RWTH Aachen: SYSTRAN for the NTCIR-10 PatentMT Evaluation
    Minwei Feng, Markus Freitag, Hermann Ney, Bianka Buschbeck, Jean Senellart and Jin Yang
    [Pdf] [Table of Content]

    This paper describes the joint submission by RWTH Aachen University and SYSTRAN to the Chinese-English Patent Machine Translation Task at the 10th NTCIR Workshop. We describe the statistical systems developed by RWTH Aachen University and the hybrid machine translation systems developed by SYSTRAN. We apply RWTH Aachen's combination techniques to create consensus hypotheses from very different systems: phrase-based and hierarchical SMT, rule-based MT (RBMT) and MT with statistical post-editing (SPE). The system combination was ranked second in BLEU and second in the human adequacy evaluation in this competition.

  • The RWTH Aachen System for NTCIR-10 PatentMT
    Minwei Feng, Christoph Schmidt, Joern Wuebker, Markus Freitag and Hermann Ney
    [Pdf] [Table of Content]

    This paper describes the statistical machine translation (SMT) systems developed by RWTH Aachen University for the Patent Translation task of the 10th NTCIR Workshop. Both phrase-based and hierarchical SMT systems were trained for the Japanese-English and Chinese-English tasks. Experiments were conducted to compare standard and inverse direction decoding, the performance of several additional models and adding monolingual training data. Further, for the Chinese-English subtask we applied a system combination technique to create a consensus hypothesis from several different systems.

  • SRI's Submissions to Chinese-English PatentMT NTCIR10 Evaluation
    Bing Zhao, Jing Zheng, Wen Wang and Nicolas Scheffer
    [Pdf] [Table of Content]

    The SRI team joined the Chinese-English subtask of the Patent machine translation evaluation, and submitted translation results using the combined output of two types of grammars supported in SRInterp, with two different word segmentations. We investigated the effect of adding sparse features, together with several optimization strategies. Also, for the PatentMT domain, we carried out preliminary experiments on adapting language models. Our results showed positive improvements from these approaches.

  • The HDU Discriminative SMT System for Constrained Data PatentMT at NTCIR10
    Patrick Simianer, Gesa Stupperich, Laura Jehl, Katharina Wäschle, Artem Sokolov and Stefan Riezler
    [Pdf] [Table of Content]

    We describe the statistical machine translation (SMT) systems developed at Heidelberg University for the Chinese-to-English and Japanese-to-English PatentMT tasks at the NTCIR-10 workshop. The core system used in both tasks is a combination of hierarchical phrase-based translation and discriminative training using large feature sets and l1/l2 regularization. Our goal is to address the twofold nature of patents by exploiting their repetitive nature through feature sharing in a multi-task learning setup (used in the Japanese-to-English task), and by countering complex word order differences with syntactic features (used in the Chinese-to-English task).

  • FUN-NRC: Paraphrase-augmented Phrase-based SMT Systems for NTCIR-10 PatentMT
    Atsushi Fujita and Marine Carpuat
    [Pdf] [Table of Content]

    This paper describes the FUN-NRC group's machine translation systems that participated in the NTCIR-10 PatentMT task. The central motivation for this participation was to clarify the potential of automatically compiled collections of sub-sentential paraphrases. Our systems were built from our baseline phrase-based SMT system by augmenting its phrase table with novel translation pairs generated by combining paraphrases with translation pairs learned directly from the bilingual training data. We investigated two methods of phrase table augmentation: source-side augmentation and target-side augmentation. Among the systems we submitted, the two that worked best were (a) the one that paraphrased only unseen phrases into translatable phrases on the source side and (b) the one that paraphrased target phrases only into phrases that were seen in the original phrase table. Both of these systems were trained not only on bilingual but also on monolingual data; the other two systems were trained using only bilingual data. This paper also reports on our follow-up experiments focusing on the relationship between reordering restrictions and system performance.

  • Machine Translation System for Patent Documents Combining Rule-based Translation and Statistical Postediting Applied to the NTCIR-10 PatentMT Task
    Terumasa Ehara
    [Pdf] [Table of Content]

    In this article, we describe the system architecture, the preparation of training data, and a discussion of the experimental results of the EIWA group in the NTCIR-10 Patent Translation Task. Our system combines rule-based machine translation with statistical post-editing. What is new in our system compared with the NTCIR-9 PatentMT task is an automatic method for selecting among multiple translations, namely the rule-based MT output and the statistical post-editing output. The JE subtask results show that this method is effective.

  • The TRGTK's System Description of the PatentMT Task at the NTCIR-10 Workshop
    Hao Xiong and Weihua Luo
    [Pdf] [Table of Content]

    This paper introduces the TRGTK system for Patent Machine Translation at the NTCIR-10 Workshop. This year, we participated in three subtasks: Chinese-English, English-Japanese and Japanese-English. We submitted the required system results for Intrinsic Evaluation (IE), Patent Examination Evaluation (PEE), Chronological Evaluation (ChE), and Multilingual Evaluation (ME). Departing from last year's strategy, we focused on developing a strong and practical system for large-scale machine translation requirements. We designed parallel algorithms for Chinese word segmentation, weight tuning and translation decoding; in particular, we propose a document-level translation method to improve the translation quality of special terms. Experimental results show that our system reduces training and decoding time while still achieving promising translation results.

  • ISTIC Statistical Machine Translation System for PatentMT in NTCIR-10
    Yanqing He, Chongde Shi and Huilin Wang
    [Pdf] [Table of Content]

    This paper describes the statistical machine translation system of ISTIC used in the evaluation campaign of the patent machine translation task at NTCIR-10. In this year's evaluation, we participated in the Chinese-English, Japanese-English and English-Japanese patent machine translation tasks. Here we mainly describe the overview of the system, its primary modules, the key techniques and the evaluation results.

  • OkaPU's Japanese-to-English Translator for NTCIR-10 PatentMT
    Hideki Isozaki
    [Pdf] [Table of Content]

    This paper describes Okayama Prefectural University's system for the NTCIR-10 PatentMT JE task. It is a variant of the REV method proposed by Katz-Brown and Collins [KBC08], which obtained the best human evaluation score among statistical machine translation systems at NTCIR-7 [FUYU08]. The REV method preorders Japanese sentences without syntactic parsing: each Japanese sentence is split into segments at punctuation marks and the Japanese topic marker "wa", then the words in each segment are reversed and the reversed segments are concatenated. For NTCIR-10, we tried to improve the REV method by keeping the Japanese word order within each base noun phrase, because English base noun phrases also follow head-final word order, just like Japanese.
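
    The REV rule itself is concrete enough to sketch directly: split at punctuation and the topic marker "wa", reverse the words within each segment, and concatenate. Tokenization is assumed to have been done by a morphological analyzer, and OkaPU's refinement (keeping word order inside each base noun phrase) is omitted here.

        def rev_preorder(tokens):
            """REV preordering: reverse the words inside each segment
            delimited by punctuation or the topic marker wa (は)."""
            delimiters = {"、", "。", "は"}
            out, segment = [], []
            for tok in tokens:
                if tok in delimiters:
                    out += reversed(segment)
                    out.append(tok)  # the delimiter itself stays in place
                    segment = []
                else:
                    segment.append(tok)
            out += reversed(segment)
            return out

        # "kare wa hon o yonda" (he read a book)
        print(rev_preorder(["彼", "は", "本", "を", "読んだ"]))
        # -> ['彼', 'は', '読んだ', 'を', '本']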

  • Pattern-Based Statistical Machine Translation for NTCIR-10 PatentMT
    Jin'Ichi Murakami, Isamu Fujiwara and Masato Tokuhisa
    [Pdf] [Table of Content]

    Pattern-based machine translation is a very traditional machine translation method that uses translation patterns and translation word (phrase) dictionaries. Its characteristic is that high-quality translation results can be obtained if the input sentence matches a translation pattern and that pattern is correct. However, translation patterns and translation word dictionaries are usually made manually, so building a pattern-based machine translation system is costly. We propose building translation patterns and translation word dictionaries automatically using statistical machine translation methods, which lowers the cost of constructing such a system. We demonstrate the effectiveness of the proposed method on the Japanese-English patent machine translation task (NTCIR-10), where we obtained good results.

  • Description of KYOTO EBMT System in PatentMT at NTCIR-10
    Toshiaki Nakazawa and Sadao Kurohashi
    [Pdf] [Table of Content]

    This paper describes "KYOTO" EBMT system that attended PatentMT at NTCIR-10. When translating very different language pairs such as Japanese-English, it is very important to handle sentences in tree structures to overcome the difference. Many of recent studies incorporate tree structures in some parts of translation process, but not all the way from model training (parallel sentence alignment) to decoding. "KYOTO" system is a fully tree-based translation system where we use the treelet alignment model by bilingual generation and monolingual derivation on dependency trees, and example-based translation.

  • Use of the Japio Technical Field Dictionaries and Commercial Rule-based Engine for NTCIR-PatentMT
    Tadaaki Oshio, Tomoharu Mitsuhashi and Tsuyoshi Kakita
    [Pdf] [Table of Content]

    Japio provides various patent-related translation services and owns an original bilingual technical term database derived from patent documents (the Japio Terminology Database) for use by its translators. Currently the database contains more than 1,900,000 J-E bilingual technical terms. The Japio Technical Field Dictionaries (technical-field-oriented machine translation dictionaries) are created from the Japio Terminology Database based on each entry's frequency in the bilingual patent document corpora, which are also compiled by Japio. Japio applied the Japio Technical Field Dictionaries to a commercial machine translation engine for the NTCIR-10 PatentMT JE and EJ subtasks.

  • UQAM's System Description for the NTCIR-10 Japanese and English PatentMT Evaluation Tasks
    Fatiha Sadat and Fu Zhe
    [Pdf] [Table of Content]

    This paper describes the development of a Japanese-English and English-Japanese translation system for the NTCIR-10 PatentMT tasks. The MT system is based on the provided training data and the Moses decoder. We report our first attempt at statistical machine translation for these language pairs in the patent domain.

  • Using Parallel Corpora to Automatically Generate Training Data for Chinese Segmenters in NTCIR PatentMT Tasks
    Jui-Ping Wang and Chao-Lin Liu
    [Pdf] [Table of Content]

    Chinese texts do not contain spaces as word separators, unlike English and many alphabetic languages. To use Moses to train translation models, we must segment Chinese texts into sequences of Chinese words. In recent years, more and more software tools for Chinese segmentation have appeared on the Internet. However, some of these tools were trained on general texts, so they might not handle domain-specific terms in patent documents very well. Some machine-learning based tools require us to provide segmented Chinese text to train segmentation models. In both cases, providing segmented Chinese texts, whether to refine a pre-trained model or to create a new segmentation model, is an important basis for successful Chinese-English machine translation systems. Ideally, high-quality segmented texts should be created and verified by domain experts, but doing so would be quite costly. We explored an approach that algorithmically generates segmented texts from parallel texts and lexical resources. Our scores in NTCIR-10 PatentMT indeed improved over our scores in NTCIR-9 PatentMT with the new approach.

  • System Description of BJTU-NLP MT for NTCIR-10 PatentMT
    Peihao Wu, Jinan Xu, Yue Yin and Yujie Zhang
    [Pdf] [Table of Content]

    This paper presents an overview of the statistical machine translation systems and the example-based machine translation system that BJTU-NLP developed for the NTCIR-10 Patent Machine Translation Task (NTCIR-10 PatentMT). We used Japanese named entities in Japanese word segmentation and obtained a good result in the EJ subtask. Although we used an external chemical dictionary in our Chinese-to-English patent SMT system, it did not improve the BLEU score in our experiments.

  • An Improved Patent Machine Translation System Using Adaptive Enhancement for NTCIR-10 PatentMT Task
    Hai Zhao, Jingyi Zhang, Masao Utiyama and Eiichro Sumita
    [Pdf] [Table of Content]

    This paper describes the work that we conducted for the Chinese-English (CE) task of the NTCIR-10 patent machine translation evaluation. We built standard phrase-based and hierarchical phrase-based statistical machine translation (SMT) systems with optimized word segmentation, an adaptive language model and an improved parameter tuning strategy. Our systems outperform the official baselines by approximately 2 BLEU points.

  • TSUKU Statistical Machine Translation System for the NTCIR-10 PatentMT Task
    Zhongyuan Zhu, Jun-ya Norimatsu, Toru Tanaka, Takashi Inui and Mikio Yamamoto
    [Pdf] [Table of Content]

    This paper describes the details of the TSUKU machine translation system in the NTCIR-10 PatentMT task [8]. This system is an implementation of our tree-to-string statistical machine translation model that combines a context-free grammar (CFG) parse tree and a dependency parse tree.

  • ZZX_MT: the BeiHang MT System for NTCIR-10 PatentMT Task
    Wenhan Chao and Zhoujun Li
    [Pdf] [Table of Content]

    In this paper, we describe the ZZX_MT machine translation system for the NTCIR-10 Patent Machine Translation Task (PatentMT). We participated in the Chinese-English translation subtask and submitted four results, each corresponding to a different subtask. ZZX_MT is an SMT system that integrates the BTG constraint into its reordering models.

  • Overview of the Recognizing Inference in Text (RITE-2) at NTCIR-10
    Yotaro Watanabe, Yusuke Miyao, Junta Mizuno, Tomohide Shibata, Hiroshi Kanayama, Cheng-Wei Lee, Chuan-Jie Lin, Shuming Shi, Teruko Mitamura, Noriko Kando, Hideki Shima and Kohichi Takeda
    [Pdf] [Table of Content]

    This paper gives an overview of the RITE-2 (Recognizing Inference in TExt) task at NTCIR-10. We evaluated systems that automatically recognize semantic relations between sentences, such as paraphrase, entailment and contradiction, in Japanese, Simplified Chinese and Traditional Chinese. The tasks in RITE-2 are Binary Classification of entailment (BC subtask), Multi-Class Classification including paraphrase and contradiction (MC subtask), the Entrance Exam subtasks (Exam BC and Exam Search), Unit Test, and the RITE4QA subtask. We had 28 active participants and received 215 formal runs (110 Japanese runs, 53 Traditional Chinese runs, 52 Simplified Chinese runs). This paper also describes how the datasets for RITE-2 were developed and how the systems were evaluated, and reports the RITE-2 formal run results.

  • Construction of a Simple Inference System of Textual Similarity (oka1 RITE2)
    Hirotaka Morita and Koichi Takeuchi
    [Pdf] [Table of Content]

    Our motivation for joining the RITE-2 task is to understand what kinds of linguistic and common knowledge are needed to recognize textual similarity in the RITE-2 data. After a shallow manual analysis of the RITE-2 BC example data, we found that morpheme similarity would be a major factor in recognizing textual similarity. We therefore constructed a threshold-based inference system with the Japanese WordNet as a base system that can be extended with deeper linguistic knowledge in further development. The preliminary experimental results showed that the simple noun-similarity-based system outperformed the WordNet-based system; in the formal run, however, the WordNet-based system gave the highest F-measure among the simple systems we constructed.

  • Binary-class and Multi-class based Textual Entailment System
    Partha Pakray, Sivaji Bandyopadhyay and Alexander Gelbukh
    [Pdf] [Table of Content]

    This article presents the experiments carried out as part of our participation in Recognizing Inference in TExt (RITE-2) @NTCIR-10 for Japanese. RITE-2 has four subtasks: the Binary-class (BC) subtask for Japanese and Chinese, the Multi-class (MC) subtask for Japanese and Chinese, Entrance Exam for Japanese, and RITE4QA for Chinese. We submitted three runs in the BC subtask: one each for Japanese (JA), Chinese Simplified (CS) and Chinese Traditional (CT). Three runs were likewise submitted in the MC subtask, one for each language. We developed a textual entailment system based on machine translation using the web-based Google translator system. The system is based on a Support Vector Machine that uses features from lexical similarity, lexical distance, and syntactic similarity.

  • JAIST Participation at NTCIR-10 RITE-2
    Minh Quang Nhat Pham, Minh Le Nguyen and Akira Shimazu
    [Pdf] [Table of Content]

    Textual entailment recognition is a fundamental problem in natural language understanding. The task is to determine whether the meaning of one text can be inferred from the meaning of another. At NTCIR-10 RITE-2 this year, our second participation in this challenge, we use a modified version of our RTE system from NTCIR-9 RITE for four Japanese subtasks: BC, MC, ExamBC, and Unit Test. On the feature side, we remove features that do not help on the development set of each subtask and add some new features. On the machine learning side, we employ Bagging, a robust ensemble learning method. We conduct extra experiments to evaluate the effects of the features and the Bagging method on the accuracy of the RTE system.

  • KitAi: Textual Entailment Recognition System for NTCIR-10 RITE2
    Kazutaka Shimada, Yasuto Seto, Mai Omura and Kohei Kurihara
    [Pdf] [Table of Content]

    This paper describes Japanese textual entailment recognition systems for NTCIR-10 RITE2. We participated in the Japanese BC subtask and the ExamBC subtask. Our methods are based on machine learning techniques with surface-level, syntactic and semantic features. We use two ontologies, the Japanese WordNet and Nihongo-Goi-Taikei, and a Hierarchical Directed Acyclic Graph (HDAG) structure as the syntactic and semantic information. For the ExamBC task, the confidence value from a classifier is important for judging correctness, as in entrance exams. To predict a suitable confidence value, we apply a method that weights the outputs of several classifiers. In the formal runs, the best macro F1 scores of our methods were 77.11 for the BC task and 59.84 for the ExamBC task. Although the method based on SVMs was better than the others in terms of macro F1, the weighted scoring method produced the best performance on the correct answer ratio (45.4).

  • IASL RITE System at NTCIR-10
    Cheng-Wei Shih, Chad Liu, Cheng-Wei Lee and Wen-Lian Hsu
    [Pdf] [Table of Content]

    In our second participation in NTCIR RITE, we developed a two-stage knowledge-based textual inference recognition system for both the BC and MC subtasks in Chinese. Two main recognition systems, based on named entities, Chinese tokens, word dependencies, and sentence length, were implemented to identify entailment and contradiction between sentences. The evaluation results showed that our two-stage system achieved 0.6714 and 0.4632 on the Traditional Chinese BC and MC subtasks respectively, greatly surpassing our previous work in NTCIR-9 RITE. In the unofficial Simplified Chinese runs, the accuracy of our system also reached 0.6045 in BC and 0.5094 in MC.

  • NCCU-MIG at NTCIR-10: Using Lexical, Syntactic, and Semantic Features for the RITE Tasks
    Wei-Jie Huang and Chao-Lin Liu
    [Pdf] [Table of Content]

    We computed linguistic information at the lexical, syntactic, and semantic levels for the RITE (Recognizing Inference in TExt) tasks for both Traditional and Simplified Chinese in NTCIR-10. We employed techniques for syntactic parsing, named-entity recognition, and near-synonym recognition, and considered features like counts of common words, sentence lengths, negation words, and antonyms to judge the logical relationship between two sentences. Both heuristics-based functions and machine-learning approaches were explored. We focused on the BC (binary classification) task at the preparatory stage, but participated in both the BC and MC (multiple classes) evaluations. Three settings were submitted for the formal runs of each task. The best-performing settings achieved the second-best performance in the BC tasks, and were listed among the top five performers in the MC tasks for both Traditional and Simplified Chinese.
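
    As an illustration of the lexical portion of such a feature set, the hypothetical extractor below computes common-word counts, sentence lengths and a negation mismatch for a tokenized sentence pair; the paper's actual system also uses syntactic parses, named entities, near-synonyms and antonyms, which are omitted here.

        NEGATION_WORDS = {"不", "没", "没有", "无", "非"}  # toy negation lexicon

        def lexical_features(t1_tokens, t2_tokens):
            """Lexical-level features for judging whether t1 entails t2."""
            common = set(t1_tokens) & set(t2_tokens)
            return {
                "common_word_count": len(common),
                "common_word_ratio": len(common) / max(len(set(t2_tokens)), 1),
                "len_t1": len(t1_tokens),
                "len_t2": len(t2_tokens),
                "len_diff": len(t1_tokens) - len(t2_tokens),
                # an unmatched negation often signals contradiction
                "negation_mismatch": bool(NEGATION_WORDS & set(t1_tokens))
                                     != bool(NEGATION_WORDS & set(t2_tokens)),
            }

        # "He is a student." vs "He is not a student."
        print(lexical_features(["他", "是", "学生"], ["他", "不", "是", "学生"]))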

  • Team SKL's Strategy and Experience in RITE2
    Shohei Hattori and Satoshi Sato
    [Pdf] [Table of Content]

    This paper describes the strategies and systems of team SKL in the RITE2 workshop. We implemented three different systems. SKL-01 was designed around a two-step classification strategy: Step 1 assigns a default class to a given text pair by applying a simple rule based on an overlap measure; Step 2 examines whether to overwrite the default class by applying heuristic rules. In contrast, SKL-02/03 were designed around a different strategy focused on the development of character-based features. SKL-02 is an SVM-based system using these features, and SKL-03 works by hand-tuned decision rules. In the MC subtask of the formal run, our three systems ranked in the top three.

  • CYUT Chinese Textual Entailment Recognition System for NTCIR-10 RITE-2
    Shih-Hung Wu, Shan-Shan Yang, Liang-Pu Chen, Hung-Sheng Chiu and Ren-Dar Yang
    [Pdf] [Table of Content]

    Textual Entailment (TE) is a critical issue in natural language processing (NLP). In this paper we report our approach to Chinese textual entailment and the system results on both the Simplified and Traditional Chinese datasets of NTCIR-10 RITE-2. Our approach is based on closer observation of the training data and on finding more types of linguistic features. The approach complements the traditional machine learning approach, which treats every pair with a standard process. In the official runs, we tested three types of entailment features, i.e. the usage of negation words, time expressions, and numbers. The experimental results are promising, and we find that this extensible approach can incorporate more feature types.

  • WSD Team's Approaches for Textual Entailment Recognition at the NTCIR10 (RITE2)
    Daiki Ito, Masahiro Tanaka and Hayato Yamana
    [Pdf] [Table of Content]

    In this paper, we describe the WSD team's approaches for the textual entailment recognition task (RITE) at NTCIR-10, with experimental results for three Japanese subtasks: Binary Class (BC), Multi Class (MC) and Entrance Exam BC (ExamBC). Our approaches are mainly based on supervised learning techniques: Support Vector Machines (SVM) and Logistic Regression (LR). For the binary classification subtasks (BC and ExamBC), we proposed some hand-coded rules to classify text pairs as entailment or non-entailment. For the multi-class subtask (MC), we used features of the texts in both directions. The best performance across our three runs achieved a precision of 80.66% in the BC subtask, 69.53% in the MC subtask and 67.86% in the ExamBC subtask. We placed second in both BC and MC, and third in ExamBC.

  • Extracting Features for Machine Learning in NTCIR-10 RITE Task
    Lun-Wei Ku, Edward T.-H. Chu and Nai-Hsuan Han
    [Pdf] [Table of Content]

    The NTCIR-9 RITE task evaluated systems that automatically detect entailment, paraphrase, and contradiction in texts. The SinicaNLP team developed a preliminary, rule-based system for the NTCIR-9 RITE task. In NTCIR-10, we tried machine learning approaches. We transformed the existing rules into features and then added further syntactic and semantic features for an SVM. The straightforward assumption from NTCIR-9 was kept in NTCIR-10: the relation between two sentences is determined by the parts that differ between them rather than the parts that are identical. Therefore, the NTCIR-9 features, including sentence lengths, the content of matched keywords, the quantities of matched keywords, and their parts of speech, were considered together with new features, including parse tree information, dependency relations, negation words and synonyms. We found that some features were useful for the BC subtask while others helped more in the MC subtask.

  • IMTKU Textual Entailment System for Recognizing Inference in Text at NTCIR-10 RITE2
    Min-Yuh Day, Chun Tu, Shih-Jhen Huang, Hou-Cheng Vong and Sih-Wei Wu
    [Pdf] [Table of Content]

    In this paper, we describe the IMTKU (Information Management at TamKang University) textual entailment system for recognizing inference in text at NTCIR-10 RITE2 (Recognizing Inference in Text). We proposed a textual entailment system using a hybrid approach that integrates semantic features and machine learning techniques. We submitted 3 official runs for the BC, MC, and RITE4QA subtasks. In the NTCIR-10 RITE2 task, the IMTKU team achieved 0.509 in the CT-MC subtask and 0.663 in the CT-BC subtask; 0.402 in the CS-MC subtask and 0.627 in the CS-BC subtask; and, in terms of MRR, 0.257 in the CT-RITE4QA subtask and 0.338 in the CS-RITE4QA subtask.

  • KC99: A Prediction System for Chinese Textual Entailment Relation Using Decision Tree
    Tao-Hsing Chang, Yao-Chi Hsu, Chung-Wei Chang, Yao-Chuan Hsu and Jen-I Chang
    [Pdf] [Table of Content]

    The aim of this study is to propose a system that can automatically deduce the entailment relation of a text pair. A decision tree is used as the prediction model, and seven features of each text pair serve as its input. The experimental results on the formal-run dataset were evaluated by NTCIR: in the CT-BC task, the Macro-F1 of the proposed method is 57.67%, and in the CT-MC task it is 43.73%. A minimal sketch of such a predictor is given below.
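
    As a minimal sketch under assumptions, the snippet below trains a decision tree on seven per-pair features; the feature values and toy data are placeholders, not the paper's actual seven features.

    ```python
    # Decision-tree prediction over seven numeric features per text pair,
    # e.g. length ratio, character overlap, word overlap, ... (all hypothetical).
    from sklearn.tree import DecisionTreeClassifier

    X_train = [
        [0.9, 0.8, 0.7, 0.1, 0.0, 1.0, 0.5],   # features of an entailing pair
        [0.2, 0.1, 0.3, 0.9, 1.0, 0.0, 0.4],   # features of a non-entailing pair
    ]
    y_train = ["Y", "N"]

    clf = DecisionTreeClassifier(max_depth=4, random_state=0)
    clf.fit(X_train, y_train)
    print(clf.predict([[0.8, 0.7, 0.6, 0.2, 0.0, 1.0, 0.5]]))  # -> ['Y']
    ```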

  • BCMI-NLP Labeled-Alignment-Based Entailment System for NTCIR-10 RITE-2 Task
    Xiao-Lin Wang, Hai Zhao and Bao-Liang Lu
    [Pdf] [Table of Content]

    In this paper, we propose a labeled-alignment-based RTE method for the Simplified Chinese textual entailment track of the NTCIR-10 RITE-2 task. Labeled alignment, in contrast to normal alignment, employs negative links to explicitly mark contradictory expressions between the two sentences and thereby justify non-entailment pairs. The corresponding alignment-based RTE method can therefore gain accuracy by actively detecting signals of non-entailment. In the formal run, the proposed method achieves Macro-F1 scores of 73.84% and 56.82%, and a Worse Ranking (R) of 8.00%, for the Simplified Chinese Binary Class (BC), Multi-Class (MC), and RITE for Question Answering (RITE4QA) subtasks, respectively.

  • WUST at NTCIR-10 RITE-2 Task: Multiple Feature Approach to Chinese Textual Entailment
    Maofu Liu, Yue Wang, Yan Li and Huijun Hu
    [Pdf] [Table of Content]

    This paper describes our work on the NTCIR-10 RITE-2 Binary-class (BC) and Multi-class (MC) subtasks in Simplified Chinese. We construct a classification model based on support vector machines to recognize semantic inference in Chinese text pairs: entailment versus non-entailment for the BC subtask, and forward entailment, bidirectional entailment, contradiction, and independence for the MC subtask. In our system, we use multiple features, including statistical, lexical, and syntactic features.

  • Expanded Dependency Structure based Textual Entailment Recognition System of NTTDATA for NTCIR10-RITE2
    Megumi Ohki, Takashi Suenaga, Daisuke Satoh, Yuji Nomura and Toru Takaki
    [Pdf] [Table of Content]

    This paper describes NTT DATA's recognizing textual entailment (RTE) systems for NTCIR-10 RITE2. We participated in four Japanese tasks: the BC subtask, Unit Test, Exam BC, and Exam Search. Our approach uses the ratio of matching semantic relations between words. To recognize textual entailment, two semantic viewpoints must be captured: the semantic relations between words in a sentence and the meanings of the words themselves. We therefore divide our methods into those handling semantic dependency relations between words and those handling word meanings. In this paper, we present our system, which recognizes semantic relations using expanded dependency structures.

  • The Description of the NTOU RITE System in NTCIR-10
    Chuan-Jie Lin and Yu-Cheng Tu
    [Pdf] [Table of Content]

    A textual entailment system determines whether one sentence entails another in the common-sense reading. This is the second RITE task in the NTCIR series. Three subtasks, BC, MC, and RITE4QA, were held this time. We proposed several new features and constructed RITE systems using binary- and multi-class classifiers. After correcting errors in our submitted runs, our best (unofficial) system in the BC subtask achieves 65.12% in macro F-measure and 66.52% in accuracy. The performance of our MC classifiers is around 44.8% in macro F-measure and 56.64% in accuracy. Our best (unofficial) system in the RITE4QA subtask achieves 32.67% in Top1 accuracy, 41.74% in MRR, and 56% in Top5 accuracy with respect to WorseRanking.

  • Local Graph Matching with Active Learning for Recognizing Inference in Text at NTCIR-10
    Tsuyoshi Okita
    [Pdf] [Table of Content]

    This paper describes the textual entailment system developed at Dublin City University for participation in the textual entailment task at NTCIR-10. Our system is a local graph matching-based system with active learning: we explore reducing unknown words and unknown named entities; incorporating the meaning of parentheses, rhetorical expressions, and semantic roles; and employing a text-understanding technique based on simple logic. We also deploy an additional language-model feature obtained from deep learning. Our results were a macro F1 score of 80.49, a precision of 84.95 for positive entailment, and a recall of 79.95 for negative entailment.

  • EHIME Textual Entailment System Using Markov Logic in NTCIR-10 RITE-2
    Yuji Takesue and Takashi Ninomiya
    [Pdf] [Table of Content]

    This paper reports the methods used by the EHIME team for textual entailment recognition in NTCIR-10, RITE-2. We participated in the Japanese BC subtask and Japanese MC subtask. We used Markov logic to infer textual entailment relations. In our Markov logic network, words and hyponyms are used as features.

  • Detecting Contradiction in Text by Using Lexical Mismatch and Structural Similarity
    Daniel Andrade, Masaaki Tsuchida, Takashi Onishi and Kai Ishikawa
    [Pdf] [Table of Content]

    Detecting contradicting statements is a crucial sub-problem for filtering textual entailment pairs. Given two topically related texts T1 and T2, the task is to decide whether the two statements are in contrast to each other or not. In many situations a mismatch in named entities or numerical expressions is a strong clue for a contradiction between T1 and T2. However, if the dependency parse trees are quite different and the words of T1 and T2 cannot be aligned well, then a lexical mismatch is not sufficient to conclude that T1 and T2 contradict each other. We present a new method built on the assumption that the higher the structural similarity of two sentences, the higher the chance that a contradiction at the word level indicates a contradiction at the sentence level. We participated in two subtasks of RITE2 at NTCIR-10 which contain many contradicting statements in a real-world setting. Our system took second place in the ExamBC subtask (formal run, official result) and first place in the ExamSearch subtask (formal run, unofficial result). We show that our proposed method contributed to the improvement in both tasks. A minimal sketch of the central assumption follows.
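
    The snippet below is a minimal, hypothetical rendering of that assumption: a word-level mismatch score is weighted by a crude structural similarity, so that mismatches only count as contradictions when the sentences align well. Both scoring functions are simplified stand-ins, not the paper's actual model.

    ```python
    # Weight a lexical mismatch by structural similarity of dependency edges.

    def structural_similarity(deps1: set, deps2: set) -> float:
        """Jaccard overlap of dependency edges as a crude structural similarity."""
        if not deps1 and not deps2:
            return 1.0
        return len(deps1 & deps2) / len(deps1 | deps2)

    def contradiction_score(lexical_mismatch: float, deps1: set, deps2: set) -> float:
        # The more similar the structures, the more a mismatch in named
        # entities or numbers signals a sentence-level contradiction.
        return lexical_mismatch * structural_similarity(deps1, deps2)

    d1 = {("won", "nsubj", "Japan"), ("won", "dobj", "medal")}
    d2 = {("won", "nsubj", "China"), ("won", "dobj", "medal")}
    print(contradiction_score(1.0, d1, d2))  # ~0.33: partial overlap, moderate signal
    ```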

  • FLL: Local Alignments based Approach for NTCIR-10 RITE-2
    Takuya Makino, Seiji Okajima, and Tomoya Iwakura
    [Pdf] [Table of Content]

    This paper describes the textual entailment system of FLL for the RITE-2 task at NTCIR-10. Our system is based on a set of local alignments conducted at different linguistic unit levels, such as word, Japanese base phrase, numerical expression, named entity, and sentence, and uses features obtained from the local alignment results. We applied our system to the Japanese BC and MC tasks in the formal run, and to the Japanese UnitTest task in an unofficial run. Our system outperformed the baseline on the BC and MC tasks, and achieved the best performance on the UnitTest.

  • BnO at NTCIR-10 RITE: A Strong Shallow Approach and an Inference-based Textual Entailment Recognition System
    Ran Tian, Yusuke Miyao, Takuya Matsuzaki and Hiroyoshi Komatsu
    [Pdf] [Table of Content]

    The BnO team participated in the Recognizing Inference in TExt (RITE) subtask of the NTCIR-10 Workshop [5]. This paper describes our textual entailment recognition system with experimental results for the five Japanese subtasks: BC, MC, EXAM-BC, EXAM-SEARCH, and UnitTest. Our approach includes a shallow method based on word overlap features and named entity recognition, and a novel inference-based approach utilizing an inference engine to explore relations among algebraic forms of sets, which are computed from a tree representation similar to dependency-based compositional semantics.

  • THK's Natural Logic-based Compositional Textual Entailment Model at NTCIR-10 RITE-2
    Yotaro Watanabe, Junta Mizuno and Kentaro Inui
    [Pdf] [Table of Content]

    This paper describes the THK system that participated in the BC, MC, ExamBC, and UnitTest subtasks of NTCIR-10 RITE-2. Our system learns plausible transformations of pairs of text t_1 and hypothesis t_2 only from the semantic labels of the pairs, using a discriminative probabilistic model combined with the framework of Natural Logic. The model is trained to prefer alignments and semantic relations that infer the correct sentence-level semantic relations. In the formal run, we achieved the highest performance at detecting contradictions in the MC subtask (F1 of 28.57).

  • Predicate-argument Structure based Textual Entailment Recognition System of KYOTO Team for NTCIR-10 RITE-2
    Tomohide Shibata, Sadao Kurohashi, Shotaro Kohama, Akihiro Yamamoto
    [Pdf] [Table of Content]

    We participated in the Japanese tasks of RITE-2 at NTCIR-10 (team id: "KYOTO"). Our proposed method regards predicate-argument structure as the basic unit for handling the meaning of a text/hypothesis. Our system first performs predicate-argument structure analysis on both the text and the hypothesis, and then performs matching between them. In matching text and hypothesis, wide-coverage relations between words/phrases, such as synonym and is-a relations, are utilized; these are automatically acquired from a dictionary, a Web corpus, and Wikipedia.

  • IBM Team at NTCIR-10 RITE2: Textual Entailment Using Temporal Dimension Reduction
    Masaki Ohno, Yuta Tsuboi, Hiroshi Kanayama and Katsumasa Yoshikawa
    [Pdf] [Table of Content]

    Our system for the Japanese BC/EXAM subtasks at NTCIR-10 RITE2 is an extension of our previous system for NTCIR-9 RITE. The new techniques are (1) case-aware noun phrase matching using ontologies, motivated by capturing finer syntactic structures than simple word matching; we use ontologies to allow flexible matching of noun phrases; and (2) temporal expression matching after mapping historical entities to specific time intervals, motivated by expanding the capabilities of temporal expression matching. From the experimental results, we found that coverage is more important than accuracy in the temporal entity mapping. The scores of the formal runs were 74.9% (accuracy in BC) and 64.5% (accuracy in EXAM), which outperformed the baselines provided by the organizer.

  • TKDDI group at NTCIR10-RITE2: Recognizing Texual Entailment Based on Dependency Structure Alignment
    Junta Mizuno, Kentaro Inui, Asuka Sumida, Gen Hattori and Chihiro Ono
    [Pdf] [Table of Content]

    This paper describes the TKDDI system, which participated in NTCIR10-RITE2. We propose a method for recognizing textual entailment using not only word alignment but also syntactic dependency structure alignment. Entailment is then recognized from the overlap of the dependency structures. Our system achieved a macro-F1 of 63.83 on JA-BC, 49.08 on JA-ExamBC, and 74.00 on JA-UnitTest.

  • The WHUTE System in NTCIR-10 RITE Task
    Han Ren, Hongmiao Wu, Chen Lv, Donghong Ji and Jing Wan
    [Pdf] [Table of Content]

    This paper describes our system for recognizing textual entailment in the RITE Traditional and Simplified Chinese subtasks at NTCIR-10. We build a textual entailment recognition framework and implement a system that employs three categories of features (string, structure, and linguistic features) for the recognition. In addition, an entailment transformation approach is leveraged to align text fragments in each pair. We also utilize a cascaded recognition strategy, which first judges entailment versus non-entailment, and then determines the forward, bidirectional, contradiction, or independence relation of each text pair in turn. Official results show that our system achieves a MacroF1 of 65.55% in the Traditional BC subtask, 45.50% in the Traditional MC subtask, 61.65% in the Simplified BC subtask, and 46.79% in the Simplified MC subtask. In the RITE4QA subtasks, our system achieves a WorseRanking Top1 accuracy of 27.33% in the Traditional subtask and 18.67% in the Simplified subtask.

  • Combining Multiple Lexical Resources for Chinese Textual Entailment Recognition
    Yu-Chieh Wu, Yue-Shi Lee and Jie-Chi Yang
    [Pdf] [Table of Content]

    Identifying textual entailment is the task of finding the relationship between a given hypothesis and text fragments. Developing a high-performance text paraphrasing system usually requires rich external knowledge such as syntactic parsers and thesauri, which are limited for Chinese since the Chinese word segmentation problem must be resolved first. Following last year's work, we continue to adopt the RITE system we created and combine it with multiple freely available online thesauri. We derive two exclusive feature sets for the learners: one consists of the operations between the text pairs, while the other adopts the traditional bag-of-words model. Finally, we train the classifier with the above features. The official results indicate the effectiveness of our method.

  • Overview of the NTCIR-10 SpokenDoc-2 Task
    Tomoyosi Akiba, Hiromitsu Nishizaki, Kiyoaki Aikawa, Xinhui Hu, Yoshiaki Itoh, Tatsuya Kawahara, Seiichi Nakagawa, Hiroaki Nanjo and Yoichi Yamashita
    [Pdf] [Table of Content]

    This paper gives an overview of the IR for Spoken Documents Task at the NTCIR-10 Workshop, in which a spoken term detection (STD) subtask and an ad-hoc spoken content retrieval (SCR) subtask were conducted. Both subtasks target the search of terms, passages, and documents in academic oral presentations. This paper explains the data used in the subtasks, how the transcriptions were produced by speech recognition, and the details of each subtask.

  • Using Multiple Speech Recognition Results to Enhance STD with Suffix Array on the NTCIR-10 SpokenDoc-2 Task
    Kouichi Katsurada, Koudai Katsuura, Kheang Seng, Yurie Iribe and Tsuneo Nitta
    [Pdf] [Table of Content]

    We have previously proposed a fast spoken term detection method that uses a suffix array as its data structure. By applying dynamic time warping on a suffix array, we achieved very quick keyword detection in very large-scale spoken documents. In this study, we modify our method so that it can deal with multiple recognition results. By using results obtained from various speech recognizers, search performance improves as a consequence of the complementary effect of different language and acoustic models. Experimental results show that the maximum F-measure and the MAP score increased by 6% to 10%. A sketch of suffix-array lookup over a transcription is given below.
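
    As a minimal sketch under assumptions, the snippet below shows plain suffix-array lookup over a subword transcription (here a toy syllable string); the paper's DTW-based approximate matching on the array and the fusion of multiple recognizers' outputs are omitted. Requires Python 3.10+ for the key= argument of bisect.

    ```python
    # Exact substring search over a transcription via a suffix array.
    from bisect import bisect_left, bisect_right

    def build_suffix_array(text: str) -> list:
        return sorted(range(len(text)), key=lambda i: text[i:])

    def find_occurrences(text: str, sa: list, query: str) -> list:
        key = lambda i: text[i:i + len(query)]
        lo = bisect_left(sa, query, key=key)
        hi = bisect_right(sa, query, key=key)
        return sorted(sa[lo:hi])

    doc = "kaigianaunsukaigi"  # toy concatenated syllable transcription
    sa = build_suffix_array(doc)
    print(find_occurrences(doc, sa, "kaigi"))  # -> [0, 12]
    ```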

  • An STD System for OOV Query Terms Integrating Multiple STD Results of Various Subword units
    Kazuma Kon'no, Hiroyuki Saito, Shirou Narumi, Kenta Sugawara, Kesuke Kamata, Manabu Kon'no, Jinki Takahashi and Yoshiaki Itoh
    [Pdf] [Table of Content]

    We have proposed a Spoken Term Detection (STD) method for out-of-vocabulary (OOV) query terms that integrates various subword recognition results obtained using monophone, triphone, demiphone, one-third-phone, and Sub-phonetic segment (SPS) models. In the proposed method, subword-based ASR (Automatic Speech Recognition) is performed on all spoken documents, and subword recognition results are generated using subword acoustic models and subword language models. When a query term is given, its subword sequence is searched for in the subword recognition results of the spoken documents, using acoustic distances between subwords when matching the two subword sequences by Continuous Dynamic Programming. We have also proposed a method for re-scoring and integrating the multiple STD results obtained with the various subword units. Each candidate segment carries a distance, a segment number, and a document number; re-scoring uses the distances of the high-ranked candidate segments, and the final distance is obtained by integrating them linearly with weighting factors. In the STD tasks (SDPWS) of IR for Spoken Documents at NTCIR-10, we apply the various subword models and integrate the multiple STD results obtained from them. A minimal sketch of the linear integration step appears below.
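
    The snippet below is a minimal, hypothetical sketch of linearly integrating per-subword-unit distances with weighting factors; the unit names are from the abstract, but the weights and data are illustrative.

    ```python
    # Fuse STD candidate distances from several subword units by a weighted sum
    # (smaller fused distance = better candidate).

    def integrate(candidates: dict, weights: dict) -> dict:
        """candidates[unit][(doc, segment)] = distance from that unit's search."""
        fused = {}
        for unit, hits in candidates.items():
            for key, dist in hits.items():
                fused[key] = fused.get(key, 0.0) + weights[unit] * dist
        return dict(sorted(fused.items(), key=lambda kv: kv[1]))

    candidates = {
        "triphone":  {("doc1", 3): 0.20, ("doc2", 7): 0.55},
        "monophone": {("doc1", 3): 0.35, ("doc2", 7): 0.30},
    }
    weights = {"triphone": 0.7, "monophone": 0.3}  # assumed weighting factors
    print(integrate(candidates, weights))  # ("doc1", 3) ranks first
    ```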

  • Spoken Content Retrieval Using Distance Combination and Spoken Term Detection Using Hash Function for NTCIR10 SpokenDoc2 Task
    Satoru Tsuge, Ken Ichikawa, Norihide Kitaoka, Kazuya Takeda and Kenji Kita
    [Pdf] [Table of Content]

    In this paper, we describe the spoken content retrieval (SCR) method and the spoken term detection (STD) method with which we participated in the 2nd round of the IR for Spoken Documents (SpokenDoc-2) task. The SCR method maps the target documents into multiple vector spaces: a word-based vector space constructed from word-based speech recognition results and a syllable-based vector space constructed from syllable-based speech recognition results. In the syllable-based space, latent semantic indexing (LSI) is applied to the document vectors; in the word-based space, we apply query expansion and morpheme weighting. Finally, the distances between the query and the document in the individual vector spaces are combined to retrieve the documents. In the STD method, the sub-sequences extracted from the target documents are converted into bit sequences using a hash function, and a query is converted into a bit sequence in the same way. Candidates are detected by calculating the Hamming distance between the bit sequence of the query and those of the target documents; the STD method then calculates the distances between the query and the candidates by DP matching. To evaluate the proposed methods, we conducted spoken document retrieval experiments on the NTCIR-9 SpokenDoc task; using these results to set the parameters, we submitted runs to the SpokenDoc-2 task at NTCIR-10. A minimal sketch of the hash-then-verify step is given below.
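
    The snippet below is a minimal, hypothetical sketch of the hash-then-verify idea: subsequences and the query are mapped to bit strings, a cheap Hamming-distance check prunes the candidates, and the expensive DP matching would run only on the survivors. The one-bit-per-character hash is an illustrative stand-in for the paper's hash function.

    ```python
    # Prune STD candidates by Hamming distance between hashed bit sequences.

    def to_bits(s: str, alphabet: str = "aiueokmnst") -> int:
        bits = 0
        for i, c in enumerate(alphabet):
            if c in s:
                bits |= 1 << i
        return bits

    def hamming(a: int, b: int) -> int:
        return bin(a ^ b).count("1")

    query = "kaisetsu"
    subsequences = ["kaiseki", "onsei", "kaisetu"]
    q = to_bits(query)
    survivors = [s for s in subsequences if hamming(to_bits(s), q) <= 2]
    print(survivors)  # only these would be verified by DP matching
    ```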

  • DCU at NTCIR-10 SpokenDoc2 Passage Retrieval Task
    Maria Eskevich and Gareth J. F. Jones
    [Pdf] [Table of Content]

    We describe details of our runs and the results obtained for the "2nd round of IR for Spoken Documents (SpokenDoc2)" task. The focus of the subtask in which we participated is passage retrieval from the Corpus of the Spoken Document Processing Workshop (SDPWS). Previously we investigated the use of different content-based segmentation methods that provide topically coherent units for retrieval. For NTCIR-10 we compared content-based segmentation (TextTiling) with fixed-length segmentation into a fixed number of Inter-Pausal Units (IPUs) using a sliding window, with a further combination of overlapping segments into one unit in the ranked result list. Another focus of our submissions to NTCIR-10 is the potential use of external data for document expansion, for which we used a DBpedia collection for IPU expansion with all segmentation methods.

  • Spoken Document Retrieval Using Extended Query Model and Web Documents
    Kiichi Hasegawa, Masanori Takehara, Satoshi Tamura and Satoru Hayamizu
    [Pdf] [Table of Content]

    This paper proposes a novel approach to spoken document retrieval. In our method, a query model, a form of probabilistic language model, is adopted in order to compute the probability of generating a given query from each document. We employ not only a 'static' document collection consisting of the target documents but also a 'dynamic' document collection of web documents related to the queries. We expand the query model to incorporate probabilities obtained from the 'static' and 'dynamic' language models using Dirichlet smoothing, and, to further improve the retrieved results, we develop a weighting method for the web documents. Experiments on the NTCIR-9 SpokenDoc dry run and the NTCIR-10 SpokenDoc-2 formal run were conducted, and the proposed scheme was found to perform competitively with conventional methods. A minimal sketch of Dirichlet-smoothed query likelihood follows.
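
    As a minimal sketch under assumptions, the snippet below scores a document by a standard Dirichlet-smoothed query likelihood, mixing the document model with a background collection model; the paper's additional 'dynamic' web-document model and its weighting are not reproduced here.

    ```python
    # Dirichlet-smoothed query log-likelihood of a document.
    import math

    def query_log_likelihood(query, doc_tf, doc_len, coll_prob, mu=2000.0):
        score = 0.0
        for w in query:
            # P(w|d) = (tf(w,d) + mu * P(w|C)) / (|d| + mu)
            p = (doc_tf.get(w, 0) + mu * coll_prob.get(w, 1e-9)) / (doc_len + mu)
            score += math.log(p)
        return score

    doc_tf = {"speech": 5, "retrieval": 2}                           # toy document
    coll_prob = {"speech": 0.01, "retrieval": 0.005, "lecture": 0.002}
    print(query_log_likelihood(["speech", "lecture"], doc_tf, 100, coll_prob))
    ```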

  • Spoken Document Retrieval Experiments for SpokenDoc-2 at Ryukoku University (RYSDT)
    Hiroaki Nanjo, Tomohiro Nishio and Takehiko Yoshimi
    [Pdf] [Table of Content]

    In this paper, we describe the spoken document retrieval systems of Ryukoku University, which participated in the NTCIR-10 IR for Spoken Documents (SpokenDoc-2) task. The NTCIR-10 SpokenDoc-2 task has two subtasks: the spoken term detection (STD) subtask and the ad-hoc spoken content retrieval (SCR) subtask. We participated in the SCR subtask as team RYSDT, and this paper describes our SCR systems.

  • DTW-Distance-Ordered Spoken Term Detection and STD-based Spoken Content Retrieval: Experiments at NTCIR-10 SpokenDoc-2
    Tomoyosi Akiba, Tomoko Takigami, Teppei Ohno and Kenta Kase
    [Pdf] [Table of Content]

    In this paper, we report our experiments in the NTCIR-10 SpokenDoc-2 task. We participated in both the STD and SCR subtasks. For the STD subtask, we applied a novel indexing method, called metric subspace indexing, that we proposed previously. One of the distinctive advantages of the method is that it can output detection results in increasing order of distance without using any predefined distance threshold. In this work, we extend the algorithm to work with the Dynamic Time Warping (DTW) distance. We also participated in the newly introduced iSTD task using the proposed STD method. For the SCR subtask, two kinds of approaches were applied at both the lecture and the passage level. The first approach used STD to detect the query terms in the spoken documents, and the detection results were used to calculate the similarities between the query and the documents according to the vector space model. The second approach used IR models based on language modeling, and mainly focused on determining the optimal range of a relevant passage as the retrieval result.

  • STD and SCR Techniques and Their Evaluations on the NTCIR-10 SpokenDoc-2 Task
    Yuto Furuya, Daiki Nakagomi, Satoshi Natori, Hiromitsu Nishizaki and Yoshihiro Sekiguchi
    [Pdf] [Table of Content]

    This paper describes spoken term detection (STD) and spoken content retrieval (SCR) techniques and their evaluations in the NTCIR-10 SpokenDoc-2 task. First, we describe our STD technique, which uses a phoneme transition network (PTN) derived from the outputs of multiple speech recognizers, and its evaluation in the STD and iSTD (inexistent STD) subtasks. Next, we introduce our SCR technique, which uses Web documents to expand the target spoken documents; it is evaluated on the two SCR subtasks.

  • Spoken Document Retrieval by Contents Complement and Keyword Expansion Using Subordinate Concept for NTCIR-SpokenDoc
    Noboru Kanedera
    [Pdf] [Table of Content]

    In this paper, we report our experiments in the NTCIR-10 IR for Spoken Documents (SpokenDoc) task, in which we participated in the SDR subtask. Keyword expansion using subordinate concepts and a dictionary improved the mean average precision (MAP) from 0.320 to 0.324 for the lecture retrieval task. For the passage retrieval task, content complement, keyword expansion, and subwords were all used. Subwords were effective because in many cases a query keyword was not contained in the target. Moreover, we found that the beginning subtopic is useful as topic information in the content complement.

  • YLAB@RU at Spoken Term Detection Task in NTCIR-10 SpokenDoc-2
    Iori Sakamoto, Kook Cho, Masanori Morise and Yoichi Yamashita
    [Pdf] [Table of Content]

    The development of spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, is being widely pursued in order to provide easy access to large amounts of multimedia content that includes speech. This paper describes an improvement to the STD method based on vector quantization (VQ) that we proposed at NTCIR-9 SpokenDoc. Spoken documents are represented as sequences of VQ codes, and they are matched against a text query based on the V-P score, which measures the relationship between a VQ code and a phoneme. The matching score between VQ codes and phonemes is calculated after normalization for each phoneme in the query term, to avoid scores biased toward particular phonemes. The score normalization improves STD performance by 6% in F-measure.

  • Spoken Term Detection by N-gram Index with Exact Distance for NTCIR-SpokenDoc2
    Nagisa Sakamoto and Seiichi Nakagawa
    [Pdf] [Table of Content]

    For spoken term detection, it is very important to handle out-of-vocabulary (OOV) terms; sub-word-based recognition and retrieval methods have therefore been proposed. This paper describes a very fast Japanese spoken term detection system that is robust to OOV words. We used individual syllables as the sub-word unit in continuous speech recognition and built an n-gram index of syllables over the recognized syllable lattice. We propose an n-gram indexing/retrieval method in the syllable lattice that addresses OOV terms and achieves high-speed retrieval. Specifically, in this paper we apply our method to the SDPWS speech data and report the evaluation results. A minimal sketch of syllable n-gram indexing appears below.
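
    The snippet below is a minimal sketch of an n-gram index over recognized syllable sequences; the paper indexes n-grams in a syllable lattice, which is simplified here to a single 1-best string per document, with one character standing in for one syllable.

    ```python
    # Syllable n-gram inverted index: OOV-safe because no word lexicon is needed.
    from collections import defaultdict

    def ngrams(syllables, n=3):
        return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]

    def build_index(docs, n=3):
        index = defaultdict(set)
        for doc_id, syls in docs.items():
            for g in ngrams(syls, n):
                index[g].add(doc_id)
        return index

    def search(index, query_syls, n=3):
        """Documents containing every syllable n-gram of the query."""
        hits = [index.get(g, set()) for g in ngrams(query_syls, n)]
        return set.intersection(*hits) if hits else set()

    docs = {"d1": list("kaigionsei"), "d2": list("onseininshiki")}
    idx = build_index(docs)
    print(search(idx, list("onsei")))  # -> {'d1', 'd2'}
    ```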

  • Spoken Term Detection Using Distance-Vector based Dissimilarity Measures and Its Evaluation on the NTCIR-10 SpokenDoc-2 Task
    Naoki Yamamoto and Atsuhiko Kai
    [Pdf] [Table of Content]

    In recent years, demand for distributing and searching multimedia content has been increasing rapidly, and more effective methods for multimedia information retrieval are desirable. In studies on spoken document retrieval systems, much research has focused on the task of spoken term detection (STD), which locates a given search term in a large set of spoken documents. One of the most popular approaches performs indexing based on sub-word sequences converted from the recognition hypotheses of an LVCSR decoder, in order to cope with recognition errors and OOV problems. In this paper, we propose acoustic dissimilarity measures for improved STD performance. The proposed measures are based on a feature sequence of distance-vector representation, which consists of the distances between all pairs of distributions in a set of sub-word HMMs and represents a structural feature. The experimental results showed that our two-pass STD system with the new acoustic dissimilarity measure improves performance compared with an STD system using a conventional acoustic measure.

  • NTCIR-10 Math Pilot Task Overview
    Akiko Aizawa, Michael Kohlhase and Iadh Ounis
    [Pdf] [Table of Content]

    This paper presents an overview of a new pilot task, the NTCIR Math Task, which is specifically dedicated to information access to mathematical content. In particular, the paper summarizes the subtasks addressed at the NTCIR Math Task as well as the main approaches deployed by the participating groups.

  • The Abject Failure of Keyword IR for Mathematics Search: Berkeley at NTCIR-10 Math
    Ray R. Larson, Chloe J. Reynolds and Fredric C. Gey
    [Pdf] [Table of Content]

    This paper demonstrates that classical content search using individual keywords is inadequate for mathematical formulae search. For the NTCIR10 Math Pilot Task, the authors used a standard indexing by content word for search coupled with search for components of mathematical formulae. This was followed by formula extraction from the top ranked documents. Performance was terrible, even for partial relevance. The further inclusion of some manual reformulation of topics into queries did not improve retrieval performance.

  • Querying Large Collections of Mathematical Publications: NTCIR10 Math Task
    Moritz Schubotz, Marcus Leich and Volker Markl
    [Pdf] [Table of Content]

    In this paper, we present our approach for searching mathematical formulae. We focus on a batch query approach that does not rely on specialized indexes, which are usually domain dependent and restrict the expressiveness of the query language. Instead, we use Stratosphere, a distributed data processing platform for Big Data Analytics that accesses data in a non-indexed format. This system is very effective for answering batches of queries that a researcher may wish to evaluate in bulk on large data sets. We demonstrate our approach using the NTCIR10 Math task, which provides a set of formula patterns and a test data corpus. We showcase a simple data analysis program for answering the given queries. We interpret the patterns as regular expressions and assume that matches to these expressions are also relevant search results to the end-user. Based on the evaluation of our results by mathematicians from Zentralblatt Math and mathematics students from Jacobs University, we conclude that our assumption holds principally with regard to precision and recall. Our work is just a first step towards a well-defined query language and processing system for scientific publications that allows researchers to specify their information need in terms of mathematical formulae and their contexts. We envision that our system can be utilized to realize such a vision.

  • MathWebSearch at NTCIR-10
    Michael Kohlhase and Corneliu Prodescu
    [Pdf] [Table of Content]

    We present and analyze the results of the MATHWEBSEARCH system in the NTCIR-10 Math pilot task, a challenge in mathematical information retrieval. MATHWEBSEARCH is a content-based search engine that focuses on fast query answering for interactive applications. It is currently restricted to exact formula search, i.e. no similarity search and no full-text search. As the MATHWEBSEARCH system has been described elsewhere, we will only present new achievements, evaluate the results, and detail future work suggested by the task results.

  • The MCAT Math Retrieval System for NTCIR-10 Math Track
    Goran Topić, Giovanni Yoko Kristianto, Minh-Quoc Nghiem and Akiko Aizawa
    [Pdf] [Table of Content]

    The NTCIR Math Track targets mathematical content access based on both natural language text and mathematical formulae. This paper describes the participation of the MCAT group in the math retrieval and math understanding subtasks. We introduce our mathematical search system, which is capable of formula search and full-text search, as well as our mathematical description extraction system, which is based on a support vector machine model. Experimental results show that our general-purpose search engine can work reasonably well with math queries.

  • Similarity Search for Mathematics: Masaryk University Team at the NTCIR-10 Math Task
    Martin Líška, Petr Sojka and Michal Růžička
    [Pdf] [Table of Content]

    This paper describes and summarizes the experience of the Masaryk University team MIRMU with mathematical search in the NTCIR pilot Math Task. Our approach is similarity search based on enhanced full-text search, utilizing attested state-of-the-art techniques and implementations. The versatility of the Math Indexer and Searcher (MIaS) system with respect to math query notation was tested by submitting multiple runs with the four query notations provided. The analysis of the evaluation results shows that the system performs best using TeX queries that are translated to combined Presentation-Content MathML.

  • Partial-match Retrieval with Structure-reflected Indices at the NTCIR-10 Math Task
    Hiroya Hagino and Hiroaki Saito
    [Pdf] [Table of Content]

    To attain fast and accurate responses in math formula search, an index should be prepared that holds the structural information of math expressions; this differs from the indexing used for full-text search. Although some previous research has taken this approach, the indices tend to become huge in memory. This paper proposes a partial-match retrieval system for math formulae with two kinds of indices. The first is an inverted index constructed from the path from each node to the root, viewing a formula as an expression tree. The other index is a table that stores the parent node and the text string for each node in the expression trees. A hundred thousand documents from the NTCIR-10 Math Task (formula search), containing 36 million math formulae, were used for evaluation; the number of nodes was about 291 million and the number of distinct paths in the inverted index was about 9 million. Experimental results showed that the search time grows linearly with the number of retrieved documents. Concretely, the search time ranges from 10 milliseconds to 1.2 seconds; simpler formulae tend to need more search time. A minimal sketch of the path index appears below.
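
    As a minimal sketch under assumptions, the snippet below builds the first kind of index: for each node of a formula's expression tree, the path from that node up to the root is recorded and inverted over formulae. The tree encoding and identifiers are illustrative.

    ```python
    # Inverted index over node-to-root paths of expression trees.
    from collections import defaultdict

    def node_paths(tree, path=()):
        """tree = (label, [children]); yield the root-ward path ending at each node."""
        label, children = tree
        here = (label,) + path          # reads from this node up to the root
        yield here
        for child in children:
            yield from node_paths(child, here)

    # x + y^2 as the expression tree (+ x (^ y 2))
    formula = ("+", [("x", []), ("^", [("y", []), ("2", [])])])

    index = defaultdict(set)
    for p in node_paths(formula):
        index[p].add("formula-1")

    # Partial match: any formula with a 'y' below '^' below '+'.
    print(index[("y", "^", "+")])       # -> {'formula-1'}
    ```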

  • Overview of the NTCIR-10 MedNLP Task
    Mizuki Morita, Yoshinobu Kano, Tomoko Ohkuma, Mai Miyabe and Eiji Aramaki
    [Pdf] [Table of Content]

    Recently, medical records are increasingly written on electronic media instead of on paper, thereby increasing the importance of information processing in medical fields. We have organized an NTCIR-10 pilot task for medical records. Our pilot task, MedNLP, comprises three tasks: (1) de-identification, (2) complaint and diagnosis, and (3) free. These tasks represent elemental technologies used to develop computational systems supporting widely diverse medical services. Development has yielded 22 systems for task (1), 15 systems for task (2), and 1 system for task (3). This report presents results of these systems, with discussion clarifying the issues to be resolved in medical NLP fields.

  • Improvement Recall of NTCIR-MedNLP Using Hierarchical Bayesian Language Models
    Ryo Fujii and Masashi Tada
    [Pdf] [Table of Content]

    The msiknowledge team participated in the de-identification and the complaint and diagnosis subtasks of the NTCIR-10 MedNLP pilot task. Terms in the complaint and diagnosis subtask may have several surface forms, and some do not appear in the sample data; likewise, the locations and personal names in the de-identification subtask are unique to each case, so simply building a dictionary from the sample cannot solve the problem. In this research, to avoid relying on external corpora or dictionaries, which are not a general solution, we aimed to build a system that learns to recognize terminology from a small tagged corpus by combining a tag-level language model and a character-level language model based on the HPYLM.

  • Clinical Entity Recognition Using Cost-Sensitive Structured Perceptron for NTCIR-10 MedNLP
    Shohei Higashiyama, Kazuhiro Seki and Kuniaki Uehara
    [Pdf] [Table of Content]

    This paper reports on our approach to the NTCIR-10 MedNLP task, which aims at identifying personal and medical information in Japanese clinical texts. We applied a machine learning (ML) algorithm for sequential labeling, specifically, structured perceptron, and defined a cost function for lowering misclassification cost. On the test set provided by the organizers, our approach achieved an F-score of 77.00 for the de-identification task and 79.14 for the complaint and diagnosis task.

  • NTCIR-10 MedNLP Task Baseline System
    Hiroto Imachi, Mizuki Morita and Eiji Aramaki
    [Pdf] [Table of Content]

    Natural language processing (NLP) technology that handles clinical, medical, and health records has drawn much attention, because such records could potentially be rich clinical resources. This paper describes an NLP system that extracts two kinds of information from clinical documents in Japanese, developed as a baseline system for the NTCIR-10 MedNLP pilot task. Since our system consists only of open-source tools and resources, it can be freely used by anyone. The experimental results showed reasonably good performance in both subtasks of the NTCIR-10 MedNLP pilot task: (1) the de-identification task (precision: 86.10%, recall: 74.54%, F-measure: 79.9%) and (2) the complaint and diagnosis task (precision: 87.37%, recall: 71.86%, F-measure: 78.86%). These results demonstrate the basic feasibility of our simple system.

  • HCRL at NTCIR-10 MedNLP Task
    Osamu Imaichi, Toshihiko Yanase and Yoshiki Niwa
    [Pdf] [Table of Content]

    This year's MedNLP [1] has two tasks: de-identification, and complaint and diagnosis. We tested both machine learning based methods and an ad-hoc rule-based method on the two tasks. For the de-identification task, the rule-based method obtained slightly higher results, while for the complaint and diagnosis task, the machine learning based method had much higher recall and overall scores. These results suggest that the methods should be applied selectively depending on the nature of the information to be extracted, that is to say, whether or not it can be captured by simple patterns.

  • Finding Specific Medical Terms Using the Life Science Dictionary for MedNLP
    Shuji Kaneko, Nobuyuki Fujita and Hiroshi Ohtake
    [Pdf] [Table of Content]

    We have been developing an English-Japanese thesaurus of medical terms for the past 20 years. The thesaurus is compatible with MeSH (Medical Subject Headings, developed by the National Library of Medicine, USA) and contains approximately 30,000 headings with 200,000 synonyms (consisting of names of anatomical concepts, biological organisms, chemical compounds, methods, diseases, and symptoms). In this study, we aimed to extract as many medical terms as possible from the test data using a simple longest-matching Perl script. After converting the given UTF-8 text to EUC format, the matching process required only 2 minutes, including loading a 10 MB dictionary into memory, on a desktop computer (Apple Mac Pro). From the 0.1 MB test document, 2,569 terms (including English spellings) were tagged and displayed in color HTML format. In the case of disease and symptom names, it was found that 893 terms had a number of formal errors and omissions; furthermore, the matching process showed certain limitations in handling ambiguous abbreviations and misspelled words. In spite of these drawbacks, the simple longest-matching strategy may prove useful in the preprocessing of medical reports. A minimal sketch of longest-match tagging follows.
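
    The snippet below is a minimal sketch of the greedy longest-match dictionary tagging described above, reimplemented in Python; the dictionary entries are illustrative, not entries of the Life Science Dictionary.

    ```python
    # Greedy longest-match tagging against a term dictionary.

    def longest_match_tag(text: str, dictionary: set, max_len: int = 30):
        i, out = 0, []
        while i < len(text):
            for n in range(min(max_len, len(text) - i), 0, -1):
                if text[i:i + n] in dictionary:     # prefer the longest entry
                    out.append(("TERM", text[i:i + n]))
                    i += n
                    break
            else:
                out.append(("O", text[i]))          # no entry starts here
                i += 1
        return out

    terms = {"diabetes mellitus", "diabetes", "hypertension"}
    print(longest_match_tag("diabetes mellitus and hypertension", terms))
    ```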

  • An Trial Report to NTCIR10 MedNLP: Extracting Medical Diagnostic Term
    Kota Kanno, Kazuyoshi Osanai and Kyoji Umemura
    [Pdf] [Table of Content]

    This paper explains our approach to the NTCIR-10 MedNLP [1] tasks and the kinds of problems we encountered. We selected the term extraction task since we have some experience with keyword extraction [2]. Since it is hard to build an accurate dictionary or lexicon of medical terms, we aimed to use machine learning with a large amount of roughly tagged medical text as training data. However, we were unable to prepare the training data, and thus unable to make our system work.

  • Identifying Symptoms and Diseases in MedNLP Japanese Materials Using Chinese Resources
    Lun-Wei Ku, Edward T.-H. Chu, Cheng-Wei Sun and Wan-Lun Li
    [Pdf] [Table of Content]

    In this paper, we describe the Sinica-Yuntech system (team ID: SinicaNLP) for the NTCIR-10 MedNLP task. The materials of the MedNLP task are in Japanese; however, having only Chinese resources and knowledge, we needed to translate these materials into Chinese. Two preprocessing approaches, differing in the timing of translation, were taken: one was to translate the Japanese sentences into Chinese and then perform segmentation and part-of-speech tagging on the Chinese sentences; the other was to segment and part-of-speech tag the Japanese sentences and then translate the resulting words. After obtaining words and their parts of speech, we identified symptoms and diseases by a vocabulary matching approach. Internet search results and part-of-speech patterns were also utilized to recognize out-of-vocabulary symptoms. After recognizing the targets in Chinese, a reverse translation was performed in order to label the original Japanese materials. We merged the tags from vocabulary matching, Internet searching, and pattern mapping to obtain our best run: an F-score of 53.88 and an accuracy of 91.46.

  • NECLA at the Medical Natural Language Processing Pilot Task (MedNLP)
    Pierre-François Laquerre and Christopher Malon
    [Pdf] [Table of Content]

    This paper gives an overview of NECLA's submitted systems for the De-Identification and Complaint & Diagnosis subtasks of the Medical Natural Language Processing Pilot Task (MedNLP) [5]. Our systems combine features derived from Part of Speech (POS) tags, a domain-specific dictionary, the Unified Medical Language System (UMLS) metathesaurus and semantic network, and a small set of heuristics based on trigger-words and polarity propagation through sentence dependency parse trees.

  • UT-FX at NTCIR-10 MedNLP: Incorporating Medical Knowledge to Enhance Medical Information Extraction
    Yasuhide Miura, Tomoko Ohkuma, Hiroshi Masuichi, Emiko Yamada Shinohara, Eiji Aramaki and Kazuhiko Ohe
    [Pdf] [Table of Content]

    The UT-FX team participated in the de-identification subtask and the complaint and diagnosis subtask of the NTCIR-10 MedNLP pilot task. This report describes our approach to solving the two subtasks.

  • Medical Information Extracting System by Bootstrapping of NTTDRDH at NTCIR-10 MedNLP Task
    Yuji Nomura, Takashi Suenaga, Daisuke Satoh, Megumi Ohki and Toru Takaki
    [Pdf] [Table of Content]

    We participated in the complaint and diagnosis task of MedNLP at NTCIR-10. We extracted complaint/diagnosis words using a hybrid approach combining bootstrapping and pattern matching against a medical term dictionary. Since the extracted words might cover only part of a complaint or diagnosis expression, our system concatenated them with surrounding words by heuristic rules to determine the final complaint or diagnosis expressions. Our system also estimated the modality attribute of each extracted complaint/diagnosis by heuristic rules.

  • Complaint and Diagnosis Extraction System Utilizing Rule-based Term Extraction System
    Koichi Takeuchi, Shozaburo Minamoto and Motoki Yamasaki
    [Pdf] [Table of Content]

    N/A

  • A Simple Approach to NTCIR-10 MedNLP task
    Yuka Tateisi and Takashi Okumura
    [Pdf] [Table of Content]

    For the MedNLP complaint and diagnosis subtask, we tried a simple dictionary-matching method using MeCab and achieved a 61.10% F-score in the official evaluation. Our method can evaluate the coverage of terminology data in a simple and inexpensive way.