
NTCIR-9 Abstract

  • Preface from NTCIR-9 General Chairs
    Noriko Kando, Tsuneaki Kato and Eiichiro Sumita
    [Pdf] [Table of Contents]

  • Overview of NTCIR-9
    Tetsuya Sakai and Hideo Joho
    [Pdf] [Table of Contents]

    This is an overview of NTCIR-9, the Ninth NTCIR Workshop. It touches upon a brief history of NTCIR, introduces the tasks run at NTCIR-9, and reports on some statistics across these tasks. For details on the individual tasks, we refer the reader to the task overview papers and the participants' papers.

  • Natural Language Understanding, Semantic-based Information Retrieval and Knowledge Management
    Jun'ichi Tsujii
    [Pdf] [Table of Contents]

    Although shared test collections have been around since the 1960s, the arrival of TREC in 1992 was a major boost to information retrieval research. It spawned a wide range of evaluation campaigns around the world, including NTCIR, CLEF, FIRE, INEX, etc. However, in recent years, the number of papers at major conferences, such as SIGIR or CIKM, that report research using the collections produced by these campaigns appears to be declining. More papers are describing research using so-called private data sets. So, what will the evaluation campaigns look like 20 years from now? Is the rise of private data a threat to the scientific integrity of IR research, or a valuable diversification of research focus?

  • NTCIR9-GeoTime Overview - Evaluating Geographic and Temporal Search: Round 2
    Fredric Gey, Ray R. Larson, Jorge Machado and Masaharu Yoshioka
    [Pdf] [Table of Contents]

    GeoTime for the NTCIR-9 Workshop is the second evaluation of Geographic and Temporal Information Retrieval, called "NTCIR GeoTime". The focus of this task is on search with geographic and temporal constraints. This overview describes the data collections (Japanese and English news stories), topic development, assessment results and lessons learned from this second NTCIR GeoTime task, which combines GIR with time-based search to find specific events in a multilingual collection. Six teams submitted Japanese runs and nine teams submitted English runs. Three teams participated in both Japanese and English.

  • The Use of Inference Network Models in NTCIR-9 GeoTime
    Christopher G. Harris
    [Pdf] [Table of Contents]

    We describe our approach to identifying documents in a specific collection that can sufficiently answer a series of geo-temporal queries. This year, we submitted four runs for the NTCIR-9 GeoTime task, using the Indri search engine, on a collection of English-language newspaper articles. Our four submitted runs achieved nDCG scores ranging from 0.5275 to 0.6254 and MAP scores ranging from 0.4164 to 0.4990 across twenty-five separate geo-temporal queries.
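
    The two measures quoted above can be sketched as follows; this is a minimal illustration of standard nDCG and per-query average precision given graded relevance judgments, not the official evaluation tooling.

        import math

        def ndcg(ranking, qrels, k=1000):
            # qrels: doc_id -> graded relevance (0 = not relevant)
            dcg = sum(qrels.get(d, 0) / math.log2(i + 2)
                      for i, d in enumerate(ranking[:k]))
            ideal = sorted(qrels.values(), reverse=True)[:k]
            idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
            return dcg / idcg if idcg > 0 else 0.0

        def average_precision(ranking, qrels):
            # MAP is the mean of this value over all queries
            hits, total = 0, 0.0
            for i, d in enumerate(ranking):
                if qrels.get(d, 0) > 0:
                    hits += 1
                    total += hits / (i + 1)
            n_rel = sum(1 for g in qrels.values() if g > 0)
            return total / n_rel if n_rel else 0.0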

  • Geo-temporal Information Retrieval Based on Semantic Role Labeling and Rank Aggregation
    Yoonjae Jeong, Gwan Jang, Kyung-min Kim and Sung-Hyon Myaeng
    [Pdf] [Table of Contents]

    The purpose of this study is to improve the effectiveness of geo-temporal information retrieval with semantic role labeling (SRL) for sentences in topics and documents, focusing especially on locational and temporal facets. We propose a combination of four language models (LM) representing different semantic roles and scopes of models for documents, together with a rank aggregation method. The rationale is based on the observation that sentence-based language models using SRL retrieved relevant documents that are not ranked high by a general LM approach. Although we could not obtain a detailed comparison between the general model and our proposed method from NTCIR-9, we obtained a meaningful improvement on the NTCIR-8 GeoTime corpus. Given that the current result is based on our initial effort under time constraints, we believe that further exploration of the idea of using SRL would yield a significant improvement in geo-temporal information retrieval.
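
    The abstract does not specify the aggregation method, but as one hedged illustration, reciprocal rank fusion merges several ranked lists (e.g. the four LM runs) into one; the run names in the comment are hypothetical.

        def reciprocal_rank_fusion(rankings, k=60):
            """rankings: list of ranked doc-id lists; returns one fused ranking."""
            scores = {}
            for ranking in rankings:
                for rank, doc in enumerate(ranking, start=1):
                    scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
            return sorted(scores, key=scores.get, reverse=True)

        # fused = reciprocal_rank_fusion([doc_lm_run, sent_lm_run, srl_loc_run, srl_time_run])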

  • Modification of Vocabulary-based Re-ranking for Geographic and Temporal Searching at NTCIR GeoTime Task
    Kazuaki Kishida and Ikuko Matsushita
    [Pdf] [Table of Contents]

    This paper reports on experiments in the NTCIR-9 GeoTime task performed by a research group at the School of Library and Information Science at Keio University (KOLIS), which explored techniques for searching a Japanese document collection for requests involving geographic and temporal information. A special re-ranking component for enhancing the performance of geographic and temporal searches was added to the KOLIS system, in which standard Okapi BM25 and probabilistic pseudo-relevance feedback (PRF) are implemented. That is, at the first stage, a list of documents relevant to a given topic is obtained by standard IR techniques, and at the second stage, the list is re-ranked by increasing the scores of documents that include geographic and temporal terms. In particular, the score of each document containing the syntactic pattern "geographic or temporal term + で" was augmented to improve search performance, where "で" is a Japanese functional word meaning "at" or "in". In the old version of re-ranking used in the earlier NTCIR-8 workshop, only the frequency of occurrence of geographic and temporal terms was taken into consideration. In this experiment on Japanese monolingual (JA-JA) retrieval and English-to-Japanese bilingual (EN-JA) retrieval, the search runs jointly using the pattern-based re-ranking and PRF showed the highest performance. However, the pattern-based ranking could not bring an explicit improvement in comparison with the ranking technique used at NTCIR-8.
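
    A minimal sketch of the two-stage scheme described above (the shape is assumed; this is not the KOLIS code): BM25+PRF yields a first-stage list, and documents matching the "geographic or temporal term + で" pattern have their scores boosted before re-sorting. The pattern terms and boost factor below are illustrative only.

        import re

        # hypothetical pattern: a known geographic/temporal term followed by で
        PATTERN = re.compile("(東京|昨年)で")   # illustrative terms only

        def rerank(first_stage, boost=1.2):
            """first_stage: list of (doc_id, bm25_score, text) tuples."""
            rescored = []
            for doc_id, score, text in first_stage:
                if PATTERN.search(text):
                    score *= boost      # augment documents matching the pattern
                rescored.append((doc_id, score))
            return sorted(rescored, key=lambda x: x[1], reverse=True)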

  • Probabilistic Text Retrieval for NTCIR9 GeoTime
    Ray R. Larson
    [Pdf] [Table of Contents]

    For the NTCIR-9 Workshop UC Berkeley participated only in the GeoTime track. For our initial experiments we used only the Logistic Regression ranking with blind feedback approach that we also used in NTCIR-8. We participated in both English and Japanese monolingual and bilingual search tasks. For all Japanese topics we preprocessed the text using the ChaSen morphological analyzer for term segmentation. For these submitted runs we did not do any special purpose geographic or temporal processing. This brief paper describes the submitted runs and the methods used for them.

  • Geo-Temporal retrieval filtering versus answer resolution using Wikipedia
    Jorge Machado, José Borbinha and Bruno Martins
    [Pdf] [Table of Contents]

    We describe an evaluation experiment on geo-temporal document retrieval created for the GeoTime evaluation task of NTCIR 2011, and the retrieval techniques developed to accomplish this task. We describe the collections used in the workshop, detailing their composition in terms of geographic and temporal expressions. The first contribution of this work is the collections' statistics, which by themselves reveal the relevance of this subject. Our parsing techniques found millions of references related to the relevance dimensions of time and space. Those references were used to index the documents in order to score them along those dimensions. We also introduce a technique to find extra references in Wikipedia using the Google Search Service and the same parsers used on the collections. Those references were used in four different scenarios depending on the queries: first, we used the references found in topics to filter out documents without geographic or temporal expressions, and used pseudo-relevance feedback to expand topics with no references using the indexes created for places and dates; in another approach, we used the Wikipedia references to filter documents from the result set; and in a last approach, we expanded all topics with the Wikipedia references. Finally, we used another technique based on metric distances computed from coordinates (latitudes and longitudes) and dates in order to create a scope for documents and topics, and ranked them according to the distance between each other.
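
    The last technique lends itself to a short sketch: score a document against a topic by mixing a geodesic distance between their coordinates with a distance between their dates. The normalisation and weighting below are assumptions, as the abstract does not give the exact formula.

        import math

        def haversine_km(lat1, lon1, lat2, lon2):
            r = 6371.0  # mean Earth radius in km
            p1, p2 = math.radians(lat1), math.radians(lat2)
            dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
            a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
            return 2 * r * math.asin(math.sqrt(a))

        def geo_temporal_distance(doc, topic, alpha=0.5):
            # doc and topic carry "lat"/"lon" floats and a datetime.date "date"
            d_geo = haversine_km(doc["lat"], doc["lon"], topic["lat"], topic["lon"])
            d_time = abs((doc["date"] - topic["date"]).days)
            # normalise each dimension before mixing (assumed scheme)
            return alpha * d_geo / 20000.0 + (1 - alpha) * d_time / 365.0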

  • SINAI at NTCIR-9 GeoTime: a filtering and reranking approach based solely on geographical entities
    José M. Perea-Ortega, Miguel Ángel García-Cumbreras, Manuel García-Vega and L. Alfonso Ureña-López
    [Pdf] [Table of Contents]

    Geographic Information Retrieval (GIR) is an active and growing research area that focuses on the retrieval of textual documents according to geographical criteria of relevance. In recent years, the IR research community has also paid particular attention to IR systems that take temporal constraints into account. Temporal Information Retrieval (TIR) is a recent field which combines usual IR techniques with new ones that address the temporal dimension of relevance. The NTCIR-GeoTime task was created to evaluate IR systems that combine geographical and temporal constraints. In this work, we propose a filtering and reranking function for this type of system based on the retrieval status value calculated by the IR engine and the geographical similarity between the document and the query. Since we only considered geographical criteria, the obtained results show that the proposed function does not improve on the baseline experiment applying solely an IR approach. Therefore, it is necessary to improve the proposed reranking function by taking into account the temporal entities found in the document collection and topics.
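
    A hedged sketch of the shape of such a function (the exact formula is not given in the abstract): drop documents that share no geographical entities with the topic, then interpolate the engine's retrieval status value (RSV) with a Jaccard-style geographical similarity; the similarity measure and the weight lam are assumptions.

        def filter_and_rerank(results, topic_geo, lam=0.7):
            """results: list of (doc_id, rsv, doc_geo_entities) tuples;
            topic_geo: set of geographical entities found in the topic."""
            kept = [(d, rsv, geo) for d, rsv, geo in results if geo & topic_geo]
            def geo_sim(geo):
                return len(geo & topic_geo) / len(geo | topic_geo)  # Jaccard
            scored = [(d, lam * rsv + (1 - lam) * geo_sim(geo))
                      for d, rsv, geo in kept]
            return sorted(scored, key=lambda x: x[1], reverse=True)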

  • University of Alicante at NTCIR-9 GeoTime
    Fernando S. Peregrino, David Tomás and Fernando Llopis Pascual
    [Pdf] [Table of Contents]

    In this paper we present a complete system for the treatment of the geographical dimension in text and its application to information retrieval. This system has been evaluated in the GeoTime task of the 9th NTCIR workshop, making it possible to compare it with other current approaches to the topic. In order to participate in this task, we added the temporal dimension to our GIR system. The system proposed here has a modular architecture so that features can easily be added or modified. In developing this system we followed a QA-based approach to improve system performance.

  • NTCIR-9 GeoTime at Osaka Kyoiku University - Toward Automatic Extraction of Place/Time Terms -
    Takashi Sato
    [Pdf] [Table of Contents]

    Our approach to NTCIR-9 GeoTime was to obtain place/time information about topics from Wikipedia and Google using query terms extracted from the topics. Adding this information to the query terms, we retrieved documents using a <TEXT> tag index and scored them. In addition, we compared the <DATE> tag of retrieved documents with the time information, weighted the documents' scores accordingly, and re-ranked them. Although automating the extraction of place/time information remains future work, the validity of the method was confirmed by comparing evaluation results with runs that do not use this place/time information.

  • GeoTime Retrieval through Passage-based Learning to Rank
    Xiao-Lin Wang, Hai Zhao and Bao-Liang Lu
    [Pdf] [Table of Contents]

    The NTCIR-9 GeoTime task is to retrieve documents to answer such questions as when and where certain events happened. In this paper we propose a Passage-Based Learning to Rank (PGLR) method to address this task. The proposed method recognizes texts both strongly related to the target topics and containing geographic and temporal expressions. The implemented system provides more accurate search results than a system without PGLR. Performance, according to the official evaluation, is average among submitted systems.

  • RMIT and Gunma University at NTCIR-9 GeoTime Task
    Michiko Yasukawa, J. Shane Culpepper, Falk Scholer and Matthias Petri
    [Pdf] [Revised Paper Pdf 20111222] [Table of Contents]

    In this report, we describe our experimental approach for the NTCIR-9 GeoTime task. For our experiments, we use our experimental search engine, NeWT. NeWT is a ranked self-index capable of supporting multiple languages by deferring linguistic decisions until query time. To our knowledge, this is the first application of ranked self-indexing to a multilingual information retrieval task at NTCIR.

  • ABRIR at NTCIR-9 GeoTime Task Usage of Wikipedia and GeoNames for Handling Named Entity Information
    Masaharu Yoshioka
    [Pdf] [Table of Contents]

    In the previous NTCIR8-GeoTime task, ABRIR (Appropriate Boolean query Reformulation for Information Retrieval) proved to be one of the most effective systems for retrieving documents with Geographic and Temporal constraints. However, failure analysis showed that the identification of named entities and relationships between these entities and the query is important to improving the quality of the system. In this paper, we propose to use Wikipedia and GeoNames as resources for extracting knowledge about named entities. We also modify our system to use such information.

  • Overview of the NTCIR-9 INTENT Task
    Ruihua Song, Min Zhang, Tetsuya Sakai, Makoto P. Kato, Yiqun Liu, Miho Sugimoto, Qinglei Wang and Naoki Orii
    [Pdf] [Table of Contents]

    This is an overview of the NTCIR-9 INTENT task, which comprises the Subtopic Mining and the Document Ranking subtasks. The INTENT task attracted participating teams from seven different countries/regions: 16 teams for Subtopic Mining and 8 teams for Document Ranking. The Subtopic Mining subtask received 42 Chinese runs and 14 Japanese runs; the Document Ranking subtask received 24 Chinese runs and 18 Japanese runs. We describe the subtasks, data and evaluation methods, and then report on the results.
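
    The measures used throughout the INTENT papers below (I-rec, D-nDCG and their combination D#-nDCG) can be sketched from their published definitions; this illustration is not the official evaluation tool, and the default gamma = 0.5 mixing weight is an assumption.

        import math

        def d_sharp_ndcg(ranking, intent_probs, gains, k=10, gamma=0.5):
            """intent_probs: {intent: P(i|q)}; gains: {doc: {intent: gain}}."""
            def global_gain(d):
                return sum(p * gains.get(d, {}).get(i, 0)
                           for i, p in intent_probs.items())
            dcg = sum(global_gain(d) / math.log2(r + 2)
                      for r, d in enumerate(ranking[:k]))
            ideal = sorted((global_gain(d) for d in gains), reverse=True)[:k]
            idcg = sum(g / math.log2(r + 2) for r, g in enumerate(ideal))
            d_ndcg = dcg / idcg if idcg else 0.0
            covered = {i for d in ranking[:k]
                       for i in gains.get(d, {}) if gains[d][i] > 0}
            i_rec = len(covered) / len(intent_probs)   # intent recall
            return gamma * i_rec + (1 - gamma) * d_ndcg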

  • ICTIR Subtopic Mining System at NTCIR-9 INTENT Task
    Shuai Zhang, Kai Lu and Bin Wang
    [Pdf] [Table of Contents]

    This paper describes the approaches and results of our Chinese subtopic mining system for the NTCIR-9 INTENT task. In this system, we first find related queries in query logs, then group them into clusters using a frequent term-set based clustering algorithm. Finally, the central query of each cluster is used to represent the subtopic of that cluster. Encyclopedias and commercial search engines are also used to enhance mining effectiveness. The evaluation results of our runs show that our approaches perform well. Among the five runs we submitted, ICTIR-S-C-1 ranked within the top five in terms of D#-nDCG for l = 10, 20, 30 and outperformed the others in terms of I-rec.

  • University of Glasgow at the NTCIR-9 Intent task: Experiments with Terrier on Subtopic Mining and Document Ranking
    Rodrygo L. T. Santos, Craig Macdonald and Iadh Ounis
    [Pdf] [Table of Contents]

    We describe our participation in the subtopic mining and document ranking subtasks of the NTCIR-9 Intent task, for both Chinese and Japanese languages. In the subtopic mining subtask, we experiment with a novel data-driven approach for ranking reformulations of an ambiguous query. In the document ranking subtask, we deploy our state-of-the-art xQuAD framework for search result diversification.
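
    The published xQuAD objective greedily selects the next document d maximising (1 - lambda) P(d|q) + lambda sum_i P(i|q) P(d|i) prod_{d' in S} (1 - P(d'|i)); a compact sketch, simplified from the original formulation:

        from math import prod

        def xquad(P_dq, P_iq, P_di, k=10, lam=0.5):
            """P_dq: {doc: P(d|q)}; P_iq: {intent: P(i|q)};
            P_di: {(doc, intent): P(d|i)}."""
            selected, remaining = [], set(P_dq)
            while remaining and len(selected) < k:
                def score(d):
                    div = sum(P_iq[i] * P_di.get((d, i), 0.0)
                              * prod(1 - P_di.get((s, i), 0.0) for s in selected)
                              for i in P_iq)
                    return (1 - lam) * P_dq[d] + lam * div
                best = max(remaining, key=score)   # greedy selection step
                selected.append(best)
                remaining.remove(best)
            return selected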

  • Microsoft Research Asia at the NTCIR-9 Intent Task
    Jialong Han, Qinglei Wang, Naoki Orii, Zhicheng Dou, Tetsuya Sakai and Ruihua Song
    [Pdf] [Table of Contents]

    In NTCIR-9, we participated in the Intent task, including both the Subtopic Mining subtask and the Document Ranking subtask. In the Subtopic Mining subtask, we mine subtopics from query logs and the top results of the queries, and rank them based on their relevance to the query and the similarity between them. In the Document Ranking subtask, we diversify the top search results using the mined subtopics based on a general multi-dimensional diversification framework. Experimental results show that our best Chinese subtopic mining run is ranked No. 2 of all 42 runs in terms of D#-nDCG@10. Our Chinese document ranking runs generally outperform other runs in terms of I-rec, and our best Chinese document ranking run is ranked No. 4 of all 24 runs in terms of D#-nDCG@10. Our Japanese document ranking runs perform the best in terms of both D-nDCG and D#-nDCG.

  • THUIR at NTCIR-9 INTENT Task
    Yufei Xue, Fei Chen, Tong Zhu, Chao Wang, Zhichao Li, Yiqun Liu, Min Zhang, Yijiang Jin and Shaoping Ma
    [Pdf] [Table of Contents]

    This is the first year the IR group of Tsinghua University (THUIR) has participated in NTCIR. We registered for the INTENT task and focused on the Chinese topics of the Subtopic Mining and Document Ranking subtasks. In our experiments, we tried to mine subtopics from different resources, namely query recommendations, Wikipedia, and a query-URL bipartite graph constructed from clickthrough data. We also developed methods to re-rank the subtopics and remove duplicates using query logs and search-result snippets from search engines. In the Document Ranking subtask, methods used to diversify English documents, such as HITS, Novelty-Result Selection and Document Duplication Elimination, were applied to validate their effectiveness on Chinese pages. Based on the new metric D#-nDCG, we propose a Document-Diversification algorithm to select the documents retrieved for the subtopics mined in the Subtopic Mining subtask; user browsing logs are also leveraged to re-rank these selected results.

  • NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task
    Chieh-Jen Wang, Yung-Wei Lin, Ming-Feng Tsai and Hsin-Hsi Chen
    [Pdf] [Table of Contents]

    Users express their information needs in terms of queries to find relevant documents on the web. However, users' queries are usually short, so search engines may not have enough information to determine their exact intents. How to diversify web search results to cover users' possible intents as widely as possible is an important research issue. In this paper, we propose several subtopic mining approaches and show how to diversify the search results using the mined subtopics. For the Subtopic Mining subtask, we explore various algorithms that mine subtopics of a query from the enormous number of documents on the web. For the Document Ranking subtask, we propose re-ranking algorithms that keep the top-ranked results covering as many popular subtopics as possible; these algorithms apply the mined subtopics to diversify the search results. The best performance of our system achieves an I-rec@10 (intent recall) of 0.4683, a D-nDCG@10 of 0.6546 and a D#-nDCG@10 of 0.5615 on the Chinese Subtopic Mining subtask of the NTCIR-9 Intent task, and an I-rec@10 of 0.6180, a D-nDCG@10 of 0.3314 and a D#-nDCG@10 of 0.4747 on the Chinese Document Ranking subtask. In addition, the best performance of our system achieves an I-rec@10 of 0.4442, a D-nDCG@10 of 0.4244 and a D#-nDCG@10 of 0.4343 on the Japanese Subtopic Mining subtask, and an I-rec@10 of 0.5975, a D-nDCG@10 of 0.2953 and a D#-nDCG@10 of 0.4464 on the Japanese Document Ranking subtask.

  • The KLEfs Subtopic Mining System for the NTCIR-9 INTENT Task
    Se-Jong Kim, Hwidong Na and Jong-Hyeok Lee
    [Pdf] [Table of Contents]

    This paper describes our subtopic mining system for the NTCIR-9 INTENT task. We propose a method that mines subtopics for each topic using only the given Chinese query log. Our method finds possible subtopics and estimates their scores based on interest and clearness. In the Chinese subtopic mining, our best values of D#-nDCG were 0.3823 for l = 10, 0.4413 for l = 20 and 0.4241 for l = 30.

  • Qualifier Mining for NTCIR-INTENT
    Haitao Yu, Fuji Ren and Song Liu
    [Pdf] [Table of Contents]

    We participated in the Subtopic Mining subtask of the NTCIR-9 INTENT task. Query logs are used as the primary resource to mine latent subtopics. Through analysis of the query log, we observed that queries describing similar information needs use a similar group of qualifiers, which may also frequently occur together within queries. We introduce the concept of a qualifier graph for subtopic mining. To solve the sparseness problem, search snippets returned by web search engines are used. The experimental results show that it is reasonable to make use of qualifiers to mine latent subtopics.

  • RMIT and Gunma University at NTCIR-9 Intent Task
    Michiko Yasukawa, J. Shane Culpepper, Falk Scholer and Matthias Petri
    [Pdf] [Revised Paper Pdf 20111222] [Table of Contents]

    In this report, we describe our experimental results for the NTCIR-9 INTENT task. For our experiments, we use our experimental search engine, NeWT. NeWT is a ranked self-index capable of supporting multiple languages by deferring linguistic decisions until query time. To our knowledge, this is the first Information Retrieval task on the ClueWeb09-JA collection performed entirely with ranked self-indexes.

  • Redundancy Removal to Selectively Diversify Information Retrieval Results
    Xiao-Lin Wang, Hai Zhao and Bao-Liang Lu
    [Pdf] [Table of Contents]

    The document ranking subtask of the NTCIR-9 Intent track is to rank retrieved documents to better satisfy users' multiple intents. We propose a redundancy removal algorithm to approach this task. The implemented system achieves average performance according to the official evaluation. Furthermore, analysis of the evaluation results indicates that the proposed algorithm does improve the diversity of the retrieval results but slightly hurts relevance.

  • UWaterloo at NTCIR-9: Intent discovery with anchor text
    John A. Akinyemi and Charles L.A. Clarke
    [Pdf] [Table of Contents]

    This paper describes our submission to the Intent Discovery task at NTCIR-9. By treating the source and target documents of anchor texts as nodes and the anchor texts between them as edges, we built a documents-anchors graph representation of the corpus. We extracted and indexed anchor-link information from the provided SogouT corpus. Given the queries, anchor texts are retrieved from the index; other anchor texts that link to the target documents of the retrieved anchor texts are also retrieved. All the anchor texts are then ranked and grouped to eliminate duplicates and near-duplicates.
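
    A hedged sketch of the documents-anchors graph idea above: index (source, anchor text, target) triples, retrieve anchors matching the query, then pull in the other anchors pointing at the same target documents. The field layout and the substring match are illustrative assumptions.

        from collections import defaultdict

        def related_anchors(query, links):
            """links: iterable of (source_doc, anchor_text, target_doc) triples."""
            by_target = defaultdict(set)
            hit_targets = set()
            for src, text, tgt in links:
                by_target[tgt].add(text)      # edges grouped by target node
                if query in text:             # anchors matching the query
                    hit_targets.add(tgt)
            # all anchors that point at the same targets as the matching ones
            return {a for t in hit_targets for a in by_target[t]}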

  • HITSCIR System in NTCIR-9 Subtopic Mining Task
    Wei Song, Yu Zhang, Handong Gao, Ting Liu and Sheng Li
    [Pdf] [Table of Contents]

    Web queries tend to have multiple user intents. Automatically identifying query intents benefits search result navigation, search result diversity and personalized search. This paper presents the HITSCIR system in the NTCIR-9 subtopic mining task. First, the system collects query intent candidates from multiple resources. Second, the Affinity Propagation algorithm is applied to cluster these candidates; it determines the number of clusters automatically, and each cluster has a representative intent candidate called an exemplar. Prior preferences and heuristic pair-wise preferences can be incorporated into the clustering framework. Finally, the exemplars are ranked by considering their own quality and the popularity of the clusters they represent. The NTCIR-9 evaluation results show that our system can effectively mine query intents with good relevance, diversity and readability.
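
    Affinity Propagation, as noted above, chooses the number of clusters automatically and yields an exemplar per cluster. A minimal scikit-learn sketch over TF-IDF vectors of the intent candidates (the vectorisation choice is an assumption, not the authors' setup):

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.cluster import AffinityPropagation

        def cluster_candidates(candidates):
            """candidates: list of intent-candidate strings."""
            X = TfidfVectorizer().fit_transform(candidates)
            ap = AffinityPropagation().fit(X.toarray())
            # one exemplar (representative candidate) per discovered cluster
            exemplars = [candidates[i] for i in ap.cluster_centers_indices_]
            return exemplars, ap.labels_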

  • The Report on Subtopic Mining and Document Ranking of NTCIR-9 Intent Task
    Wei-Lun Xiao, Shih-Hung Wu, Liang-Pu Chen and Tsun Ku
    [Pdf] [Table of Contents]

    In this paper we report our approach and results as a participant in the NTCIR-9 INTENT task, a new NTCIR task consisting of two subtasks. (1) Subtopic Mining subtask: given a query, a system lists all possible subtopics that might cover users' different intents; our approach mines the query log to find subtopic candidates and ranks them according to the frequency of each candidate. (2) Document Ranking subtask: given a query, a system returns diversified document URLs that might cover users' diversified intents; since the document set is too large for a single PC, our approach is to construct a distributed framework that searches a partial document set on one PC at a time and merges the partial search results to obtain the final ranking list.

  • ISCAS at Subtopic Mining Task in NTCIR9
    Xue Jiang, Xianpei Han and Le Sun
    [Pdf] [Table of Contents]

    In this paper, we describe our work on the Subtopic Mining subtask of NTCIR-9 for Simplified Chinese. To find possible subtopics of a specific query, we select candidates that are lexically similar to the original query from related queries recorded in the query log, titles of search results provided by Google and Baidu, and the catalog of the corresponding entry in Baidu Encyclopedia; we then apply the k-means algorithm to cluster these candidate queries with different k (k = 5, 10), and rank them with consideration of similarities and clusters.

  • Mining Search Subtopics from Query Logs
    Dan Zhu, Jianwei Cui, Jun He, Xiaoyong Du and Hongyan Liu
    [Pdf] [Table of Contents]

    Web queries are usually short and ambiguous. Subtopic mining plays an important role in understanding a user's search intent and has attracted many researchers' attention. In this paper, we describe our approach to identifying users' intents from query logs, for the Chinese Subtopic Mining subtask of the NTCIR-9 Intent task. We extract queries that are semantically related to the original query from the query log, measure their similarities based on their relationships with URLs, and cluster them into groups representing different subtopics. In the experiment section, we show the results of our method as evaluated by the organizers, together with a case study for one of the NTCIR-9 Intent queries. The results show that most of the found intents are distinct. The proposed method is easy to implement in real applications and can be computed quickly.

  • HIT2 Joint NLP Lab at the NTCIR-9 Intent Task
    Dongqing Xiao, Haoliang Qi, Jingbin Gao, Zhongyuan Han, Muyun Yang and Sheng Li
    [Pdf] [Table of Contents]

    We report our systems and experiments in the INTENT task of NTCIR-9. The research aims at evaluating the effectiveness of the proposed methods for query intent mining and result diversification in web search. In the Subtopic Mining subtask, we combine candidates extracted from search logs and Wikipedia; an improvement can be seen after incorporating query intents from different resources. In the Document Ranking subtask, greedy algorithms are used to select documents with high diversity scores and return a re-ranked list of diversified documents based on query subtopics. The experimental results show that the method that combines subtopic results directly outperforms MMR.

  • Overview of NTCIR-9 1CLICK
    Tetsuya Sakai, Makoto P. Kato and Young-In Song
    [Pdf] [Table of Contents]

    This is an overview of the NTCIR-9 One Click Access (1CLICK) "pilot" task. In contrast to traditional Web search, which requires the user to scan a ranked list of URLs, visit individual URLs and gather the pieces of information they need, 1CLICK aims to satisfy the user with a single textual output, immediately after the user clicks on the SEARCH button. Systems are expected to present important pieces of information first and to minimise the amount of text the user has to read. As our first trial, we designed a Japanese 1CLICK task with a nugget-based test collection. Three research teams participated in the task, using diverse approaches (information extraction, passage retrieval and summarisation). Our results suggest that the 1CLICK evaluation framework is a useful complement to traditional 10-blue-link evaluation. We therefore hope to expand the language scope in the next round, at NTCIR-10.

  • Information Extraction based Approach for the NTCIR-9 1CLICK Task
    Makoto P. Kato, Meng Zhao, Kosetsu Tsukuda, Yoshiyuki Shoji, Takehiro Yamamoto, Hiroaki Ohshima and Katsumi Tanaka
    [Pdf] [Table of Contents]

    We describe a framework incorporating several information extraction methods for the NTCIR-9 One Click Access Task. Our framework first classifies a given query into pre-defined query classes, then extracts information from several Web resources by using a method suitable for the query type, and finally aggregates pieces of information into a short text.

  • TTOKU Summarization Based Systems at NTCIR-9 1CLICK task
    Hajime Morita, Takuya Makino, Tetsuya Sakai, Hiroya Takamura and Manabu Okumura
    [Pdf] [Table of Contents]

    We describe our two query-oriented summarization systems implemented for the NTCIR-9 1CLICK task. We regard the question answering problem as a summarization process. Both systems are based on integer linear programming and consist of an abstractive summarization model and a model that ensures coverage of diversified aspects when answering a user's complex question. The first system adopts QSBP (Query SnowBall with word pair) for query-oriented summarization and extends the method to abstractive summarization in order to recognize and extract the parts of a sentence that are related to the query. The second system ensures coverage of several pre-defined aspects of information needed by users, where the types of aspects depend on the category of the query. Our first and second systems achieved S-measure(I) scores of 0.1585 and 0.1484 respectively in the desktop task, and 0.0866 and 0.0829 respectively in the mobile task.
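
    Both systems rest on integer linear programming for sentence selection. A toy sketch of that core idea with the PuLP modelling library (a plain coverage objective under a length budget; the actual QSBP and aspect-coverage models are richer):

        import pulp

        def ilp_select(sentences, weights, lengths, budget):
            """Pick sentences maximising total weight within a length budget."""
            prob = pulp.LpProblem("summarise", pulp.LpMaximize)
            x = [pulp.LpVariable(f"s{i}", cat="Binary")
                 for i in range(len(sentences))]
            prob += pulp.lpSum(w * v for w, v in zip(weights, x))   # objective
            prob += pulp.lpSum(l * v for l, v in zip(lengths, x)) <= budget
            prob.solve()
            return [s for s, v in zip(sentences, x)
                    if v.value() and v.value() > 0.5]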

  • Microsoft Research Asia at the NTCIR-9 1CLICK Task
    Naoki Orii, Young-In Song and Tetsuya Sakai
    [Pdf] [Table of Contents]

    Microsoft Research Asia participated in the 1CLICK task at NTCIR-9 using two different techniques: a statistical ranking approach and the utilization of semi-structured web knowledge sources. The evaluation results show the effectiveness of our approach: we found a 50% increase in S-measure relative to the baseline. We present a module-by-module error analysis, showing directions for future work.

  • Overview of the IR for Spoken Documents Task in NTCIR-9 Workshop
    Tomoyosi Akiba, Hiromitsu Nishizaki, Kiyoaki Aikawa, Tatsuya Kawahara and Tomoko Matsui
    [Pdf] [Table of Contents]

    This paper describes an overview of the IR for Spoken Documents task in the NTCIR-9 Workshop. In this task, the spoken term detection (STD) subtask and the ad-hoc spoken document retrieval (SDR) subtask were conducted. Both subtasks target the search of terms, passages and documents in academic and simulated lectures from the Corpus of Spontaneous Japanese. In total, seven and five teams participated in the STD and SDR subtasks, respectively. This paper explains the data used in the subtasks, how transcriptions were produced by speech recognition, and the details of each subtask.

  • Spoken Term Detection Using Multiple Speech Recognizers' Outputs at NTCIR-9 SpokenDoc STD subtask
    Hiromitsu Nishizaki, Yuto Furuya, Satoshi Natori and Yoshihiro Sekiguchi
    [Pdf] [Revised Paper Pdf 20111212] [Table of Contents]

    This paper describes spoken term detection (STD) with false-detection control using a phoneme transition network (PTN) derived from multiple speech recognizers' outputs for the NTCIR-9 SpokenDoc STD subtask. Using the output of multiple speech recognizers, the PTN method is effective at correctly detecting out-of-vocabulary (OOV) terms and is robust to certain recognition errors. However, it exhibits a high false detection rate. Therefore, we applied two false-detection control parameters to the search engine that accepts the PTN-formed index. One of the parameters is based on the concept of the majority voting scheme, and the other is a measure of ambiguity in confusion networks (CN). These parameters improve STD performance (F-measure of 0.725) compared to that without any parameters (F-measure of 0.714).

  • High speed spoken term detection by combination of n-gram array of a syllable lattice and LVCSR result for NTCIR-SpokenDoc
    Keisuke Iwami and Seiichi Nakagawa
    [Pdf] [Table of Contents]

    For spoken document retrieval, it is very important to consider Out-of-Vocabulary (OOV) words and mis-recognition of spoken words. Therefore, sub-word-unit-based recognition and retrieval methods have been proposed. This paper describes a Japanese spoken document retrieval system that is robust against OOV words and mis-recognition of sub-word units. Additionally, our system combines a Large Vocabulary Continuous Speech Recognizer (LVCSR) with a sub-word unit recognition system for In-Vocabulary (IV) words. We used individual syllables as the sub-word unit in continuous speech recognition and an n-gram sequence of syllables in a recognized syllable-based lattice. We propose an n-gram indexing/retrieval method with distances in the syllable lattice to handle OOV words and recognition errors and to achieve high-speed retrieval. We applied this method to a 44-hour database of academic lecture presentations, and an F-value of 0.645 for OOV words was obtained in less than 2.5 milliseconds per query.

  • Spoken document retrieval method combining query expansion with continuous syllable recognition for NTCIR-SpokenDoc
    Satoru Tsuge, Hiromasa Ohashi, Norihide Kitaoka, Kazuya Takeda and Kenji Kita
    [Pdf] [Table of Contents]

    In this paper, we propose a spoken document retrieval method which combines query expansion with continuous syllable recognition. The proposed method expands a query using words from web pages collected by a search engine. It is assumed that relevant document vectors lie on the plane constructed from the query vector and the expanded vector; the weight parameter between a target document vector and a query vector is calculated for query expansion. In addition, target documents are mapped not only into the space constructed from continuous word speech recognition results, but also into the space constructed from syllable recognition results. The proposed method then calculates the distance between the query vector and the document vector in each space and combines these distances. To evaluate the proposed method, we conducted spoken document retrieval experiments on the SpokenDoc task of NTCIR-9. The experimental results showed that, on the formal run, the proposed method improved the mean average precision score from 0.392784 (the baseline provided by the task organizers) to 0.406085.

  • DCU at the NTCIR-9 SpokenDoc Passage Retrieval Task
    Maria Eskevich and Gareth J. F. Jones
    [Pdf] [Table of Contents]

    We describe details of our runs and the results obtained for the "IR for Spoken Documents (SpokenDoc) Task" at NTCIR-9. The focus of our participation in this task was the investigation of segmentation methods to divide the manual and ASR transcripts into topically coherent segments. The underlying assumption of this approach is that these segments will capture passages in the transcript relevant to the query. Our experiments investigate the use of two lexical-coherence-based segmentation algorithms (TextTiling, C99). These are run on the provided manual and ASR transcripts, and on the ASR transcript with stop words removed. Evaluation of the results shows that TextTiling consistently performs better than C99, both in segmenting the data into retrieval units as evaluated using the centre-located relevant information metric and in having higher content precision in each automatically created segment.
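
    TextTiling, one of the two segmenters compared above, has an implementation in NLTK; a minimal sketch of splitting a transcript into topically coherent pseudo-paragraphs (default parameters, which is an assumption about the configuration):

        # requires the NLTK stopwords corpus: nltk.download('stopwords')
        from nltk.tokenize import TextTilingTokenizer

        def segment_transcript(text):
            # text should contain blank lines ("\n\n") as candidate boundaries
            return TextTilingTokenizer().tokenize(text)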

  • Toward improvement of SDR accuracy using LDA and query expansion for SpokenDoc
    Kiichi Hasegawa, Hideki Sekiya, Masanori Takehara, Taro Niinomi, Satoshi Tamura and Satoru Hayamizu
    [Pdf] [Table of Contents]

    This paper investigates several techniques for spoken document retrieval, aiming to improve retrieval performance over the conventional method, i.e., TF-IDF. The first approach employs rescaled unigrams from LDA to compute a similarity score. The second technique employs query expansion by web retrieval using the Yahoo! API. The third technique is Prioritized And-operator Retrieval based on TF-IDF techniques. We tested these methods using dry-run data, and it turned out that the third technique is the most promising.

  • STD based on Hough Transform and SDR using STD results: Experiments at NTCIR-9 SpokenDoc
    Taisuke Kaneko, Tomoko Takigami and Tomoyosi Akiba
    [Pdf] [Table of Contents]

    In this paper, we report our experiments in the NTCIR-9 IR for Spoken Documents (SpokenDoc) task. We participated in both the STD and SDR subtasks. For the STD subtask, we applied a novel indexing method, called metric subspace indexing, which we previously proposed. One of the distinctive advantages of the method is that it can output detection results in increasing order of distance without any predefined distance threshold. The experimental results showed that the proposed method is very fast but that there is room for improvement in detection accuracy. For the SDR subtask, two kinds of approaches were applied at both the lecture and passage levels. The first approach used conventional word-based IR methods based on language modeling IR models. The second approach used the STD method to detect the query terms in the spoken documents and then applied the IR methods, treating the detections as the terms' appearances. The experimental results showed that, though the performance of the STD-based method was lower than that of the word-based approaches overall, it could improve performance when the query topic included out-of-vocabulary words.

  • Utilization of Suffix Array for Quick STD and Its Evaluation on the NTCIR-9 SpokenDoc Task
    Kouichi Katsurada, Koudai Katsuura, Yurie Iribe and Tsuneo Nitta
    [Pdf] [Table of Contents]

    We propose a technique for detecting keywords quickly in a very large speech database without using a large amount of memory. To accelerate search and reduce memory use, we employ a suffix array as the data structure and apply phoneme-based DP-matching to it. To avoid exponential growth of processing time with keyword length, a long keyword is divided into short sub-keywords. Moreover, an iterative lengthening search algorithm is used to output accurate search results quickly.
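
    The core data structure above can be sketched in a few lines: build a suffix array over the phoneme string of the archive, then locate a (sub-)keyword's phoneme sequence by binary search. This is a toy illustration only; the DP-matching and sub-keyword splitting of the paper are omitted, and bisect's key= parameter needs Python 3.10+.

        import bisect

        def build_suffix_array(s):
            # naive O(n^2 log n) construction; fine for illustration only
            return sorted(range(len(s)), key=lambda i: s[i:])

        def occurrences(s, sa, pattern):
            # binary search over suffixes truncated to the pattern length
            key = lambda i: s[i:i + len(pattern)]
            lo = bisect.bisect_left(sa, pattern, key=key)
            hi = bisect.bisect_right(sa, pattern, key=key)
            return sa[lo:hi]   # start offsets of all exact occurrences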

  • Spoken Document Retrieval Experiments for SpokenDoc at Ryukoku University (RYSDT)
    Hiroaki Nanjo, Kazuyuki Noritake and Takehiko Yoshimi
    [Pdf] [Table of Contents]

    In this paper, we describe the spoken document retrieval systems of Ryukoku University, which participated in the NTCIR-9 IR for Spoken Documents ("SpokenDoc") task. The task has two subtasks: the spoken term detection (STD) subtask and the spoken document retrieval (SDR) subtask. We participated in both subtasks as team RYSDT. In this paper, we first describe our STD systems and then our SDR systems.

  • An STD system for OOV query terms using various subword units
    Hiroyuki Saito, Takuya Nakano, Shirou Narumi, Toshiaki Chiba, Kazuma Kon'no and Yoshiaki Itoh
    [Pdf] [Table of Contents]

    We have been proposing a Spoken Term Detection (STD) method for Out-Of-Vocabulary (OOV) query terms using various subword units, such as monophone, triphone, demiphone, one-third phone, and Sub-phonetic segment (SPS) models. In the proposed method, subword-based ASR is performed for all spoken documents, and subword recognition results are generated using subword acoustic models and subword language models. When a query term is given, its subword sequence is searched against all subword sequences in the recognition results of the spoken documents. Here, we use acoustic distances between subwords when matching the two subword sequences in Continuous Dynamic Programming. The demiphone and one-third phone models were newly developed for the STD task. We have also proposed a method for integrating the plural STD results obtained from the individual subword models: each candidate segment has a distance, a segment number and a document number, and the plural distances are integrated linearly using weighting factors. In the STD subtask of IR for Spoken Documents at NTCIR-9, we apply the various subword models and integrate the plural STD results obtained from them.

  • YLAB@RU at Spoken Term Detection Task in NTCIR-9
    Yoichi Yamashita, Toru Matsunaga and Kook Cho
    [Pdf] [Revised Paper Pdf 20111212] [Table of Contents]

    Information retrieval based on speech recognition is an important technique for easy access to large amounts of multimedia content including speech. Spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, are being widely developed. This paper proposes a new STD method based on vector quantization (VQ). Spoken documents are represented as sequences of VQ codes, which are matched against a text query based on the V-P score, a measure of the relationship between a VQ code and a phoneme. The VQ-code representation is an intermediate form between acoustic features, such as MFCC parameters, and the sub-word symbols often used in conventional STD methods. The dependency of acoustic features on the speaker is avoided by speaker-dependent VQ.

  • Overview of NTCIR-9 RITE: Recognizing Inference in TExt
    Hideki Shima, Hiroshi Kanayama, Cheng-Wei Lee, Chuan-Jie Lin, Teruko Mitamura, Yusuke Miyao, Shuming Shi and Koichi Takeda
    [Pdf] [Revised Paper Pdf 20120424] [Table of Contents]

    This paper introduces an overview of the RITE (Recognizing Inference in TExt) task in NTCIR-9. We evaluate systems that automatically recognize entailment, paraphrase, and contradiction between two texts written in Japanese, Simplified Chinese, or Traditional Chinese. The task consists of four subtasks: binary classification of entailment (BC); multi-class classification including paraphrase and contradiction (MC); and two extrinsic, application-oriented subtasks with their own datasets, Entrance Exam and RITE4QA. This paper also describes how we built the test collection, the evaluation metrics, and the evaluation results of the submitted runs.

  • A Machine Learning based Textual Entailment Recognition System of JAIST Team for NTCIR9 RITE
    Quang Nhat Minh Pham, Le Minh Nguyen and Akira Shimazu
    [Pdf] [Revised Paper Pdf 20111206] [Table of Contents]

    NTCIR-9 RITE is the first shared task on recognizing textual inference in text written in Japanese, Simplified Chinese, or Traditional Chinese. The JAIST team participated in three subtasks for Japanese: Binary-class, Entrance Exam and RITE4QA. We adopt a machine learning approach for these subtasks, combining various kinds of entailment features using machine learning techniques. In our system, we use a machine translation engine to automatically produce an English translation of the Japanese data, and both the original Japanese data and its translation are used to train an entailment classifier. Experimental results show the effectiveness of our method. Although our system is lightweight and does not require deep semantic analysis or extensive linguistic engineering, it obtained the first rank (accuracy of 58%) among participating groups on the Binary-class subtask for Japanese.

  • Predicate-argument Structure based Textual Entailment Recognition System of KYOTO Team for NTCIR9 RITE
    Tomohide Shibata and Sadao Kurohashi
    [Pdf] [Table of Contents]

    We participated in the Japanese subtasks of RITE in NTCIR-9 (team id: KYOTO). Our proposed method regards the predicate-argument structure as the basic unit for handling the meaning of a text/hypothesis. Our system first performs predicate-argument structure analysis on both the text and the hypothesis, and then performs matching between them. In this matching, wide-coverage relations between words/phrases, such as synonym and is-a relations, are utilized; these are automatically acquired from a dictionary, a Web corpus and Wikipedia.

  • UIOWA at NTCIR-9 RITE: Using the Power of the Crowd to Establish Inference Rules
    Christopher G. Harris
    [Pdf] [Table of Contents]

    We participated in the Binary Classification (BC), Multiple Classification (MC), and Question and Answer (RITE4QA) subtasks for both Simplified Chinese and Traditional Chinese in NTCIR-9 RITE. In this paper, we describe our procedure to establish inference rules using crowdsourcing, refine and weigh them, and apply these rules to a test collection.

  • ICRC_HITSZ at RITE: Leveraging Multiple Classifiers Voting for Textual Entailment Recognition
    Yaoyun Zhang, Jun Xu, Chenlong Liu, Xiaolong Wang, Ruifeng Xu, Qingcai Chen, Xuan Wang, Yongshuai Hou and Buzhou Tang
    [Pdf] [Table of Contents]

    The NTCIR-9 RITE challenge is a generic benchmark task that evaluates systems' ability to automatically detect textual entailment, paraphrase and contradiction. This paper describes the ICRC_HITSZ system for RITE. We participated in the binary-class (BC), multi-class (MC) and RITE4QA subtasks. More specifically, we build textual entailment recognition models for the MC subtask; the predicted multi-class labels are then mapped onto Yes/No labels for the BC and RITE4QA subtasks. Features at different linguistic levels are extracted using hybrid NLP resources and tools. Based on the hierarchical relations between the labels of the MC subtask, three different classification strategies are designed, and multiple machine learning methods are employed for each strategy. On the assumption that classifiers built from different classification strategies, as well as from different machine learning methods, are complementary to each other, the final classifier is built with cascade voting. Evaluation results show that the voting strategies are effective, with the highest performance ranked fourth in terms of accuracy and second among participant groups in both tasks.

  • NTTCS Textual Entailment Recognition System for NTCIR-9 RITE
    Yasuhiro Akiba, Hirotoshi Taira, Sanae Fujita, Kaname Kasahara and Masaaki Nagata
    [Pdf] [Table of Contents]

    This paper describes initial Japanese Textual Entailment Recognition (RTE) systems that participated in the Japanese Binary-class (BC) and Multi-class (MC) subtasks of NTCIR-9 RITE. Our approaches are based on supervised learning techniques: Decision Tree (DT) and Support Vector Machine (SVM) learners. The features employed by the learners include text-fragment-based features, such as lexical, syntactic and semantic ones, and new surface/deep case-structure-based features. These features are designed to assign entailment directions to a text pair in the MC subtask. The authors submitted three runs to each of the BC and MC subtasks. The best performance among the three runs achieves an accuracy of 0.548 in the BC subtask and 0.452 in the MC subtask, both better than the average accuracy of all team submissions.

  • FudanNLP at RITE 2011: a Shallow Semantic Approach to Textual Entailment
    Ling Cao, Xipeng Qiu and Xuanjing Huang
    [Pdf] [Table of Contents]

    RITE is a task of recognizing logical relations between texts. This paper presents FDCS's approach to the NTCIR-9 RITE Simplified Chinese BC & MC subtasks. Our system is built on a machine learning architecture with features selected using shallow semantic methods, including named entity recognition, date & time expression extraction, word overlap and negation concept recognition. FudanNLP is used extensively in our system for NLP processing and feature extraction. The system achieves accuracies of 76% on the BC subtask and 58.5% on the MC subtask, respectively.

  • IMTKU Textual Entailment System for Recognizing Inference in Text at NTCIR-9 RITE
    Min-Yuh Day, Re-Yuan Lee, Cheng-Tai Liu, Chun Tu, Chin-Sheng Tseng, Loong Tern Yap, Allen-Green C.L. Huang, Yu-Hsuan Chiu and Wei-Ze Hong
    [Pdf] [Table of Contents]

    In this paper, we describe the IMTKU (Information Management at TamKang University) textual entailment system for recognizing inference in text at NTCIR-9 RITE (Recognizing Inference in Text). We propose a textual entailment system using a hybrid approach that integrates knowledge-based and machine learning techniques for recognizing inference in text for the NTCIR-9 RITE task. We submitted 3 official runs for both the BC and MC subtasks. In the NTCIR-9 RITE task, the IMTKU team achieved an accuracy of 0.522 in the CT-MC subtask and 0.556 in the CT-BC subtask.

  • The Yuntech System in NTCIR-9 RITE Task
    Nai-Hsuan Han and Lun-Wei Ku
    [Pdf] [Table of Contents]

    The NTCIR-9 RITE task evaluates systems that automatically detect entailment, paraphrase, and contradiction in texts. The Yuntech team developed a preliminary system for the NTCIR-9 RITE task, which is described in this paper. The major aim of this system is to determine the type of relation between two sentences. A straightforward assumption was proposed for achieving this aim: the relation between two sentences is determined by the parts that differ between them rather than by the identical parts. Therefore, we considered features including sentence lengths, the content of matched keywords, quantities of matched keywords, and their parts of speech to capture the difference between two sentences. Rule-based methods were implemented to develop the system according to the proposed assumption, and good performance was achieved for some types.

  • NTU Textual Entailment System for NTCIR 9 RITE Task
    Hen-Hsen Huang, Kai-Chun Chang, James M.C. Haver II and Hsin-Hsi Chen
    [Pdf] [Table of Contents]

    In this paper, we propose a system to deal with the Chinese textual entailment problem for the NTCIR-9 RITE task. The RITE task consists of four subtasks: simplified Chinese binary classification (CS_BC), simplified Chinese multi-way classification (CS_MC), traditional Chinese binary classification (CT_BC), and traditional Chinese multi-way classification (CT_MC). According to the definitions of these subtasks, a machine-learning-based classification framework is proposed and tested under various setups. The performance of our system in the formal run achieves accuracies of 73.5%, 57.5%, 60.8%, and 48.3% for CS_BC, CS_MC, CT_BC, and CT_MC respectively.

  • The Description of the NTOU RITE System in NTCIR-9
    Chuan-Jie Lin and Bo-Yu Hsiao
    [Pdf] [Table of Contents]

    A textual entailment system determines whether one sentence entails another in a common-sense reading. We propose several approaches to training textual entailment classifiers, including setting an ancestor distance threshold, expanding the training corpus, using different sets of features, and tuning classifier settings. The results show that an MC classifier trained on an expanded training corpus with scoring features performs best, with an accuracy of 64.22% in the BC task and 46.11% in the MC task.

  • WUST SVM-Based System at NTCIR-9 RITE Task
    Maofu Liu, Yan Li, Yu Xiao and Chunwei Lei
    [Pdf] [Table of Contents]

    This paper describes our work in NTCIR-9 on the RITE Binary-class (BC) and Multi-class (MC) subtasks in Simplified Chinese. We use a classification method with an SVM classifier to identify textual entailment, using thirteen statistical features in total as the classification features. The system consists of three parts: (1) preprocessing, (2) feature extraction, and (3) the SVM classifier; of these, we mainly focus on the second.
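
    A hedged sketch of the pipeline's shape: extract statistical features per (t1, t2) pair and feed them to an SVM. The thirteen features are not itemised in the abstract, so the two shown are illustrative stand-ins.

        from sklearn.svm import SVC

        def features(t1, t2):
            # character overlap suits unsegmented Chinese text (illustrative)
            c1, c2 = set(t1), set(t2)
            overlap = len(c1 & c2) / len(c2) if c2 else 0.0
            len_ratio = len(t2) / len(t1) if t1 else 0.0
            return [overlap, len_ratio]   # ... plus eleven more in the paper

        def train(pairs, labels):
            """pairs: list of (t1, t2) strings; labels: entailment labels."""
            X = [features(t1, t2) for t1, t2 in pairs]
            return SVC(kernel="rbf").fit(X, labels)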

  • Recognizing Text Entailment via Syntactic Tree Matching
    Zhewei Mai, Yue Zhang and Donghong Ji
    [Pdf] [Table of Contents]

    In this paper, we present our approach for the Chinese Binary-class (BC) subtask of the Recognizing Inference in Text (RITE) task at NTCIR-9 [9]. Our system judges whether a given sentence entails another. First, each sentence is parsed into a syntactic tree in which nodes represent words or phrases and links represent syntactic relationships between nodes. Then, entailment between two sentences is recognized by syntactic tree matching. In addition, to compute the similarity between two words or phrases, external resources (Tongyici Cilin, HowNet, and Hudong Wiki) are employed. The evaluation results show that our system reaches an accuracy of 87.1% in recognizing pairs with an entailment relationship.

  • A Textual Entailment System using Web based Machine Translation System
    Partha Pakray, Snehasis Neogi, Sivaji Bandyopadhyay and Alexander Gelbukh
    [Pdf] [Table of Contents]

    This article presents the experiments carried out as part of our participation in the NTCIR-9 RITE (Recognizing Inference in Text) task for Japanese. NTCIR-9 RITE has four subtasks: the Binary-class (BC) subtask, the Multi-class (MC) subtask, Entrance Exam and RITE4QA. We submitted a total of three unique runs (Run 1, Run 2 and Run 3) in the BC subtask and one run each in the MC, Entrance Exam and RITE4QA subtasks. The first system for the BC subtask is based on machine translation using the web-based Bing translator. The second system for the BC subtask is based on lexical matching. The third system is based on voting over the outcomes of the first and second systems. The system for the MC subtask is a learned system that uses different lexical similarity features such as WordNet-based unigram matching, bigram matching, trigram matching, skip-gram matching, LCS matching and named entity (NE) matching. For the Entrance Exam and RITE4QA subtasks, we developed a single system based on an n-gram matching module similar to the second BC system. For the BC subtask, the accuracies for Run 1, Run 2 and Run 3 are 0.490, 0.500 and 0.508 respectively. For the MC subtask, the accuracy is 0.175. The accuracy figures for the Entrance Exam and RITE4QA subtasks are 0.5204 and 0.5954 respectively.

  • The WHUTE System in NTCIR-9 RITE Task
    Han Ren, Chen Lv and Donghong Ji
    [Pdf] [Table of Contents]

    This paper describes our system for recognizing textual entailment in the RITE Chinese subtasks at NTCIR-9. We build a textual entailment recognition framework and implement a system that employs string, syntactic, semantic and some specific features for the recognition. To improve the system's performance, a two-stage recognition strategy is utilized, which first judges entailment versus no entailment, and then contradiction versus independence of the pairs in turn. Official results show that our system achieves an accuracy of 73.71% in the BC subtask, 60.93% in the MC subtask and 48.76% in the RITE4QA subtask.

  • IASL RITE System at NTCIR-9
    Cheng-Wei Shih, Cheng-Wei Lee, Ting-Hao Yang and Wen-Lian Hsu
    [Pdf] [Table of Contents]

    We developed a knowledge-based textual inference recognition system for both the BC and MC subtasks at NTCIR-9 RITE. Five modules, which use named entities, subject-modifier word pairs, negative expressions, exclusive tokens and sentence length respectively, were implemented to determine the entailment relation of each sentence pair. Three decision-making approaches were applied to integrate the results of the recognition modules into one entailment result. The evaluation showed that our system achieved accuracies of 0.661 and 0.501 on the Traditional Chinese BC and MC subtasks respectively; for Simplified Chinese, the accuracy reached 0.715 and 0.565 for BC and MC respectively.

  • LTI's Textual Entailment Recognizer System at NTCIR-9 RITE
    Hideki Shima, Yuanpeng Li, Naoki Orii and Teruko Mitamura
    [Pdf] [Table of Contents]

    This paper describes the LTIfs system participated in NTCIR-9 RITE. The system is based on multiple linguistically-motivated features and an adaptable framework for different datasets. The formal run scores are 54.6% (accuracy in BC), 66.7% (accuracy in Entrance Exam), and 29.8% (MRR in RITE4QA) which outperformed strong baselines, and are relatively good among participants. We also describe in-house experimental results (e.g. ablation study for measuring feature contribution).

  • ZSWSL Text Entailment Recognizing System at NTCIR-9 RITE Task
    Ranxu Su, Sheng Shang, Pan Wang, Haixu Liu and Yan Zheng
    [Pdf] [Table of Contents]

    This paper describes our system for the Simplified Chinese textual entailment recognition RITE task at NTCIR-9. Both lexical and semantic features are extracted using NLP methods. Three classification models are used and compared: rule-based algorithms, SVM and C4.5. C4.5 gives the best result on the test data set. Evaluation at NTCIR-9 RITE shows 72% accuracy on the BC subtask and 61.9% accuracy on the MC subtask.

  • Experiments for NTCIR-9 RITE Task at Shibaura Institute of Technology
    Toru Sugimoto
    [Pdf] [Table of Contents]

    This paper reports the evaluation results of our textual entailment system in the NTCIR-9 RITE task. We participated in the Japanese Binary-Class (BC) subtask. In our system, the meaning of a text is represented as a set of dependency triples consisting of two words and their relation. By comparing two sets of dependency triples with respect to conceptual and character-based similarity, a subsumption score is calculated and used to identify textual entailment. This paper provides a description of our algorithm, the evaluation results, and a discussion of the results.
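
    A minimal sketch of such a subsumption score, assuming each sentence has already been parsed into (head, relation, dependent) triples; the character-bigram similarity below is a crude stand-in for the conceptual similarity the paper combines with character-based matching, and all names are illustrative.

        def char_sim(a, b):
            """Character-bigram Dice coefficient between two words."""
            bigrams = lambda w: {w[i:i + 2] for i in range(len(w) - 1)} or {w}
            x, y = bigrams(a), bigrams(b)
            return 2 * len(x & y) / (len(x) + len(y))

        def triple_sim(t1, t2, word_sim=char_sim):
            """Similarity of two (head, rel, dep) triples; relations must match."""
            if t1[1] != t2[1]:
                return 0.0
            return (word_sim(t1[0], t2[0]) + word_sim(t1[2], t2[2])) / 2

        def subsumption(text_triples, hyp_triples):
            """Average best-match similarity of each hypothesis triple in the text."""
            if not hyp_triples or not text_triples:
                return 0.0
            return sum(max(triple_sim(t, h) for t in text_triples)
                       for h in hyp_triples) / len(hyp_triples)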

  • Syntactic Difference Based Approach for NTCIR-9 RITE Task
    Yuta Tsuboi, Hiroshi Kanayama, Masaki Ohno and Yuya Unno
    [Pdf] [Revised Paper Pdf 20111220] [Table of Contents]

    This paper describes the IBM team's approach for the textual entailment recognition task (RITE) in NTCIR-9 with experimental results for four Japanese subtasks: BC, MC, EXAM, and RITE4QA. To tackle the data set with complicated syntactic and semantic phenomena, the authors used a classification method to predict entailment relations between two different texts. These features were used for classification: (1) Tree edit distance and operations, (2) Word overlap ratios and word pairs, (3) Sentiment polarity matching, (4) Character overlap ratios, (5) Head comparisons, (6) Predicate-argument structure matching, and (7) Temporal expression matching. Feature (1) reflects the operations in the edit distance computation between the text and the hypothesis, which can capture the syntactic differences between two sentences. In the RITE task, Feature (1) is effective for the MC subtask and Feature (7) is effective for the EXAM subtask.

  • Experiments of FX for NTCIR-9 RITE Japanese BC Subtask
    Hiroshi Umemoto and Keigo Hattori
    [Pdf] [Table of Contents]

    We report the results and analyses of our experiments for the NTCIR-9 RITE Japanese BC subtask, in which we participated. Although the RTE task seems to require advanced natural language understanding, it can be handled with some accuracy by a simple string-based method using word coverage. Intuitively, however, one would expect that at least the syntactic features of the texts must be considered. In practice, it is generally difficult to obtain better results for textual inference (TI) with syntactic analysis than with word-level analysis. In this paper, we explain our TI method based on syntactic and semantic relations of words in the texts, and report our experiments.

  • TU Group at NTCIR9-RITE: Leveraging Diverse Lexical Resources for Recognizing Textual Entailment
    Yotaro Watanabe, Junta Mizuno, Eric Nichols, Katsuma Narisawa, Keita Nabeshima and Kentaro Inui
    [Pdf] [Table of Contents]

    This paper describes the TU system that participated in the Entrance Exam Subtask of NTCIR-9 RITE. The system consists of two phases: alignment and entailment relation recognition. In the alignment phase, the system aligns words in the two sentences by exploiting diverse lexical resources such as entailment information, hypernym-hyponym relations and synonyms. Based on the alignments and relations between them, the system recognizes semantic relations between two sentences. Our system achieved an accuracy of 0.672 on the development data, and an accuracy of 0.6493 on the formal run.

  • Binary-class and Multi-class Chinese Textual Entailment System Description in NTCIR-9 RITE
    Shih-Hung Wu, Wan-Chi Huang, Liang-Pu Chen and Tsun Ku
    [Pdf] [Table of Contents]

    In this paper, we describe the details of our system for NTCIR-9 RITE. We submitted three runs for each of the four subtasks: CT-BC, CT-MC, CS-BC, and CS-MC. Our approach to the NTCIR-9 RITE task is based on standard supervised learning classification. We integrate available computational linguistic resources for Chinese language processing to build the system in a statistical natural language processing approach. First, we examined the training corpus and listed all possible features. Second, we tested the features on the training data and identified those that can be used to recognize textual entailment. The features include surface text and semantic and syntactic information, such as POS tags, NER tags, and dependency relations. An automatic annotation subsystem was built to annotate the training corpus. Finally, the annotated data was used to train statistical models and build the classifier for the RITE-1 subtasks.

  • MCU at NTCIR: A Resources Limited Chinese Textual Entailment Recognition System
    Yu-Chieh Wu, Chung-Jung Lee and Yaw-Chu Chen
    [Pdf] [Table of Contents]

    Recognizing inference in text is the task of finding the textual entailment relation between a given hypothesis and text fragment. Developing a high-performance text paraphrasing system usually requires rich external knowledge such as syntactic parsing and thesauri, which is limited in Chinese since the Chinese word segmentation problem must be resolved first. In this paper, we take a different line. We propose a pattern-based and support vector machine-based trainable text entailment tagging framework under the condition that only part-of-speech tagging information is available. We derive two exclusive feature sets for the learners: one consists of the operations between the text pairs, while the other adopts the traditional bag-of-words model. Finally, we train the classifier with the above features. The official results indicate the effectiveness of our method. In terms of accuracy, our method achieves 53.6% for the Traditional Chinese MC task (second place) and 55.4% for the Traditional Chinese BC task. After correction, our method reaches 67.9% in the BC task with the same setting.

  • ICL Participation at NTCIR-9 RITE
    Xu Xing and Wang Houfeng
    [Pdf] [Table of Contents]

    This paper describes ICLfs participation at NTCIR-9 RITE. We chose BC & MC subtask. Textual entailment is a problem to predict whether an entailment holds for a given test-hypothesis pair. We built an inference model to solve this problem by means of using dependency syntax analysis (by LTP), lexical knowledge base (e.g. CCD), web information (e.g. Baidupedia) and probability method. We used AUC indicator to evaluate the ranking ability of our system.

  • Overview of the NTCIR-9 Crosslink Task: Cross-lingual Link Discovery
    Ling-Xiang Tang, Shlomo Geva, Andrew Trotman, Yue Xu and Kelly Y. Itakura
    [Pdf] [Table of Contents]

    This paper presents an overview of the NTCIR-9 Cross-lingual Link Discovery (Crosslink) task. The overview includes: the motivation of cross-lingual link discovery; the Crosslink task definition; the run submission specification; the assessment and evaluation framework; the evaluation metrics; and the evaluation results of submitted runs. Cross-lingual link discovery (CLLD) is a way of automatically finding potential links between documents in different languages. The goal of this task is to create a reusable resource for evaluating automated CLLD approaches. The results of this research can be used in building and refining systems for automated link discovery. The task focuses on linking between English source documents and Chinese, Korean, and Japanese target documents.

  • Using Concept base and Wikipedia for Cross-Lingual Link Discovery
    Pham Huy Anh and Takashi Yukawa
    [Pdf] [Table of Contents]

    This paper describes our method for Cross-Lingual Link Discovery (CLLD). We used the English-Japanese document collections in the CLLD subtask of NTCIR-9. In our method, topics are translated via Wikipedia, which is multilingual: for each topic written in the source language, the corresponding page written in the target language is retrieved, and a target-language topic is built from the concept part of that page. Cross-language links are then retrieved with a TF-IDF model; we use nouns, noun phrases and adjectives to build the concept base, and the results retrieved by the TF-IDF model are re-ranked. The TF-IDF weights and the concept base are built from the outline parts of target-language Wikipedia pages extracted from the Wikipedia page collection. The Crosslink Evaluation Tool of the NTCIR-9 Crosslink task is used for performance evaluation.
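
    A minimal sketch of the TF-IDF ranking step, under the assumption that the translated topic and the documents are already tokenized; the concept-base construction and re-ranking that follow it are out of scope here.

        # Minimal TF-IDF cosine ranking of target-language documents against a
        # translated topic. Documents are lists of tokens.
        import math
        from collections import Counter

        def tfidf_vectors(docs):
            df = Counter(t for d in docs for t in set(d))
            idf = {t: math.log(len(docs) / df[t]) for t in df}
            return [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs], idf

        def cosine(u, v):
            dot = sum(w * v[t] for t, w in u.items() if t in v)
            nu = math.sqrt(sum(w * w for w in u.values()))
            nv = math.sqrt(sum(w * w for w in v.values()))
            return dot / (nu * nv) if nu and nv else 0.0

        def rank(topic_tokens, docs):
            vecs, idf = tfidf_vectors(docs)
            query = {t: c * idf.get(t, 0.0) for t, c in Counter(topic_tokens).items()}
            return sorted(range(len(docs)), key=lambda i: cosine(query, vecs[i]), reverse=True)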

  • IISR Crosslink Approach at NTCIR 9 CLLD Task
    Chun-Yuan Cheng, Yu-Chun Wang and Richard Tzong-Han Tsai
    [Pdf] [Table of Contents]

    In this paper, we describe our approach to the English-Korean Cross-Lingual Link Discovery (CLLD) task in NTCIR-9. We propose a simple and effective approach to discovering the links. Our method comprises preprocessing steps, anchor-target link mapping, and ranking steps. To discover the links, we use English anchor names, inter-language links, and translations from Google Translate as features, and extract possible links by exact matching among them. Our method also ranks the anchor candidates using Wikipedia category sets and the PageRank method, and selects Korean target pages by the mutual information between English anchors and the Korean titles of Wikipedia articles. In the official file-to-file evaluation with manual assessment, our system achieves P10 scores between 0.6 and 0.7, which shows that our approach can achieve satisfactory results.

  • HITSf Graph-based System at the NTCIR-9 Cross-lingual Link Discovery Task
    Angela Fahrni, Vivi Nastase and Michael Strube
    [Pdf] [Table of Contents]

    This paper presents HITS' system for the NTCIR-9 cross-lingual link discovery task. We solve the task in three stages: (1) anchor identification and ambiguity reduction, (2) graph-based disambiguation combining different relatedness measures as edge weights for a maximum edge weighted clique algorithm, and (3) supervised relevance ranking. In the file-to-file evaluation with Wikipedia ground truth, the HITS system is the top performer across all measures and subtasks (English-2-Chinese, English-2-Japanese and English-2-Korean). In the file-2-file and anchor-2-file evaluation with manual assessment, the system outperforms all other systems on the English-2-Japanese subtask and is one of the top-three performing systems for the two other subtasks.

  • English-to-Korean Cross-linking of Wikipedia Articles at KSLP
    In-Su Kang and Ralph Marigomen
    [Pdf] [Table of Contents]

    This paper describes team KSLP's approach to the NTCIR-9 English-to-Korean cross-lingual link discovery task. The system consists of three main steps. Given an English topic document, the first step identifies English anchor terms by exploiting both context-less and contextual link evidence from the English Wikipedia corpus. Then, using English-to-Korean translation dictionaries, we obtain Korean translations for each identified anchor term. Lastly, we disambiguate these translations by computing the document similarity between each candidate's Korean document and the English topic document.

  • Cross-lingual Link Discovery by Using Link Probability and Bilingual Dictionary
    Sin-Jae Kang
    [Pdf] [Table of Contents]

    This paper presents a method to discover English-to-Korean cross-lingual links using resources such as link probability, title lists of Wikipedia articles, and an English-Korean bilingual dictionary.
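
    Link probability (sometimes called keyphraseness) is typically estimated from a Wikipedia dump as the ratio of a phrase's anchor occurrences to its total occurrences. A minimal sketch, with the two count tables assumed precomputed:

        def link_probability(phrase, anchor_counts, occurrence_counts):
            """How often a phrase appears as anchor text, relative to all occurrences."""
            seen = occurrence_counts.get(phrase, 0)
            return anchor_counts.get(phrase, 0) / seen if seen else 0.0

        def candidate_anchors(phrases, anchor_counts, occurrence_counts, threshold=0.1):
            """Keep phrases whose link probability clears a tuned threshold."""
            return [p for p in phrases
                    if link_probability(p, anchor_counts, occurrence_counts) >= threshold]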

  • UKP at CrossLink: Anchor Text Translation for Cross-lingual Link Discovery
    Jungi Kim and Iryna Gurevych
    [Pdf] [Table of Contents]

    This paper describes UKP's participation in the cross-lingual link discovery (CLLD) task at NTCIR-9. The given task is to find valid anchor texts in a new English Wikipedia page and retrieve the corresponding target Wiki pages in Chinese, Japanese, and Korean. We developed a CLLD framework consisting of anchor selection, anchor ranking, anchor translation, and target discovery subtasks, and discovered anchor texts from English Wikipedia pages and their corresponding targets in Chinese, Japanese, and Korean. For anchor selection, anchor ranking, and target discovery, we largely utilized state-of-the-art monolingual approaches. For anchor translation, we utilized a translation resource constructed from Wikipedia itself, in addition to exploring a number of methods that have been widely used for short phrase translation. Our formal runs performed very competitively compared to other participants' systems: our system came first in the English-2-Chinese and English-2-Korean evaluations for F2F with manual assessment and for A2F with Wikipedia ground-truth assessment, as measured by Mean Average Precision (MAP).

  • KMI, The Open University at NTCIR-9 CrossLink: Cross-Lingual Link Discovery in Wikipedia Using Explicit Semantic Analysis
    Petr Knoth, Lukas Zilka and Zdenek Zdrahal
    [Pdf] [Table of Contents]

    This paper describes the methods used in the submission of the Knowledge Media Institute (KMI), The Open University to the NTCIR-9 Cross-Lingual Link Discovery (CLLD) task, entitled CrossLink. KMI submitted four runs for link discovery from English to Chinese; however, the developed methods, which utilise Explicit Semantic Analysis (ESA), are also applicable to other language combinations. Three of the runs are based on exploiting the existing cross-lingual mapping between different language versions of Wikipedia articles. In the fourth run, we assume this mapping information is not available. Our methods achieved encouraging results, and we describe in detail how their performance can be further improved. Finally, we discuss two important issues in link discovery: the evaluation methodology and the applicability of the developed methods across different textual collections.
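
    As a rough sketch of the ESA step: a text is projected onto a weighted vector over Wikipedia concepts via a precomputed term-to-concept index (an assumed input here), and texts are compared by the cosine of their concept vectors; in the cross-lingual case, the concept spaces would first be aligned through Wikipedia's inter-language links.

        import math
        from collections import defaultdict

        def esa_vector(tokens, term_concept_weights):
            """term_concept_weights: term -> {concept: weight}, assumed precomputed."""
            vec = defaultdict(float)
            for t in tokens:
                for concept, w in term_concept_weights.get(t, {}).items():
                    vec[concept] += w
            return vec

        def esa_similarity(tokens_a, tokens_b, term_concept_weights):
            u = esa_vector(tokens_a, term_concept_weights)
            v = esa_vector(tokens_b, term_concept_weights)
            dot = sum(w * v[c] for c, w in u.items() if c in v)
            nu = math.sqrt(sum(w * w for w in u.values()))
            nv = math.sqrt(sum(w * w for w in v.values()))
            return dot / (nu * nv) if nu and nv else 0.0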

  • Discovering Links by Context Similarity and Translated Key Phrases for NTCIR9 CrossLink
    Yi-Hsun Lee, Chung-Yao Chuang, Cen-Chieh Chen and Wen-Lian Hsu
    [Pdf] [Table of Contents]

    This paper describes our participation in the NTCIR-9 Cross-lingual Link Discovery (CrossLink) task. The task focuses on suggesting links between English Wikipedia and Chinese, Korean, and Japanese Wikipedia. In this event, we experimented with our method on the English-to-Chinese subtask. Our method divides the link discovery process into three steps. First, we use a frequency-based anchor tagger to find phrases or pieces of text that may be viable for linking to other pages in the source language (English in our case). Because there may be more than one page that an anchor can link to, we narrow down the candidate pages by contextual overlap similarity. Next, we extract key phrases from the remaining pages and translate those phrases into the target language using Google Translate. The translated phrases are then used as queries to retrieve Chinese articles indexed with Lucene. Finally, we utilize a ranking algorithm based on the pages' connection graph to sort the candidate articles. Our system achieved a MAP score of 0.225 when evaluated against the Wikipedia ground truth, and 0.205 with manual assessment.

  • WUST EN-CS Crosslink System at NTCIR-9 CLLD Task
    Maofu Liu, Le Kang, Shuang Yang and Hong Zhang
    [Pdf] [Table of Contents]

    This paper describes our work on the NTCIR-9 Cross-Lingual Link Discovery (Crosslink/CLLD) task. The work focuses on two aspects of the task: (1) how to collect useful data for Crosslink and (2) how to use the data correctly and effectively. The system first uses online data collection and text mining over Chinese Wikipedia articles to build a basic Crosslink database. These data and a two-way expansion algorithm are then applied to identify anchors and find the relevant corresponding target matches.

  • Automated Cross-lingual Link Discovery in Wikipedia
    Ling-Xiang Tang, Daniel Cavanagh, Andrew Trotman, Shlomo Geva, Yue Xu and Laurianne Sitbon
    [Pdf] [Table of Contents]

    At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach that mines the existing link structure for anchor probabilities and relies on "translation" using cross-lingual document name triangulation performs very well. The evaluation shows encouraging results for our system.

  • Multi-filtering Method Based Cross-lingual Link Discovery
    Yingfan Gao, Hongjiao Xu, Junsheng Zhang and Huilin Wang
    [Pdf] [Table of Contents]

    This paper describes the cross-lingual link discovery method of ISTIC used in the system evaluation task at NTCIR-9. In this year's evaluation, we participated in the cross-lingual link discovery task from English to Chinese. In this paper, we mainly describe our understanding of CLLD, the key techniques of our system, and the evaluation results.

  • Overview of the VisEx task at NTCIR-9
    Tsuneaki Kato, Mitsunori Matsushita and Hideo Joho
    [Pdf] [Table of Contents]

    Interactive Visual Exploration (VisEx) is a pilot task at NTCIR-9 for establishing an efficient and effective framework for objectively evaluating interactive and explorative information access environments. It aims to acquire more useful and richer evaluation data based on empirical user studies, by adopting a common framework for the environments and conducting sophisticated experiments. Four teams participated in this task. Although it was harder than expected to understand the results and draw a clear conclusion, we learned much and have made useful progress.

  • Grid-based Interaction for NTCIR-9 VisEx Task
    Hideo Joho and Tetsuya Sakai
    [Pdf] [Table of Contents]

    This paper presents a grid-based interaction model that is designed to encourage searchers to organize a complex search space by managing n × m subspaces. A search interface was developed based on the proposed interaction model, and its performance was evaluated in a user study carried out in the context of the NTCIR-9 VisEx task. This paper reports findings from the experiment and discusses future directions for research on the proposed model.

  • How Does a User Utilize Chart-based Interface to Conduct Exploratory Data Analysis?
    Kazuhiro Tanaka, Daiki Hasui and Mitsunori Matsushita
    [Pdf] [Table of Contents]

    This paper proposes an information retrieval system that utilizes a chart-based interface to support a user's exploratory data analysis. In such analysis, a user tends to examine data to support his/her hypothesis and to seek new perspectives on the data. In this process, a user usually first overviews the data from various viewpoints, compares multiple data sets to find differences and similarities among them, and then accesses the detailed information. Our proposed system aims to support this process by providing a chart-based interface that displays two charts side by side, permits the user to overlay the two charts to compare them, and then gives access to the detailed information. This paper presents a basic framework of the task, the requirements of a support system for the task, and the details of our proposed system. It also reports the results of the user study conducted in the VisEx task by analysing the obtained data, including log records and post-questionnaires.

  • Read Article Management in Document Search Process for NTCIR-9 VisEx Task
    Yasufumi Takama, Shunichi Hattori and Ryosuke Miyake
    [Pdf] [Table of Contents]

    This paper reports the results of taking part in the NTCIR-9 VisEx task. VisEx is a pilot task for establishing an evaluation framework for explorative information access environments. We took part in the event collection subtask, in which users write a report that summarizes events related to a given topic. In order to help users filter out unwanted articles from retrieved results, the developed system is equipped with facilities for managing the read/unread state of articles. Its effect on users performing tasks is evaluated through comparison with other systems, especially the baseline system.

  • Effects of the Variety of Document Retrieval Methods on Interactive Information Access - An Experiment in the NTCIR-9 VisEx Task -
    Tsuneaki Kato
    [Pdf] [Table of Contents]

    An experiment was conducted through participation in the NTCIR-9 VisEx task, examining how different document retrieval methods provided by an information access environment influence the results and process of interactive information access. The experiment compared the baseline system, which only provides a common keyword-based retrieval method, with our experimental UTLIS system, which provides, in addition to keyword-based retrieval, the narrowing-down of obtained documents by specifying related place names and publication dates, and similarity-based retrieval. A preliminary analysis of the data shows that in the VisEx task, these retrieval methods changed the process of interactive information access without significantly affecting the amount and quality of information obtained.

  • Overview of the Patent Machine Translation Task at the NTCIR-9 Workshop
    Isao Goto, Bin Lu, Ka Po Chow, Eiichiro Sumita and Benjamin K. Tsou
    [Pdf] [Table of Contents]

    This paper gives an overview of the Patent Machine Translation Task (PatentMT) at NTCIR-9 by describing the test collection, evaluation methods, and evaluation results. We organized three patent machine translation subtasks: Chinese to English, Japanese to English, and English to Japanese. For these subtasks, we provided large-scale test collections, including training data, development data and test data. In total, 21 research groups participated and 130 runs were submitted. We conducted human evaluations of adequacy and acceptability, as well as automatic evaluation.

  • BBNfs Systems for the Chinese-English Sub-task of the NTCIR-9 PatentMT Evaluation
    Jeff Ma and Spyros Matsoukas
    [Pdf] [Table of Contents]

    This paper describes the work we conducted to build a statistical machine translation (SMT) system for the Chinese-English sub-task of the NTCIR-9 patent machine translation (MT) evaluation. We first applied to patent data the various techniques that we had developed for improving SMT performance on other types of data. Our results show that most of these techniques work on patent document translation as well. Second, we made changes to our SMT system training to address the special characteristics of patent documents. These changes produced additional improvements.

  • NTT-UT Statistical Machine Translation in NTCIR-9 PatentMT
    Katsuhito Sudoh, Kevin Duh, Hajime Tsukada, Masaaki Nagata, Xianchao Wu, Takuya Matsuzaki and Jun'ichi Tsujii
    [Pdf] [Table of Contents]

    This paper describes the details of the NTT-UT system in the NTCIR-9 PatentMT task. One of its key technologies is system combination: the final translation hypotheses are chosen from the n-best lists of different SMT systems in a Minimum Bayes Risk (MBR) manner. Each SMT system incorporates a different technology: syntactic pre-ordering, forest-to-string translation, or the use of external resources for domain adaptation and target language modeling.
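
    A minimal sketch of MBR selection over merged n-best lists: pick the hypothesis with the highest expected gain against all hypotheses in the pool. A token-level F1 stands in here for the sentence-level BLEU-style gain an actual system would use.

        from collections import Counter

        def token_f1(a, b):
            """Crude stand-in similarity between two hypothesis strings."""
            ca, cb = Counter(a.split()), Counter(b.split())
            overlap = sum((ca & cb).values())
            if not overlap:
                return 0.0
            p, r = overlap / sum(cb.values()), overlap / sum(ca.values())
            return 2 * p * r / (p + r)

        def mbr_select(hypotheses, gain=token_f1):
            """hypotheses: merged n-best strings from several SMT systems."""
            return max(hypotheses,
                       key=lambda h: sum(gain(h, other) for other in hypotheses))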

  • The NiuTrans Machine Translation System for NTCIR-9 PatentMT
    Tong Xiao, Qiang Li, Qi Lu, Hao Zhang, Haibo Ding, Shujie Yao, Xiaoming Xu, Xiaoxu Fei, Jingbo Zhu, Feiliang Ren and Huizhen Wang
    [Pdf] [Table of Contents]

    This paper describes the NiuTrans system developed by the Natural Language Processing Lab at Northeastern University for the NTCIR-9 Patent Machine Translation task (NTCIR-9 PatentMT). We present our submissions to the two tracks of NTCIR-9 PatentMT and show several improvements to our phrase-based statistical MT engine, including a hybrid reordering model, large-scale language modeling, and a combination of statistical and example-based approaches for patent MT. In addition, we investigate the use of additional large-scale out-of-domain data to improve patent translation systems.

  • The RWTH Aachen System for NTCIR-9 PatentMT
    Minwei Feng, Christoph Schmidt, Joern Wuebker, Stephan Peitz, Markus Freitag and Hermann Ney
    [Pdf] [Table of Contents]

    This paper describes the statistical machine translation (SMT) systems developed by RWTH Aachen University for the Patent Translation task of the 9th NTCIR Workshop. Both phrase-based and hierarchical SMT systems were trained for the constrained Japanese-English and Chinese-English tasks. Experiments were conducted to compare different training data sets, training methods and optimization criteria as well as additional models for syntax and phrase reordering. Further, for the Chinese-English subtask we applied a system combination technique to create a consensus hypothesis from several different systems.

  • IBM Chinese-to-English PatentMT System for NTCIR-9
    Young-Suk Lee, Bing Xiang, Bing Zhao, Martin Franz, Salim Roukos and Yaser Al-Onaizan
    [Pdf] [Table of Contents]

    We describe the IBM statistical machine translation systems for the NTCIR-9 Chinese-to-English PatentMT evaluation. IBM's primary system combines the translation output of three distinct statistical machine translation systems (phrase-based, direct, and syntax-based) using language model re-scoring on confusion networks. The translation systems differ in their translation models and decoding techniques, while sharing the same pre-processing, word alignments, post-processing and language models. IBM's Chinese-to-English primary system achieved the second-highest BLEU score (36.11) among all scored primary systems.

  • Use of the Japio Technical Field Dictionaries for NTCIR-PatentMT
    Tadaaki Oshio, Tomoharu Mitsuhashi and Tsuyoshi Kakita
    [Pdf] [Table of Contents]

    Japio performs various patent-related translation businesses and owns an original patent-document-derived bilingual technical term database (the Japio Terminology Database) used by its translators. Currently the database contains more than 1,000,000 Japanese-English technical terms. The Japio Technical Field Dictionaries (technical-field-oriented machine translation dictionaries) are created from the Japio Terminology Database based on each entry's frequency in the bilingual patent document corpus compiled by Japio. Japio applied the Japio Technical Field Dictionaries to a commercial machine translation engine for NTCIR-9 PatentMT (JE and EJ subtasks).

  • LIUMfs Statistical Machine Translation System for the NTCIR Chinese/English PatentMT
    Holger Schwenk and Sadaf Abdul-Rauf
    [Pdf] [Table of Contents]

    This paper describes the development of a Chinese-English statistical machine translation system for the 2011 NTCIR patent translation task. We used phrase-based and hierarchical systems based on the Moses decoder, trained on the provided data only. Additional features include translation model adaptation using monolingual data and a continuous space language model. We report comparative results for these various configurations.

  • Machine translation system for patent documents combining rule-based translation and statistical post-editing applied to the PatentMT Task
    Terumasa Ehara
    [Pdf] [Table of Contents]

    In this article, we describe the system architecture, the preparation of training data, and the experimental results of the EIWA group in the NTCIR-9 Patent Translation Task. Our system combines rule-based machine translation and statistical post-editing. Experimental results for the Japanese-to-English (JE) subtask show a 0.3169 BLEU score, 7.8161 NIST score, 0.7404 RIBES score, 3.43 adequacy score and 0.6381 pairwise comparison score for acceptability. Experimental results for the Chinese-to-English (CE) subtask show a 0.2597 BLEU score, 7.2282 NIST score, 0.7455 RIBES score, and 3.05 adequacy score.

  • ZZX_MT: the BeiHang MT System for NTCIR-9 PatentMT Task
    WenHan Chao and Zhoujun Li
    [Pdf] [Table of Contents]

    In this paper, we describe the ZZX_MT machine translation system for the NTCIR-9 Patent Machine Translation Task. We participated in the Chinese-English translation subtask and submitted three results, corresponding to three different models or decoding algorithms. The first two are phrase-based SMT approaches that integrate the BTG constraint into their reordering models; the last is a hybrid system, an SMT system using an example-based decoder.

  • ISTIC Statistical Machine Translation System for Patent machine translation in NTCIR-9
    Yanqing He, Chongde Shi and Huilin Wang
    [Pdf] [Revised Paper Pdf 20111206] [Table of Contents]

    This paper describes the statistical machine translation system of ISTIC used in the evaluation campaign of the patent translation task at NTCIR-9. In this year's evaluation, we participated in the Chinese-English patent translation task. Here we mainly give an overview of the system and describe its primary modules, the key techniques, and the evaluation results.

  • System Description of BJTU-NLP SMT for NTCIR-9 PatentMT
    Junjie Jiang, Jinan Xu, Youfang Lin and Yujie Zhang
    [Pdf] [Table of Contents]

    This paper presents an overview of the statistical machine translation systems that BJTU-NLP developed for the NTCIR-9 Patent Machine Translation Task (NTCIR-9 PatentMT). We compared the performance of a phrase-based translation model and a factored translation model in our patent SMT systems for Chinese-to-English and English-to-Japanese. The factored translation model was proposed as an extension of the phrase-based statistical machine translation model and has been applied to many languages to good effect. However, in our experiments the factored translation model did not obtain a better BLEU score than the phrase-based translation model.

  • Learning of Linear Ordering Problems and its Application to J-E Patent Translation in NTCIR-9 PatentMT
    Shuhei Kondo, Mamoru Komachi, Yuji Matsumoto, Katsuhito Sudoh, Kevin Duh and Hajime Tsukada
    [Pdf] [Table of Contents]

    This paper describes the patent translation system submitted to the NTCIR-9 PatentMT task. We applied a reordering model based on the Linear Ordering Problem (LOP) to Japanese-to-English translation to deal with the substantial difference in word order between the two languages.
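
    In the LOP view of reordering, a learned model assigns a score to placing word i before word j, and decoding seeks the permutation that maximizes the total pairwise score. A brute-force toy version for intuition (realistic sentence lengths require approximate search):

        from itertools import permutations

        def best_order(n, pair_score):
            """pair_score(i, j): model score for placing word i before word j."""
            def total(order):
                return sum(pair_score(order[i], order[j])
                           for i in range(n) for j in range(i + 1, n))
            return max(permutations(range(n)), key=total)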

  • Statistical Machine Translation with Rule based Machine Translation
    Jin'ichi Murakami and Masato Tokuhisa
    [Pdf] [Table of Contents]

    We evaluated a two-stage machine translation (MT) system. The first stage is a state-of-the-art rule-based machine translation system, and the second stage is a standard statistical machine translation system. For Japanese-English machine translation, we first used a Japanese-English rule-based MT system to obtain "ENGLISH" sentences from Japanese sentences; we then applied standard statistical machine translation to translate "ENGLISH" into English. This method has the advantage of producing grammatically correct sentences. In experiments on the JE task, we obtained a BLEU score of 0.1996 with our proposed method, compared with 0.1436 for a standard method. For the EJ task, we obtained a BLEU score of 0.2775 with our proposed method, compared with 0.0831 for a standard method. This indicates that our proposed method was effective for the JE and EJ tasks. One problem remains, however: the BLEU score was not very effective for measuring translation quality.
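
    The pipeline reduces to two chained translators, where the second stage is in effect statistical post-editing trained on ("ENGLISH", English) sentence pairs. A schematic sketch with hypothetical wrappers around the two external systems:

        def two_stage_translate(ja_sentence, rule_based_mt, statistical_mt):
            rough_english = rule_based_mt(ja_sentence)  # "ENGLISH" intermediate output
            return statistical_mt(rough_english)        # SMT repairs the rough English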

  • POSTECHfs Statistical Machine Translation Systems for NTCIR-9 PatentMT Task (English-to-Japanese)
    Hwidong Na, Jin-Ji Li, Se-Jong Kim and Jong-Hyeok Lee
    [Pdf] [Table of Contents]

    We present a two-stage statistical machine translation (SMT) framework as proposed in Li et al. In the first stage, it resolves structural differences using a phrase-based SMT with syntax-aided preprocessing (SMT1). In the second stage, it resolves lexical differences using a phrase-based SMT (SMT2). For morpho-syntactically divergent language pairs such as English-Japanese, this framework strengthens the structural transfer of phrase-based SMT, whose capability for lexical transfer is already well established. Translation from a morphologically poor language (an isolating language) to a morphologically rich one (an agglutinative language) is more difficult than the converse. Our proposed approach fills morpho-syntactic gaps with the transferred syntactic roles, facilitating the generation of adequate case markers that appear only in the target language. In addition, we take into consideration word order differences between English and Japanese: our approach moves modality-bearing words to the end of a sentence, as Japanese is a verb-final language. Finally, as they are complementary, we combine the two above-mentioned approaches in a cascaded model to perform a more generalized structural transfer: the input sentences are syntactically reordered, the thematic divergences of the subject and object relations of the reordered sentences are then resolved, and vice versa (transfer and reorder).

  • EBMT System of KYOTO Team in PatentMT Task at NTCIR-9
    Toshiaki Nakazawa and Sadao Kurohashi
    [Pdf] [Table of Contents]

    This paper describes gKYOTOh EBMT system that attended PatentMT task at NTCIR-9. When translating very different language pairs such as Japanese-English and Chinese-English, it is very important to handle sentences in tree structures to overcome the difference. Some works incorporate tree structures in some parts of whole translation process, but not all the way from model training (parallel sentence alignment) to decoding. gKYOTOh system is a fully tree-based translation system where we use the Bayesian phrase alignment model on dependency trees and example-based translation.

  • Statistical Approaches to Patent Translation for PatentMT - Experiments with Various Settings of Training Data
    Yuen-Hsien Tseng, Chao-Lin Liu, Chia-Chi Tsai, Jui-Ping Wang, Yi-Hsuan Chuang and James Jeng
    [Pdf] [Table of Contents]

    This paper describes our experiments and results in the NTCIR-9 Chinese-to-English Patent Translation Task. A series of open-source software packages was integrated to build a statistical machine translation model for the task. Various Chinese segmentation schemes, additional resources, and training-corpus preprocessing steps were then tried on top of this model. In all, more than 20 experiments were conducted to compare translation performance. Our current results show that 1) consistent segmentation between the training and testing data is important for maintaining performance; 2) a sufficient number of good-quality bilingual training sentences is more helpful than additional bilingual dictionaries; and 3) the translation effectiveness in BLEU doubles as the number of bilingual training sentences doubles, at the level of 100,000 sentences.

  • SMT Systems in the University of Tokyo for NTCIR-9 PatentMT
    Xianchao Wu, Takuya Matsuzaki and Jun'ichi Tsujii
    [Pdf] [Table of Contents]

    In this paper, we present two statistical machine translation (SMT) systems and the evaluation results of Tsujii Laboratory at the University of Tokyo (UOTTS) for the NTCIR-9 patent machine translation task (PatentMT). This year, we participated in all three subtasks: bidirectional English-Japanese translation and Chinese-to-English translation. Our first system is a forest-to-string system that makes use of HPSG forests of the source English sentences; we used this system to translate English forests into Japanese. The second system is a re-implementation of a hierarchical phrase-based system, which we applied to all three subtasks. We describe the training and decoding processes of the two systems and report their translation accuracies on the official development/test sets.

  • The ICTfs Patent MT System Description for NTCIR-9
    Hao Xiong, Linfeng Song, FanDong Meng, Yajuan Lü and Qun Liu
    [Pdf] [Table of Contents]

    This paper introduces the ICTfs system for Patent Machine Translation at the NTCIR-9 Workshop. In this yearfs program, we participate all the three subtasks: Chinese-English, English-Japanese and Japanese-English. We submit six translation results for each subtask generated by an in-house implemented hierarchical phrase-based system (HPB) with four different variants, a widely used open source system (Moses) as well as a combinational system (SCM), respectively. We employ general translation model and concentrate on developing refined preprocessing and postprocessing techniques for patent translation. Besides that, we attempt to improve the quality of patent translation by chemical expression substitution, incorporating manually written templates, domain adaption and reranking, etc. Experimental results show that our small techniques achieve improvement over baseline, however, compared to other participants, our result is not excellent.

  • HPB SMT of FRDC Assisted by Paraphrasing for the NTCIR-9 PatentMT
    Zhongguang Zheng, Naisheng Ge, Yao Meng and Hao Yu
    [Pdf] [Table of Contents]

    This paper describes the FRDC machine translation system for the NTCIR-9 PatentMT. The FRDC system, JIANZHEN, is a hierarchical phrase-based (HPB) translation system. We participated in all three subtasks, i.e., Chinese to English, Japanese to English and English to Japanese. In this paper, we introduce a novel paraphrasing mechanism to handle a certain kind of Chinese sentence whose syntactic components are far apart. The paraphrasing approach, based on manual templates, moves far-separated syntactic components closer together so that the translation becomes more acceptable. In addition, we single out parentheses for special treatment in all three languages.

  • [PatentMT] Summary Report of Team III_CYUT_NTHU
    Joseph Chang, Ho-Ching Yen, Shih-Ting Huang, Ming-Jhuan Jiang, Chung-Chi Huang, Jason S. Chang and Ping-Che Yang
    [Pdf] [Table of Contents]

    In this report paper, we investigate two issues facing phrase-based machine translation (MT) systems such as Moses (Koehn et al., 2007): out-of-vocabulary (OOV) words and singletons. MT systems typically ignore unknown or OOV source words and copy them directly into the target translation. On the other hand, for words that do not couple with their preceding or following words as phrases, referred to as singletons, MT systems typically leave translation disambiguation to the language model, whose knowledge is somewhat limited and bounded by the preset context length. In this paper, we first analyze the proportion of OOV words and singletons in the translation task, summarize the types of OOV words, and manually evaluate the impact of singletons on phrase-based MT systems. We also introduce methods for dealing with these two issues without changing the underlying phrase-based decoder.
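
    A minimal sketch of the corpus analysis, under one plausible definition (an assumption, not necessarily the paper's exact one): OOV words are those absent from the model vocabulary, and singletons are known words that never occur inside any multi-word source phrase of the phrase table. The vocabulary and phrase-table inputs are assumed.

        def analyze(source_sentences, vocabulary, phrase_table_sources):
            """Return (OOV rate, singleton rate) over tokenized source sentences."""
            in_phrase = {w for src in phrase_table_sources
                         if len(src) > 1 for w in src}   # words seen inside phrases
            oov = singleton = total = 0
            for sent in source_sentences:
                for w in sent:
                    total += 1
                    if w not in vocabulary:
                        oov += 1                          # unknown to the model
                    elif w not in in_phrase:
                        singleton += 1                    # couples with no phrase
            return (oov / total, singleton / total) if total else (0.0, 0.0)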