Kageura, K., Koyama, T., Yoshioka, M., Takasu, A., Nozue, T., Tsuji, K.. NACSIS corpus project for IR and terminological research. In Natural Language Processing Pacific Rim Symposium 1997, pp. 493-496, Phuket, Thailand, Dec. 1997.
The paper introduces the corpus construction project currently under way at NACSIS, which aims to provide a common testbed for terminological and IR research. It describes the motivation and background, the linguistic specifications, and some technical problems of the corpus.
Kando, N., Koyama, T., Oyama, K., Kageura, K., Yoshioka, M., Nozue, T., Matsumura, A., Kuriyama, K.. NTCIR : NACSIS test collection project. In [Poster] the 20th Annual Colloquium of the British Computer Society Information Retrieval Specialist Group, Autrans, France, Mar. 1998.
In this poster we introduce the project to construct NTCIR (NACSIS Test Collection for evaluation of Information Retrieval systems), currently carried out at the National Center for Science Information Systems (NACSIS), Japan. We first introduce the background and aims of the project, and then give the specification of the test collection. Finally, we discuss some technical problems concerning the construction of the collection.
Koyama, T., Yoshioka, M., Kageura, K.. The construction of a lexically motivated corpus --- the problem with defining lexical units. In First International Conference on Language Resources and Evaluation, pp. 1015-1019, Granada, Spain, May 1998.
We are currently constructing a special language corpus, with particular emphasis on such lexically oriented studies and applications as terminology and information retrieval. In the paper, we focus on the problem with defining the lexical units in running texts, which is essential for the construction of a lexically motivated corpus. Although we focus on Japanese in this paper, the problem discussed here may be relevant to any other language.
Kageura, K., Yoshioka, M., Koyama, T., Nozue, T., Tsuji, K.. Towards a common testbed for corpus-based computational terminology. In COMPUTERM'98, pp. 81-85, Montreal, Canada, Aug. 1998.
We are currently constructing a textual and terminological corpus with special emphasis on its possible use in the corpus-based study of terminology. In this paper we discuss the basic background, motivation and aim of the project, starting with an examination of the current state of the art in automatic term recognition. The basic desiderata of a common testbed are then clarified, followed by a brief introduction to the basic features of the NACSIS corpus, which is intended to offer a common testbed for terminological research.
Nozue, T., Kando, N., Kuriyama, K.. Reconsideration of the concept of relevance for NACSIS test collection (in Japanese). In Proceedings of the 46th Annual Conference of Japan Society of Library and Information Science, pp. 67-70, Nov. 1998.
Kando, N.. Invited Talk: On evaluation of information retrieval systems: Aspects of test collections and competitions (in Japanese). In Informatics Symposium'99, Jan. 1999.
Kando, N., Kuriyama, K., Nozue, T., Oyama, K.. NTCIR-1 (NACSIS Test Collection for Information Retrieval system-1): Its policy and practice (in Japanese). In the Special Interest Group Notes of Information Processing Society of Japan, No. 99-FI-53, pp. 33-40, Mar. 1999.
This paper outlines the NTCIR (NACSIS Test Collection for Information Retrieval systems) project, its Test Collection 1 (NTCIR-1), and the competition-style workshop that uses NTCIR-1. Based on previous discussion about the data used for laboratory testing of information retrieval, we discuss the fundamental policies of the project. NTCIR-1 contains ca. 330,000 documents, more than half of which are Japanese-English pairs. A search topic contains a detailed narrative including term definitions, relevance judgment criteria, the purpose of the search, and background knowledge, which are thought to be helpful for relevance assessment. The Workshop, which started in November 1998, attracted thirty-one participating groups. The exhaustivity of the relevance assessment for the training topics is reported. Pooling by interactive searches was effective for particular topics and found 17.5% of unique relevant documents. The internal pooling worked well and covered 97% of all relevant documents.
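The abstract above reports how much of the relevant material the pools covered, but does not spell out the pooling procedure. As a rough illustration only, the following sketch shows generic depth-k pooling and a coverage measure; the function names, run names, document IDs, and pool depth are all hypothetical and are not taken from the NTCIR-1 setup.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Union of the top-`depth` documents from every ranked run."""
    pool: Set[str] = set()
    for ranked in runs.values():
        pool.update(ranked[:depth])
    return pool


def pool_coverage(pool: Set[str], relevant: Set[str]) -> float:
    """Fraction of the known relevant documents that appear in the pool."""
    if not relevant:
        return 0.0
    return len(pool & relevant) / len(relevant)


# Hypothetical ranked lists for a single topic (illustrative data only).
runs = {
    "run_a": ["d1", "d2", "d3", "d4"],
    "run_b": ["d2", "d5", "d1", "d6"],
}
pool = build_pool(runs, depth=2)
relevant = {"d1", "d5", "d6"}  # assumed complete judgments for the toy topic
coverage = pool_coverage(pool, relevant)
print(sorted(pool), round(coverage, 3))
```

In this toy case the pool misses one relevant document (d6) because it sits below the pool depth in the only run that retrieved it, which is the kind of gap that additional pooling sources are meant to reduce.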
Yoshioka, M., Okada, M., Kageura, K., Koyama, T.. Construction of terminologically-motivated corpus (in Japanese). In the Special Interest Group Notes of Information Processing Society of Japan, No. 98-FI-53, pp. 41-48, Mar. 1999.
Nozue, T., Kando, N.. Primary considerations in the concept of relevance: Relevance judgement of NTCIR (in Japanese). In the Special Interest Group Notes of Information Processing Society of Japan, No. 99-FI-53, pp. 49-56, Mar. 1999.
Kando, N. (organized). New perspective of IR: From test collection to search engine (in Japanese). Open Panel at the 58th Annual Meeting of the Information Processing Society of Japan, Mar. 1999.
Kando, N., Kuriyama, K., Nozue, T.. NACSIS test collection workshop (NTCIR-1). In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, CA, USA, Aug. 1999.
Kando, N., Kuriyama, K., Nozue, T., Eguchi, K., Kato, H., Hidaka, S.. Overview of IR tasks at the first NTCIR workshop. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 11-44, Tokyo, Japan, Aug. 1999.
Kando, N.. Cross linguistic scholarly information transfer and database services in Japan. In 1997 Annual Meeting of the American Society for Information Science, Panel on Multilingual Databases, Washington D.C., USA, Nov. 1997.
Kuriyama, K.. Query Expansion using Thesauri (in Japanese). In the Special Interest Group Notes of Information Processing Society of Japan, No. 98-FI-52, pp. 1-8, Jan. 1998.
Kando, N., Kageura, K., Yoshioka, M., Oyama, K.. Phrase processing methods for Japanese text retrieval. In ACM-SIGIR '98 Workshop on "Information Retrieval: Theory into Practice", Melbourne, Australia, Aug. 1998.
This paper examines the effectiveness of different phrase identification and weighting methods for Japanese text retrieval in an operational information retrieval (IR) system, called NACSIS-IR. Based on our previous experiments, we used character-based indexing with positional information and word- or phrase-based query processing, which allowed us to implement sophisticated linguistic analysis on large-scale databases while maintaining adequate efficiency. The results of retrieval experiments on a large-scale Japanese test collection showed that the combination of enhanced phrase identification using patterns defined over part-of-speech tags and our algorithm Weight5 made a significant positive contribution to retrieval effectiveness. The paper also discusses indexing and phrase processing of Japanese or East Asian languages.
Keywords: Phrase processing, Japanese text retrieval, character-based indexing, word-based query segmentation
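The abstract above combines character-based indexing with positional information and word- or phrase-based query processing. The sketch below is only a minimal illustration of that general idea, not the NACSIS-IR implementation: every character is indexed with its position, and a query word (assumed here to be already segmented) matches a document when its characters occur at consecutive positions. All names and sample documents are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Posting lists map a character to (doc_id, position) pairs.
Index = Dict[str, List[Tuple[int, int]]]


def build_char_index(docs: Dict[int, str]) -> Index:
    """Index every character of every document together with its position."""
    index: Index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, ch in enumerate(text):
            index[ch].append((doc_id, pos))
    return index


def find_term(index: Index, term: str) -> Set[int]:
    """Return documents in which the characters of `term` appear consecutively."""
    if not term:
        return set()
    # Candidate start positions are the occurrences of the first character.
    candidates = set(index.get(term[0], []))
    for offset, ch in enumerate(term[1:], start=1):
        postings = set(index.get(ch, []))
        candidates = {(d, p) for (d, p) in candidates if (d, p + offset) in postings}
        if not candidates:
            break
    return {d for d, _ in candidates}


# Toy documents (illustrative only).
docs = {1: "情報検索システム", 2: "検索エンジンの情報", 3: "日本語テキスト"}
index = build_char_index(docs)
hits = find_term(index, "情報検索")
print(hits)
```

Because matching is done at query time from character positions, the index itself needs no word segmentation; only the query has to be split into words or phrases, which in practice would come from a morphological analyzer rather than being supplied by hand as assumed here.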
Aizawa, A., Kando, N., Kageura, K.. A graph-based method for automatic generation of multilingual keyword clusters and its applications. In International Joint Digital Library Workshop '98, Bangkok, Thailand, Sep. 1998.
Kando, N., Aizawa, A.. Cross-lingual information retrieval using automatically generated multilingual keyword clusters. In the Proceedings of 3rd International Workshop on Information Retrieval with Asian Languages, pp. 86-94, Singapore, Singapore, Oct. 1998.
We propose an approach to cross-lingual information retrieval (CLIR) that uses automatically generated multilingual keyword clusters based on graph theory, and show its effectiveness in IR and CLIR. The graph-theoretic method has the advantage that low-frequency keywords can be retained for later use in IR. The results of retrieval against NACSIS Test Collection 1 showed that query expansion using the clusters improved search effectiveness in monolingual retrieval by 13.2% and 14.2% at the levels of "Relevant" and "Partially Relevant", respectively. The search effectiveness of CLIR reached 52.4% and 65.7% of the results for monolingual retrieval at the levels of "Relevant" and "Partially Relevant", respectively, without any manual interaction during retrieval. Future studies are also discussed.
Keywords: Cross-lingual information retrieval, English and Japanese, bilingual keyword clusters, graph theory, Japanese text retrieval, character-based indexing, word- and phrase-based query segmentation
Kando, N., Aizawa, A., Tsuji, K., Kageura, K., Kuriyama, K.. Cross-language information retrieval and automatic construction of multilingual lexicons. In 1998 Annual Meeting of the American Society for Information Science, Pittsburgh, USA, Oct. 1998.
Kanazawa, T.. An information retrieval method using a relevance-based superimposition model (in Japanese). Master's thesis, University of Tokyo, Feb. 1999.
Kanazawa, T., Takasu, A., Adachi, J.. Document retrieval with relevance-based superimposition model (in Japanese). In the 10th Data Engineering Workshop, the Institute of Electronics, Information and Communication Engineers, No. 5B-5, Mar. 1999.
Fujii, A., Ishikawa, T.. Quest: A cross-language information retrieval system (in Japanese). In Proceedings of the Fifth Annual Meeting of the Association for Natural Language Processing, pp. 353-356, Mar. 1999.
Tsuji, K., Yoshikane, F., Kageura, K.. Acquisition of translational pairs from parallel corpora using dictionary (in Japanese). In Proceedings of the Fifth Annual Meeting of the Association for Natural Language Processing, pp. 402-405, Mar. 1999.
Fujii, A., Ishikawa, T.. Cross-language information retrieval for technical documents. In Proceedings of the Joint ACL SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp. 29-37, College Park, MD, USA, Jun. 1999.
Kanazawa, T., Takasu, A., Adachi, J.. Query expansion with the relevance-based superimposition model (in Japanese). In Proceedings of the Database Workshop 1999, No. 5C-3, Okinawa, Japan, July 1999.
Ogawa, Y.. NTCIR advisor report (in Japanese). In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 45-46, Tokyo, Japan, Aug. 1999.
Chen, A., Gey, F. C., Kishida, K., Jiang, H., Liang, Q.. Comparing multiple methods for Japanese and Japanese-English text retrieval. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 49-58, Tokyo, Japan, Aug. 1999.
Murata, M., Uchimoto, K., Ozaku, H., Isahara, H.. Information retrieval based on stochastic models. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 59-70, Tokyo, Japan, Aug. 1999.
Sato, M., Ito, H., Noguchi, N.. NTCIR experiments at Matsushita: Ad-hoc and CLIR task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 71-81, Tokyo, Japan, Aug. 1999.
Kanazawa, T.. R2D2 at NTCIR: Using the relevance-based superimposition model. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 83-88, Tokyo, Japan, Aug. 1999.
Ozawa, T., Yamamoto, M., Umemura, K., Church, K. W.. Japanese word segmentation using similarity measure for IR. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 89-96, Tokyo, Japan, Aug. 1999.
Vines, P., Wilkinson, R.. Experiments with Japanese information retrieval using MG. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 97-100, Tokyo, Japan, Aug. 1999.
Fujita, S.. Notes on phrasal indexing: JSCB evaluation experiments at NTCIR AD HOC. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 101-108, Tokyo, Japan, Aug. 1999.
Ohira, S., Shirai, K.. Proposal and evaluation of significant words selection method based on AIC. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 109-116, Tokyo, Japan, Aug. 1999.
Matsumura, A., Takasu, A., Adachi, J.. Structured index system at NTCIR1: Information retrieval using dependency relationship between words. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 117-122, Tokyo, Japan, Aug. 1999.
Niwa, Y., Iwayama, M., Hisamitsu, T., Nishioka, S., Takano, A., Sakurai, H., Imaichi, O.. Interactive document search with DualNAVI. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 123-130, Tokyo, Japan, Aug. 1999.
Sawada, R., Umemura, K.. Dynamic programming: A new paradigm for information retrieval. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 131-136, Tokyo, Japan, Aug. 1999.
Sakai, T., Shibazaki, Y., Suzuki, M., Kajiura, M., Manabe, T., Sumita, K.. Cross-language information retrieval for NTCIR at Toshiba. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 137-144, Tokyo, Japan, Aug. 1999.
Lin, C., Lin, W., Bian, G., Chen, H.. Description of the NTU Japanese-English cross-lingual information retrieval system used for NTCIR workshop. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 145-148, Tokyo, Japan, Aug. 1999.
Nakazawa, S., Ochiai, T., Satoh, K., Okumura, A.. Cross language information retrieval based on comparable corpora. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 149-155, Tokyo, Japan, Aug. 1999.
Oard, D. W., Wang, J.. NTCIR CLIR experiments at the University of Maryland. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 157-161, Tokyo, Japan, Aug. 1999.
Fujii, A., Ishikawa, T.. Cross-language information retrieval at ULIS. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 163-169, Tokyo, Japan, Aug. 1999.
Mizobuchi, S., Lee, S., Kawano, F., Kobayashi, T., Komatsu, T., Aoe, J.. Multi-lingual multi-media information retrieval system. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 171-178, Tokyo, Japan, Aug. 1999.
Fukushima, T., Akamine, S.. A character-based indexing and word-based ranking method for Japanese text retrieval. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 179-182, Tokyo, Japan, Aug. 1999.
Umemoto, H., Kuramochi, T., Ishitobi, Y., Tateno, M.. Development of a related document retrieval system and evaluation of the system using NTCIR-1. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 183-185, Tokyo, Japan, Aug. 1999.
Kato, T., Shimada, S., Kumamoto, M., Matsuzawa, K.. Idea-deriving information retrieval system. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 187-193, Tokyo, Japan, Aug. 1999.
Kameda, H., Oomori, N., Kubomura, C., Tanifuji, Y.. An advanced system for information retrieval via key concepts. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 195-201, Tokyo, Japan, Aug. 1999.
Kageura, K., Yoshioka, M., Takeuchi, K., Koyama, T., Tsuji, K., Yoshikane, F., Okada, M.. Overview of TMREC tasks. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 415-416, Tokyo, Japan, Aug. 1999.
Kageura, K., Yoshioka, M., Tsuji, K., Yoshikane, F., Takeuchi, K., Koyama, T.. Evaluation of the term recognition task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 417-434, Tokyo, Japan, Aug. 1999.
Takeuchi, K., Yoshioka, M., Koyama, T., Kageura, K.. Evaluation of the keyword extraction task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 435-436, Tokyo, Japan, Aug. 1999.
Koyama, T., Yoshioka, M., Takeuchi, K., Kageura, K.. Evaluation of the role analysis task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 437-439, Tokyo, Japan, Aug. 1999.
Uchimoto, K., Sekine, S., Murata, M., Ozaku, H., Isahara, H.. Term recognition by using different field corpora. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 443-450, Tokyo, Japan, Aug. 1999.
Nakagawa, H.. Compound noun based system for automatic term recognition task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 451-458, Tokyo, Japan, Aug. 1999.
Morimoto, T., Maeshiro, T., Fujiwara, Y.. Extraction of semantic relationships among terms to construct organized knowledge resources. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 459-465, Tokyo, Japan, Aug. 1999.
Fukushige, Y., Noguchi, N.. NTCIR experiments at Matsushita: TMREC task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 467-474, Tokyo, Japan, Aug. 1999.
Hisamitsu, T., Niwa, Y., Nishioka, S., Sakurai, H., Imaichi, O., Iwayama, M., Takano, A.. Term extraction using a new measure of term representativeness. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 475-481, Tokyo, Japan, Aug. 1999.
Rorvig, M., Smith, M. M., Uemura, A.. The n-gram hypothesis applied to matched sets of visualized Japanese-English technical documents. In Proceedings of the 62nd Annual Meeting of the American Society for Information Science, Washington DC, USA, Nov. 1999.