o Presentations about NTCIR
o Research using NTCIR-1 (test version)
o Research using NTCIR-1 (preliminary version)

o Kageura, K., Koyama, T., Yoshioka, M., Takasu, A., Nozue, T., Tsuji, K. NACSIS corpus project for IR and terminological research. In Natural Language Processing Pacific Rim Symposium 1997, pp. 493-496, Phuket, Thailand, Dec. 1997.

The paper introduces the corpus construction project currently being carried out at NACSIS, which aims to provide a common testbed for terminological and IR research. The paper describes the motivation and background, the linguistic specifications, and some technical problems of the corpus.

o Kando, N., Koyama, T., Oyama, K., Kageura, K., Yoshioka, M., Nozue, T., Matsumura, A., Kuriyama, K. NTCIR: NACSIS test collection project. In [Poster] the 20th Annual Colloquium of the British Computer Society Information Retrieval Specialist Group, Autrans, France, Mar. 1998.

In this poster we introduce the project to construct NTCIR (NACSIS Test Collection for evaluation of Information Retrieval systems), currently being carried out at the National Center for Science Information Systems (NACSIS), Japan. We first introduce the background and aims of the project, then present the specification of the test collection, and finally discuss some technical problems concerning its construction.

o Koyama, T., Yoshioka, M., Kageura, K. The construction of a lexically motivated corpus --- the problem with defining lexical units. In First International Conference on Language Resources and Evaluation, pp. 1015-1019, Granada, Spain, May 1998.

We are currently constructing a special language corpus, with particular emphasis on such lexically oriented studies and applications as terminology and information retrieval. In the paper, we focus on the problem with defining the lexical units in running texts, which is essential for the construction of a lexically motivated corpus. Although we focus on Japanese in this paper, the problem discussed here may be relevant to any other language.

o Kageura, K., Yoshioka, M., Koyama, T., Nozue, T., Tsuji, K. Towards a common testbed for corpus-based computational terminology. In COMPUTERM'98, pp. 81-85, Montreal, Canada, Aug. 1998.

We are currently constructing a textual and terminological corpus with special emphasis on its possible use in the corpus-based study of terminology. In this paper we discuss the basic background, motivation, and aim of the project, starting with an examination of the current state of the art in automatic term recognition. The basic desiderata of a common testbed are then clarified, followed by a brief introduction to the basic features of the NACSIS corpus, which is intended to offer a common testbed for terminological research.

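o Nozue, T., Kando, N., Kuriyama, K. Rethinking the concept of relevance: A tentative discussion for the NACSIS test collection (in Japanese). In Abstracts of the 46th Annual Conference of the Japan Society of Library and Information Science, pp. 67-70, Nov. 1998.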

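o Kando, N. Invited talk: On the evaluation of information retrieval systems, with a focus on test collections and competitions (in Japanese). In Informatics Symposium 1999, Jan. 1999.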

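Evaluation is an important issue in every field of science and technology. Since the earliest days of information retrieval research, the evaluation of information retrieval systems has used recall and precision as evaluation measures, together with test collections built for evaluation experiments. Standard test collections shared across the research community have contributed to the progress of information retrieval research by making retrieval experiments easier to carry out and by promoting both the comparison of systems and the standardization of evaluation. At the same time, the validity and usefulness of test collections have been debated repeatedly. In recent years, interest in information retrieval technology has grown in Japan as well, and efforts to build standard Japanese test collections are under way. This paper therefore surveys the major laboratory-type evaluation experiments in information retrieval, with a focus on test collections, and introduces BMIR, IREX, and NTCIR (the NACSIS test collection) as test-collection-related efforts in Japan. Finally, it looks ahead to open issues in the evaluation of information retrieval systems.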

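o Kando, N., Kuriyama, K., Nozue, T., Oyama, K. NTCIR-1: Policy and practice in constructing a test collection for the evaluation of information retrieval systems (in Japanese). IPSJ SIG Technical Report, No. 99-FI-53, pp. 33-40, Mar. 1999.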

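This paper reports on the test collection project for the evaluation of Japanese information retrieval systems (NTCIR), on Test Collection 1 (NTCIR-1), which is currently under construction, and on the competition-style workshop that uses its data. Based on the discussion surrounding test collections, we set out the basic policy of NTCIR. NTCIR-1 contains about 330,000 documents to be searched, more than half of which are Japanese-English paired translations. Each search topic includes a detailed statement of the search request, covering the judgment criteria, the purpose of the search, and background knowledge. The workshop runs from November 1998 to September 1999, with 31 teams from Japan and abroad participating. When we used the results of the preliminary test to assess the exhaustiveness of the relevance judgments for the training topics, pooling by interactive search proved especially effective for particular topics, finding 17.5% unique relevant documents. Internal pooling covered 97% of all relevant documents and was generally satisfactory.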

o Yoshioka, M., Okada, M., Kageura, K., Koyama, T. Constructing a corpus designed for term extraction and analysis (in Japanese). IPSJ SIG Technical Report, No. 98-FI-53, pp. 41-48, Mar. 1999.

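As part of the NACSIS test collection, the National Center for Science Information Systems (NACSIS) is building the NACSIS corpus, which is designed with particular attention to term extraction and analysis. Because lexical processing is central to the processing of technical terms, the corpus must take it into account, and clarifying the lexical units becomes an important issue. This paper therefore describes the design guidelines of the NACSIS corpus that address these problems. It also reports on the current state of corpus construction and on its public release.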

o Nozue, T., Kando, N. A consideration of relevance: As background to NTCIR (in Japanese). IPSJ SIG Technical Report, No. 99-FI-53, pp. 49-56, Mar. 1999.

o Kando, N. (organizer). Open Panel 4 at the 58th IPSJ National Convention: New developments in information retrieval, from test collections to search engines (in Japanese), Mar. 1999. [slides]

o Kando, N., Kuriyama, K., Nozue, T. NACSIS test collection workshop (NTCIR-1). In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, CA, USA, Aug. 1999.

o Kando, N., Kuriyama, K., Nozue, T., Eguchi, K., Kato, H., Hidaka, S. Overview of IR tasks at the first NTCIR workshop. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 11-44, Tokyo, Japan, Aug. 1999.

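o Yoshioka, M., Haraguchi, M., Okubo, Y. A study of an information retrieval system based on relevance-oriented generalization (1st report): Using the contribution of search terms to relevance judgments (in Japanese). IPSJ Special Interest Group on Fundamental Infology, 2002-FI-67, pp. 151-158, Tokyo, Japan, May 21-22, 2002.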


o Kando, N. Cross linguistic scholarly information transfer and database services in Japan. In 1997 Annual Meeting of the American Society for Information Science, Panel on Multilingual Databases, Washington D.C., USA, Nov. 1997.

o Kuriyama, K. Evaluation of query expansion using a thesaurus (in Japanese). IPSJ SIG Technical Report, No. 98-FI-52, pp. 1-8, Jan. 1998.

o Kando, N., Kageura, K., Yoshioka, M., Oyama, K. Phrase processing methods for Japanese text retrieval. In ACM-SIGIR '98 Workshop on "Information Retrieval: Theory into Practice", Melbourne, Australia, Aug. 1998.

This paper examines the effectiveness of different phrase identification and weighting methods for Japanese text retrieval in an operational information retrieval (IR) system, called NACSIS-IR. Based on our previous experiments, we used character-based indexing with positional information and word- or phrase-based query processing, which allowed us to implement sophisticated linguistic analysis on large-scale databases while maintaining adequate efficiency. The results of retrieval experiments on a large-scale Japanese test collection showed that the combination of enhanced phrase identification using patterns defined over part-of-speech tags and our algorithm Weight5 made a significant positive contribution to retrieval effectiveness. The paper also discusses indexing and phrase processing of Japanese or East Asian languages.
Keywords: Phrase processing, Japanese text retrieval, character-based indexing, word-based query segmentation

o Aizawa, A., Kando, N., Kageura, K. A graph-based method for automatic generation of multilingual keyword clusters and its applications. In International Joint Digital Library Workshop '98, Bangkok, Thailand, Sep. 1998.

o Kando, N., Aizawa, A. Cross-lingual information retrieval using automatically generated multilingual keyword clusters. In Proceedings of the 3rd International Workshop on Information Retrieval with Asian Languages, pp. 86-94, Singapore, Singapore, Oct. 1998.

We propose an approach to cross-lingual information retrieval (CLIR) that uses automatically generated multilingual keyword clusters based on graph theory, and show its effectiveness in IR and CLIR. The graph-theoretic method has the advantage that low-frequency keywords can be retained for later use in IR. The results of retrieval against NACSIS Test Collection 1 showed that query expansion using the clusters improved search effectiveness in monolingual retrieval by 13.2% and 14.2% at the "Relevant" and "Partially Relevant" levels, respectively. The search effectiveness of CLIR reached 52.4% and 65.7% of the monolingual results at the "Relevant" and "Partially Relevant" levels, respectively, without any manual interaction during retrieval. Future studies are also discussed.
Keywords: Cross-lingual information retrieval, English and Japanese, bilingual keyword clusters, graph theory, Japanese text retrieval, character-based indexing, word- and phrase-based query segmentation

o Kando, N., Aizawa, A., Tsuji, K., Kageura, K., Kuriyama, K. Cross-language information retrieval and automatic construction of multilingual lexicons. In 1998 Annual Meeting of the American Society for Information Science, Pittsburgh, USA, Oct. 1998.

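o Kanazawa, T. A study of information retrieval methods using the relevance-based superimposition model (in Japanese). Master's thesis, University of Tokyo, Feb. 1999.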

o Kanazawa, T., Takasu, A., Adachi, J. Document retrieval using the relevance-based superimposition model (in Japanese). In IEICE 10th Data Engineering Workshop, No. 5B-5, Mar. 1999.

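o Ozawa, T., Yamamoto, M., Yamamoto, E., Umemura, K. Word segmentation of search request sentences using a similarity measure for information retrieval (in Japanese). In Proceedings of the 5th Annual Meeting of the Association for Natural Language Processing, pp. 305-308, Mar. 1999.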

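o Fujii, A., Ishikawa, T. The cross-language retrieval system Quest (in Japanese). In Proceedings of the 5th Annual Meeting of the Association for Natural Language Processing, pp. 353-356, Mar. 1999.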

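o Tsuji, K., Yoshikane, F., Kageura, K. On the use of dictionary information in extracting translation pairs from bilingual corpora (in Japanese). In Proceedings of the 5th Annual Meeting of the Association for Natural Language Processing, pp. 402-405, Mar. 1999.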

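Using Japanese-English abstracts in the field of artificial intelligence, we first confirmed that translation pairs of the word-formation elements of technical terms (hereafter "translation element pairs") often have, in their vicinity in the corpus, translation element pairs listed in a technical term dictionary. Next, on the same abstracts, we compared the results of extracting translation element pairs by statistical measures alone with the results of restricting the candidates to pairs that have a dictionary-listed translation element pair nearby, and confirmed that this restriction raises precision without much loss of recall.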

o Fujii, A., Ishikawa, T. Cross-language information retrieval for technical documents. In Proceedings of the Joint ACL SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp. 29-37, College Park, MD, USA, Jun. 1999.

o Kanazawa, T., Takasu, A., Adachi, J. Query expansion based on the relevance-based superimposition model (in Japanese). In Summer Database Workshop 1999, No. 5C-3, Okinawa, Japan, Jul. 1999.

o Kando, N., Kuriyama, K., Nozue, T., Eguchi, K., Kato, H., Hidaka, S. Overview of IR tasks at the first NTCIR workshop. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 11-44, Tokyo, Japan, Aug. 1999.

o Ogawa, Y. NTCIR advisor report (in Japanese). In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 45-46, Tokyo, Japan, Aug. 1999.

o Chen, A., Gey, F. C., Kishida, K., Jiang, H., Liang, Q. Comparing multiple methods for Japanese and Japanese-English text retrieval. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 49-58, Tokyo, Japan, Aug. 1999.

o Murata, M., Uchimoto, K., Ozaku, H., Isahara, H. Information retrieval based on stochastic models. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 59-70, Tokyo, Japan, Aug. 1999.

o Sato, M., Ito, H., Noguchi, N. NTCIR experiments at Matsushita: Ad-hoc and CLIR task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 71-81, Tokyo, Japan, Aug. 1999.

o Kanazawa, T. R2D2 at NTCIR: Using the relevance-based superimposition model. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 83-88, Tokyo, Japan, Aug. 1999.

o Ozawa, T., Yamamoto, M., Umemura, K., Church, K. W. Japanese word segmentation using similarity measure for IR. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 89-96, Tokyo, Japan, Aug. 1999.

o Vines, P., Wilkinson, R. Experiments with Japanese information retrieval using mg. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 97-100, Tokyo, Japan, Aug. 1999.

o Fujita, S. Notes on phrasal indexing: JSCB evaluation experiments at NTCIR AD HOC. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 101-108, Tokyo, Japan, Aug. 1999.

o Ohira, S., Shirai, K. Proposal and evaluation of significant words selection method based on AIC. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 109-116, Tokyo, Japan, Aug. 1999.

o Matsumura, A., Takasu, A., Adachi, J. Structured index system at NTCIR1: Information retrieval using dependency relationship between words. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 117-122, Tokyo, Japan, Aug. 1999.

o Niwa, Y., Iwayama, M., Hisamitsu, T., Nishioka, S., Takano, A., Sakurai, H., Imaichi, O. Interactive document search with DualNAVI. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 123-130, Tokyo, Japan, Aug. 1999.

o Sawada, R., Umemura, K. Dynamic programming: A new paradigm for information retrieval. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 131-136, Tokyo, Japan, Aug. 1999.

o Sakai, T., Shibazaki, Y., Suzuki, M., Kajiura, M., Manabe, T., Sumita, K. Cross-language information retrieval for NTCIR at Toshiba. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 137-144, Tokyo, Japan, Aug. 1999.

o Lin, C., Lin, W., Bian, G., Chen, H. Description of the NTU Japanese-English cross-lingual information retrieval system used for NTCIR workshop. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 145-148, Tokyo, Japan, Aug. 1999.

o Nakazawa, S., Ochiai, T., Satoh, K., Okumura, A. Cross language information retrieval based on comparable corpora. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 149-155, Tokyo, Japan, Aug. 1999.

o Oard, D. W., Wang, J. NTCIR CLIR experiments at the University of Maryland. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 157-161, Tokyo, Japan, Aug. 1999.

o Fujii, A., Ishikawa, T. Cross-language information retrieval at ULIS. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 163-169, Tokyo, Japan, Aug. 1999.

o Mizobuchi, S., Lee, S., Kawano, F., Kobayashi, T., Komatsu, T., Aoe, J. Multi-lingual multi-media information retrieval system. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 171-178, Tokyo, Japan, Aug. 1999.

o Fukushima, T., Akamine, S. A character-based indexing and word-based ranking method for Japanese text retrieval. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 179-182, Tokyo, Japan, Aug. 1999.

o Umemoto, H., Kuramochi, T., Ishitobi, Y., Tateno, M. Development of a related document retrieval system and evaluation of the system using NTCIR-1. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 183-185, Tokyo, Japan, Aug. 1999.

o Kato, T., Shimada, S., Kumamoto, M., Matsuzawa, K. Idea-deriving information retrieval system. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 187-193, Tokyo, Japan, Aug. 1999.

o Kameda, H., Oomori, N., Kubomura, C., Tanifuji, Y. An advanced system for information retrieval via key concepts. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 195-201, Tokyo, Japan, Aug. 1999.

o Kageura, K., Yoshioka, M., Takeuchi, K., Koyama, T., Tsuji, K., Yoshikane, F., Okada, M. Overview of TMREC tasks. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 415-416, Tokyo, Japan, Aug. 1999.

o Kageura, K., Yoshioka, M., Tsuji, K., Yoshikane, F., Takeuchi, K., Koyama, T. Evaluation of the term recognition task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 417-434, Tokyo, Japan, Aug. 1999.

o Takeuchi, K., Yoshioka, M., Koyama, T., Kageura, K. Evaluation of the keyword extraction task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 435-436, Tokyo, Japan, Aug. 1999.

o Koyama, T., Yoshioka, M., Takeuchi, K., Kageura, K. Evaluation of the role analysis task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 437-439, Tokyo, Japan, Aug. 1999.

o Uchimoto, K., Sekine, S., Murata, M., Ozaku, H., Isahara, H. Term recognition by using different field corpora. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 443-450, Tokyo, Japan, Aug. 1999.

o Nakagawa, H. Compound noun based system for automatic term recognition task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 451-458, Tokyo, Japan, Aug. 1999.

o Morimoto, T., Maeshiro, T., Fujiwara, Y. Extraction of semantic relationships among terms to construct organized knowledge resources. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 459-465, Tokyo, Japan, Aug. 1999.

o Fukushige, Y., Noguchi, N. NTCIR experiments at Matsushita: TMREC task. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 467-474, Tokyo, Japan, Aug. 1999.

o Hisamitsu, T., Niwa, Y., Nishioka, S., Sakurai, H., Imaichi, O., Iwayama, M., Takano, A. Term extraction using a new measure of term representativeness. In Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp. 475-481, Tokyo, Japan, Aug. 1999.

o Rorvig, M., Smith, M. M., Uemura, A. The n-gram hypothesis applied to matched sets of visualized Japanese-English technical documents. In Proceedings of the 62nd Annual Meeting of the American Society for Information Science, Washington DC, USA, Nov. 1999.

