Lists of phonetically-balanced words and sentences of Japanese

This collection contains several lists of phonetically balanced words and sentences, developed by various researchers for use in speech research. Each list was designed to achieve phonetic balance with as few words or sentences as possible.

  1. ATR 503 sentences
    • They cover 402 two-phoneme sequences and 223 three-phoneme sequences, 625 items in total. The sentences were extracted from newspapers, journals, novels, letters, textbooks, and other sources so that the various phonetic environments occur as nearly equally often as possible.
    • References
      • K. Iso, T. Watanabe, H. Kuwabara, "Design of a Japanese Sentence List for a Speech Database," Preprints, Spring Meeting of Acous. Soc. Jpn., Paper 2-2-19, pp. 89-90 (Mar. 1988). (in Japanese)
      • A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, K. Shikano, "ATR Japanese Speech Database as a Tool of Speech Recognition and Synthesis," Speech Communication, 9, 357-363 (1990).
  2. ATR 216 words
    • The words, each composed of at least three moras, include all possible two-phoneme sequences.
    • Reference
      • H. Kuwabara, K. Takeda, Y. Sagisaka, S. Katagiri, S. Morikawa, T. Watanabe, "Construction of a Large-Scale Japanese Speech Database and its Management System," Proc. ICASSP-89, Paper S10b.12, pp. 560-563 (1989).
  3. ETL 492/1542 words
    • They contain three-phoneme sequences such as VCV in as few words as possible. The set of 1542 words includes all 492 words and adds three-phoneme sequences that contain a geminate consonant.
    • Reference
      • K. Tanaka, S. Hayamizu, K. Ohta, "The ETL speech database for speech analysis and recognition research," Proc. ICSLP-90, Kobe, Japan, Paper 24.7, pp. 1101-1104 (1990).
  4. Tohoku University – Panasonic 212 words
    • Each item is selected so that all possible phonemic environments occur.
    • Reference
      • S. Makino, T. Shirokaze, K. Kido, "A distributed speech database with an automatic acquisition system of speech information," Proc. ICSLP-90, Kobe, Japan, Paper 24.9, pp. 1109-1112 (1990).
  5. Tohoku University – Panasonic 3285 words
    • Each item is chosen from the station names or line names of Japan Railway Companies.
    • Reference
      • K. Akiba, T. Irumano, H. Kanasashi, Y. Mafune, "Speech Database for Spoken Japanese Recognition Research," Preprints, IECE Convention, Paper 1391, page 5-376 (1982). (in Japanese)
  6. JEITA list (based on Umeda List)
    • It comprises 1350 phonetically balanced two-syllable and three-syllable words.
    • References
      • JEITA Standard, "Speech Synthesis System Performance Evaluation Methods," JEITA IT-4001, Technical Standardization Committee on Speech Input/Output Systems, Japan Electronics and Information Technology Industries Association (Feb. 2003). (in Japanese)
      • N. Torii, "Phonetically balanced word list of Japanese," Electrical Communications Laboratories, Nippon Telegraph and Telephone Corporation, Report No. 484 (1956). (in Japanese)
      • T. Watanabe, H. Nagabuchi, N. Kitawaki, "A word-selecting method for intelligibility assessment of synthesized speech by rule," Trans. IEICE, Vol. J71-A, No. 3, pp. 616-623 (Mar. 1988). (in Japanese)
  7. Nagoya Institute of Technology (NIT) English sentences
    • They consist of five genres (novels, news, conversations, PCS, and SUS) with 50 sentences each, and were used for the Blizzard Challenge speech synthesis contest.
      PCS: Phonetically confusable sentences.
      SUS: Semantically unpredictable sentences.
    • Reference
      • A. Black, K. Tokuda, "The Blizzard Challenge - 2005: Evaluating corpus-based speech synthesis on common datasets," Proc. Interspeech 2005, pp. 77-80, Lisbon, Portugal (Sep. 4-8, 2005).
  8. NIT Chinese sentences
    • They were devised in the same way as the English version to be used for the Blizzard Challenge speech synthesis contest.
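
Several of the lists above aim to cover a target set of phoneme sequences with as few words as possible. One common way to picture that goal is a greedy set-cover selection, sketched below in Python. This is only an illustration under stated assumptions: it is not the procedure used by any of the cited authors, and the function names, the toy lexicon, and its phoneme transcriptions are all hypothetical.

```python
def phoneme_ngrams(phonemes, n):
    """Return the set of n-phoneme sequences that occur in one word."""
    return {tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)}


def greedy_balanced_subset(lexicon, n=2):
    """Greedy set cover over n-phoneme sequences.

    lexicon maps each word to its phoneme list (a hypothetical toy format).
    At every step, pick the word that adds the most not-yet-covered
    sequences; ties are broken alphabetically so the result is deterministic.
    """
    units = {word: phoneme_ngrams(ph, n) for word, ph in lexicon.items()}
    uncovered = set().union(*units.values()) if units else set()
    selected = []
    while uncovered:
        best = max(sorted(units), key=lambda w: len(units[w] & uncovered))
        selected.append(best)
        uncovered -= units[best]
    return selected


# Hypothetical toy lexicon with simplified phoneme transcriptions.
toy_lexicon = {
    "sakana": ["s", "a", "k", "a", "n", "a"],
    "kasa": ["k", "a", "s", "a"],
    "nasu": ["n", "a", "s", "u"],
    "saka": ["s", "a", "k", "a"],
}

selected = greedy_balanced_subset(toy_lexicon, n=2)
print(selected)  # ['sakana', 'nasu']
```

Here two of the four words already cover all seven two-phoneme sequences in the toy lexicon. Greedy selection does not guarantee the smallest possible set, and it covers each sequence at least once rather than equalizing occurrence rates, so it only approximates the balancing described for the sentence lists.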