NTT-Tohoku University Familiarity-Controlled Word Lists 2007 (FW07)

Data DOI

Producer, Project

Tadahisa KONDO and Shigeaki AMANO, NTT Communication Science Laboratories

Yoichi SUZUKI and Shuichi SAKAMOTO, Research Institute of Electrical Communication, Tohoku University


The 1600 spoken words (20 words times 20 sets times 4 ranks) reconstructed from the "NTT - Tohoku University Familiarity-controlled Word Lists (FW03)" and edited under the following 5 conditions:

Part of this data set is contained on the 5 audio CDs for convenience in examinations, etc.*1


4 trained Japanese speakers (2 male and 2 female); each word is spoken once.

Speech file format

WAV format (48 kHz, 16 bit, Mono)
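As an illustration only, the stated format can be checked before processing a file. The sketch below uses Python's standard `wave` module; the function name `check_wav_format` and its parameter names are hypothetical and are not part of the distributed data set.

```python
import wave


def check_wav_format(path, rate=48000, sample_bytes=2, channels=1):
    """Return True if the WAV file matches 48 kHz, 16 bit (2 bytes), mono.

    The default values mirror the format stated above; the function itself
    is an illustrative helper, not something shipped with the corpus.
    """
    with wave.open(path, "rb") as wf:
        return (
            wf.getframerate() == rate
            and wf.getsampwidth() == sample_bytes
            and wf.getnchannels() == channels
        )
```

A file that fails this check would need resampling or channel conversion before being compared with the corpus material.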

Distribution media

1 DVD and 5 audio CDs


For research and development purposes only


11,429 yen (plus consumption tax for a domestic order)


The programs for sound pressure level transformation and noise superposition are included on the DVD.

*1 Noise data is superposed on the 800 words (20 words times 10 sets times 4 ranks) spoken by a female speaker under 5 conditions (no noise, +3 dB, 0 dB, -3 dB, -6 dB).

*2 The sound pressure level criteria differ between FW07 and the previously distributed FW03: FW07 is about 6 dB lower than FW03.
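The two notes above involve simple level arithmetic, which can be sketched as follows. This is not the program supplied on the DVD, whose method is not described here. It assumes that the dB figures in *1 are signal-to-noise ratios computed from the RMS of the whole signal, which the source does not state, and all function names are hypothetical. Samples are treated as plain floating-point lists; clipping to the 16-bit range is not handled.

```python
import math


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise RMS ratio is snr_db.

    Assumption: SNR is defined over the whole-signal RMS of both inputs.
    The result is as long as the shorter of the two inputs.
    """
    scale = rms(speech) / (rms(noise) * db_to_gain(snr_db))
    return [s + scale * n for s, n in zip(speech, noise)]


# A level about 6 dB lower corresponds to roughly half the amplitude.
half_amplitude = db_to_gain(-6.0)
```

Under these assumptions, `mix_at_snr(speech, noise, -3.0)` would correspond to the -3 dB condition, and multiplying samples by `db_to_gain(-6.0)` would approximate the level offset between FW03 and FW07.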

Speech sample for test listening

Word Familiarity 1.0 – 2.5: (アカガネ)
