NTT Infant Speech Database (INFANT)
Producer, Project: Shigeaki AMANO, Kimihisa KONDO, and Kazumi KATO, NTT Communication Science Laboratories
Completely spontaneous speech from 5 native Japanese-speaking infants in 3 families, recorded for more than one hour per month from birth until 5 years of age.
This database includes the following data:
- Speech wave (16 kHz, 16 bit, Mono)
- Transcribed text (mixed kanji and kana, and katakana)
- Utterance attributes (Speaker gender, utterance environment, speech volume, etc.)
- Utterance time information
- Comments such as paralinguistic information, etc.
- Voiced/unvoiced flag
- Fundamental frequency (F0) information
- Phoneme labels
Native Japanese speakers: the children (2 boys and 3 girls from 3 families) and their parents.
|Speaker ID|Age (months)|Period (months)|Time (hours)|Repetitions|Frequency|
|---|---|---|---|---|---|
|A|0–30|25|161|316|≥ 1 hour/month|
|B|0–54|50|140|720|≥ 1 hour/month|
|E|0–59|50|106|691|≥ 1 hour/month|
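As a rough consistency check on the table, the average recording time per month can be worked out from the listed figures. The sketch below assumes that "Time (hours)" is the total recorded duration over the listed "Period (months)"; the page does not state this explicitly, and the names `recordings` and `hours_per_month` are illustrative only.

```python
# Values copied from the table: speaker ID -> (period in months, time in hours).
# Assumption: "Time (hours)" is the total recorded duration over that period.
recordings = {
    "A": (25, 161),
    "B": (50, 140),
    "E": (50, 106),
}


def hours_per_month(period_months, time_hours):
    """Average recorded hours per month over the recording period."""
    return time_hours / period_months


averages = {
    speaker: hours_per_month(period, hours)
    for speaker, (period, hours) in recordings.items()
}

for speaker, average in averages.items():
    print(f"{speaker}: {average:.2f} hours/month")
# A: 6.44 hours/month
# B: 2.80 hours/month
# E: 2.12 hours/month
```

Under that assumption, every listed speaker averages more than one hour per month. An average alone does not show that each individual month reached one hour.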
Speech file format
WAV format (16 kHz, 16 bit, Mono)
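A file in this format can be inspected with Python's standard `wave` module. The following is a minimal sketch, not part of the database distribution; the helper names `describe_wav` and `matches_stated_format` are hypothetical, and the expected values are taken from the format stated above.

```python
import wave

# Expected parameters, taken from the stated format (16 kHz, 16 bit, mono).
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1            # mono


def describe_wav(path):
    """Read the header of a WAV file and return its basic parameters."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / rate,
        }


def matches_stated_format(info):
    """Return True if the parameters match 16 kHz, 16 bit, mono."""
    return (
        info["rate_hz"] == EXPECTED_RATE_HZ
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH_BYTES
        and info["channels"] == EXPECTED_CHANNELS
    )
```

For example, `matches_stated_format(describe_wav("sample.wav"))` would return `True` for a conforming file, where `sample.wav` is a placeholder file name.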
For research and development purposes only
11,880 yen (including consumption tax)
Speech sample for test listening
Girl / Recording period: 0–60 months
- at the age of 0 months
- at the age of 12 months
- at the age of 24 months
- at the age of 36 months
- at the age of 48 months
- at the age of 59 months