21. Tokyo Institute of Technology Multilingual Speech Corpus (TITML)

21-b. Icelandic (TITML-ISL)

Assoc. Prof. Koichi Shinoda and Prof. Sadaoki Furui, Tokyo Institute of Technology

The Icelandic speech corpus was developed for training the acoustic models of an automatic speech recognition system.

This database contains three kinds of read speech, all collected in Iceland.

PB text
- 256 Icelandic bi-phonetically balanced sentences (letters of the alphabet and around 20 spontaneous utterances are also included).
Weather information queries
- 1000 sentences translated from MIT's JUPITER corpus (660 sentences were actually uttered).
News
- 400 sentences from the news domain.

PB text
- 20 speakers (13 males and 7 females), 1 set (256 sentences) per speaker (one speaker uttered a different set of 184 sentences)
Weather information queries
- 20 speakers (10 males and 10 females), roughly 200 sentences per speaker
News
- 20 speakers (10 males and 10 females), roughly 20 sentences per speaker

WAV format (16 kHz, 16 bit, Mono)
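
A minimal sketch of how a user might check that a file matches the stated format (16 kHz, 16 bit, mono), using Python's standard wave module. The function names and the file name in the usage note are hypothetical and are not part of the corpus distribution.

```python
import wave

# Format stated in this entry: mono, 16-bit (2 bytes per sample), 16 kHz.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def read_wav_format(path):
    """Return channel count, sample width in bytes, and sample rate of a WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def matches_corpus_format(path):
    """True if the file header matches the 16 kHz, 16 bit, mono format."""
    return read_wav_format(path) == EXPECTED
```

For example, matches_corpus_format("sample.wav") would return False for a stereo or 8 kHz file, where "sample.wav" stands in for any file copied from the DVD.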

1 DVD

For research purposes only

No fee

PB text
Hvernig er útlitið á laugardaginn í kring um Hengilinn?

male / female
Weather information queries
ég myndi vilja fá veðurfréttir fyrir eyjar fyrir morgundaginn þrítugasta október

male / female
News
Maðurinn féll þannig á teinana að hann lenti milli sporanna og lifði því af er lestin ók yfir hann

female