21. Tokyo Institute of Technology Multilingual Speech Corpus (TITML)
21-b. Icelandic (TITML-ISL)
Data DOI
https://doi.org/10.32130/src.TITML-ISL
Producer, Project
Assoc. Prof. Koichi Shinoda and Prof. Sadaoki Furui, Tokyo Institute of Technology
Contents
The Icelandic speech corpus was developed for training the acoustic models of an automatic speech recognition system.
This database contains three kinds of read speech, all collected in Iceland.
- PB text
- 256 Icelandic bi-phonetically balanced sentences. (Readings of the alphabet and around 20 spontaneous utterances are also included.)
- Weather information queries
- 1000 sentences translated from MIT's JUPITER corpus. (660 of these sentences were actually uttered.)
- News
- 400 sentences from the news domain.
Speaker
- PB text
- 20 speakers (13 males and 7 females), 1 set (256 sentences) per speaker (one of the speakers uttered a different set of 184 sentences)
- Weather information queries
- 20 speakers (10 males and 10 females), roughly 200 sentences per speaker
- News
- 20 speakers (10 males and 10 females), roughly 20 sentences per speaker
Speech file format
WAV format (16 kHz, 16-bit, mono)
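A minimal sketch of how a user might check that a file matches the format listed above, using Python's standard wave module. The function names and the constants are illustrative assumptions and are not part of the corpus distribution. The sketch also assumes the files are plain PCM WAV, which the entry does not state explicitly.

```python
import wave

# Values taken from the "Speech file format" entry above.
EXPECTED_RATE_HZ = 16000          # 16 kHz sampling rate
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16 bits per sample
EXPECTED_CHANNELS = 1             # mono


def matches_listed_format(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono.

    Hypothetical helper for illustration; `path` is a string file name.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def duration_seconds(path):
    """Return the duration of the WAV file at `path` in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Running such a check over every file before acoustic model training can catch files that were resampled or converted after copying them from the distribution media.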
Distribution media
1 DVD
Licensing
For research purposes only
Price
No fee