21. Tokyo Institute of Technology Multilingual Speech Corpus (TITML)
21-a. Indonesian (TITML-IDN)
Data DOI
Producer, Project
Assoc. Prof. Koichi Shinoda and Prof. Sadaoki Furui, Tokyo Institute of Technology
The Indonesian Phonetically Balanced Speech Corpus was developed for training the acoustic models of an automatic speech recognition system.
This database contains Bahasa Indonesia speech data from 20 Indonesian speakers. Each speaker was asked to read 343 phonetically balanced sentences, most of which were selected from a text corpus.
20 speakers (11 males and 9 females)
343 files per speaker
Speech file format
WAV format (16 kHz, 16 bit, Mono)
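As a minimal sketch, a downloaded file could be checked against the stated format (16 kHz, 16 bit, mono) with Python's standard `wave` module. The function name `matches_corpus_format` and the file name in the usage note are hypothetical and not part of the corpus distribution.

```python
import wave

# Expected parameters, taken from the format line above.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit
EXPECTED_CHANNELS = 1            # mono


def matches_corpus_format(path):
    """Return True if the WAV file at `path` is 16 kHz, 16 bit, mono.

    Hypothetical helper for illustration only.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

For example, `matches_corpus_format("sample.wav")` would return `True` for a file that follows the format listed above, where `sample.wav` is a placeholder name.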
Distribution media
For research purposes only
No fee
Speech sample for test listening
Examples of the Indonesian phonetically balanced sentences selected from a text corpus:
maaf saya terlambat datang ke kantor.
pemerintah menggunakan beberapa referensi diantaranya dampak ekonomi setelah serangan teroris di luxor mesir pada bulan november.