34. Corpus of Connecting Nihongo Utterance and Text (Coco-Nut)

Data DOI

Producer, Project

Aya Watanabe and Prof. Shinnosuke Takamichi, The University of Tokyo

This corpus consists of Japanese speech, its transcriptions, and characteristics prompts (free-form descriptions that express the characteristics of the speech).

The speech was gathered from YouTube as 24 kHz MP3 and converted to 44.1 kHz WAV. The corpus contains 7,330 utterances, 8 hours of speech in total. The characteristics prompts were collected through crowdsourcing. There is 1 prompt per utterance in the training data and 5 per utterance in the validation/test data.

The characteristics prompts are provided in the creator's GitHub repository. NII-SRC provides the speech data and the transcriptions.

Speaker

7,330 speakers in total

Speech file format

WAV format (44.1 kHz, 16-bit, stereo)

Distribution media

1 DVD (DL)

Licensing

For research purposes only

Price

No fee

Further information

https://sites.google.com/site/shinnosuketakamichi/research-topics/coconut_corpus

Speech sample for test listening

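「幸せの定義とは一体なんだ？」 ("What on earth is the definition of happiness?")
　　Characteristics prompt: 30代くらいの男性の声。ゆっくりと穏やかな話し方でした。苦悩に満ちた、けだるそうな声でした。
　　(The voice of a man around his thirties. He spoke slowly and calmly. The voice was full of anguish and sounded languid.)

「で、お散歩しまーす。食べたらもう散歩しないとわたしたちはもう」 ("And so, we're going for a walk. Once we've eaten, we really have to go for a walk, we just...")
　　Characteristics prompt: 明るい中年の女性がはきはきとした声で楽しそうに喋っている。
　　(A cheerful middle-aged woman is speaking happily in a crisp, clear voice.)

「小さな、白い雲が、浮かんでいます」 ("Small, white clouds are floating.")
　　Characteristics prompt: 穏やかそうな若い男性が、とてもゆっくりとした優しい声で静かに語りかけている。
　　(A gentle-seeming young man is speaking quietly in a very slow, kind voice.)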

Speech Resources Consortium (NII-SRC)