36. Hiroshima City University Japanese Emotional Speech Corpus (HCUDB)
Data DOI
Producer, Project
Dr. Kazuya Mera, Hiroshima City University
The Center of Innovation (COI) Program, Japan Science and Technology Agency (JST)
Grant-in-Aid for Scientific Research (C) "Mental state estimation method from conflict between text and tone of the voice" supported by JSPS, Japan during FY 2014-2016.
This corpus consists of utterances by professional speakers (narrators, actors, etc.) performing the same lines with multiple emotions.
- HCUDB1
14 speakers each recorded three takes of ten different lines in eleven different emotions, such as "surprise," "anger," "contempt," and "sleepy/tired," based on Russell's Circumplex Model. HCUDB1 includes 4,620 utterances in total.
In addition, 16 other people evaluated the speaker's emotion in every utterance (three types of evaluation: level of valence and arousal, applicability of each of the eleven emotions, and naturalness).
- HCUDB2
20 speakers each recorded three takes of five of the ten HCUDB1 lines in the same eleven emotions as HCUDB1, using two different acting techniques (technical or affective). HCUDB2 includes 6,600 utterances in total. Evaluations by other people were not conducted for HCUDB2.
HCUDB1 : 14 professional speakers (6 males and 8 females), 20s to 60s
HCUDB2 : 20 professional speakers (7 males and 13 females), 20s to 60s
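The utterance totals quoted above follow directly from the recording design (speakers x takes x lines x emotions x acting techniques). The short Python check below is only an illustrative sketch; the function name `utterance_count` and its parameter names are made up for this example and are not part of the corpus documentation.

```python
def utterance_count(speakers, takes, lines, emotions, techniques=1):
    """Return the number of utterances implied by a fully crossed recording design."""
    return speakers * takes * lines * emotions * techniques


# HCUDB1: 14 speakers, 3 takes, 10 lines, 11 emotions (one acting style).
hcudb1_total = utterance_count(speakers=14, takes=3, lines=10, emotions=11)

# HCUDB2: 20 speakers, 3 takes, 5 lines, 11 emotions, 2 acting techniques.
hcudb2_total = utterance_count(speakers=20, takes=3, lines=5, emotions=11, techniques=2)

print(hcudb1_total)  # 4620
print(hcudb2_total)  # 6600
```

Both results match the totals stated in the corpus description (4,620 and 6,600 utterances).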
Speech file format
WAV format (48 kHz, 16 bit, Mono)
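As a rough illustration, a downloaded file could be checked against the stated format with Python's standard `wave` module. This is a minimal sketch, not a tool shipped with the corpus: the helper names and the example file name in the comment are hypothetical, and "16 bit" is assumed to mean a sample width of 2 bytes of uncompressed PCM.

```python
import wave

# Format stated in the corpus description: 48 kHz, 16 bit, mono.
# 16 bit is assumed here to correspond to a 2-byte PCM sample width.
EXPECTED_FORMAT = {
    "sample_rate_hz": 48000,
    "sample_width_bytes": 2,
    "channels": 1,
}


def read_wav_format(path):
    """Read sample rate, sample width, and channel count from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }


def matches_expected_format(path):
    """Return True if the file at `path` has the expected header values."""
    return read_wav_format(path) == EXPECTED_FORMAT


# Example call (hypothetical file name, not an actual corpus path):
# matches_expected_format("example_utterance.wav")
```

Checking only the header keeps the sketch cheap; it does not validate the audio content itself.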
Distribution media
For research purposes only
No fee
All documents are written in Japanese.
Speech sample for test listening
50s male announcer, "Sounan desu ka (Is that so?)", technical performance
Surprise / Anger / Contempt / Sleepy/Tired
50s male announcer, "Sounan desu ka (Is that so?)", affective performance
Surprise / Anger / Contempt / Sleepy/Tired