36. Hiroshima City University Japanese Emotional Speech Corpus (HCUDB)

Data DOI

Producer, Project

Dr. Kazuya Mera, Hiroshima City University

The Center of Innovation (COI) Program, Japan Science and Technology Agency (JST)

Grant-in-Aid for Scientific Research (C) "Mental state estimation method from conflict between text and tone of the voice" supported by JSPS, Japan during FY 2014-2016.

This corpus consists of utterances by professional speakers (narrators, actors, etc.) performing the same lines with multiple emotions.

HCUDB1
14 speakers uttered three takes of each of ten different lines with eleven different emotions such as "surprise," "anger," "contempt," and "sleepy/tired" based on Russell's Circumplex Model. HCUDB1 includes 4,620 utterances in total.
In addition, 16 other people evaluated the speaker's emotion for all utterances (three types of evaluation: level of valence and arousal, applicability of each of the eleven emotions, and naturalness).
HCUDB2
20 speakers uttered three takes of five of the ten lines of HCUDB1, each with the same eleven emotions as in HCUDB1, using two different acting techniques (technical or affective). HCUDB2 includes 6,600 utterances in total. No evaluations by other listeners are provided for HCUDB2.
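The utterance totals follow directly from the factors stated above. As a quick sanity check, the arithmetic can be sketched as below; the function name `utterance_count` and its parameter names are illustrative only and are not part of the corpus documentation.

```python
# Sanity check of the stated totals, using only the factors given in the
# corpus description. The helper name and parameters are hypothetical.
def utterance_count(speakers, takes, lines, emotions, techniques=1):
    """Number of utterances if every combination is recorded once."""
    return speakers * takes * lines * emotions * techniques


# HCUDB1: 14 speakers x 3 takes x 10 lines x 11 emotions
hcudb1_total = utterance_count(speakers=14, takes=3, lines=10, emotions=11)

# HCUDB2: 20 speakers x 3 takes x 5 lines x 11 emotions x 2 acting techniques
hcudb2_total = utterance_count(
    speakers=20, takes=3, lines=5, emotions=11, techniques=2
)

print(hcudb1_total)  # 4620
print(hcudb2_total)  # 6600
```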

Speaker

HCUDB1: 14 professional speakers (6 males and 8 females), in their 20s to 60s

HCUDB2: 20 professional speakers (7 males and 13 females), in their 20s to 60s

Speech file format

WAV format (48 kHz, 16 bit, Mono)
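
After obtaining the data, one way to confirm that a file matches the stated format is to read its header with Python's standard `wave` module. This is only a sketch: the helper names `wav_format` and `matches_corpus_format` are hypothetical, and the file naming and directory layout on the DVD are not described here, so the path must be supplied by the user.

```python
import wave

# Format stated in this entry: 48 kHz, 16 bit (2 bytes per sample), mono.
EXPECTED = {"sample_rate_hz": 48000, "sample_width_bytes": 2, "channels": 1}


def wav_format(path):
    """Read sample rate, sample width, and channel count from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }


def matches_corpus_format(path):
    """Return True if the file has the format stated for this corpus."""
    return wav_format(path) == EXPECTED
```

Calling `matches_corpus_format` on the path of any WAV file returns `True` only when all three header values agree with the stated format.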

Distribution media

1 DVD

Licensing

For research purposes only

Price

No fee

Note

All documents are written in Japanese.

Speech sample for test listening

HCUDB2

50s male announcer, "Sounan desu ka (Is that so?)", technical performance

Surprise / Anger / Contempt / Sleepy/Tired

50s male announcer, "Sounan desu ka (Is that so?)", affective performance