36. Hiroshima City University Japanese Emotional Speech Corpus (HCUDB)

Data DOI


Producer, Project

Dr. Kazuya Mera, Hiroshima City University

The Center of Innovation (COI) Program, Japan Science and Technology Agency (JST)

Grant-in-Aid for Scientific Research (C) "Mental state estimation method from conflict between text and tone of the voice" supported by JSPS, Japan during FY 2014-2016.


This corpus consists of utterances by professional speakers (narrators, actors, etc.) performing the same lines with multiple emotions.

  1. HCUDB1

    14 speakers uttered three takes of each of ten different lines with eleven different emotions such as "surprise," "anger," "contempt," and "sleepy/tired" based on Russell's Circumplex Model. HCUDB1 includes 4,620 utterances in total.
    In addition, 16 other people evaluated the speaker's emotion in every utterance, using three types of evaluation: levels of valence and arousal, applicability of each of the eleven emotions, and naturalness.

  2. HCUDB2

    20 speakers uttered three takes of five of the ten lines of HCUDB1, each with eleven different emotions as in HCUDB1, using two different acting techniques (technical or affective). HCUDB2 includes 6,600 utterances in total. No evaluations by other people are provided for HCUDB2.
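
The utterance totals stated for both sub-corpora follow directly from the recording design (speakers x takes x lines x emotions, and for HCUDB2 also x acting techniques). The short Python sketch below reproduces that arithmetic; the function name `utterance_count` and its parameter names are illustrative only and are not part of the corpus distribution.

```python
def utterance_count(speakers, takes, lines, emotions, techniques=1):
    """Number of utterances in a fully crossed recording design."""
    return speakers * takes * lines * emotions * techniques


# HCUDB1: 14 speakers, 3 takes, 10 lines, 11 emotions
hcudb1_total = utterance_count(14, 3, 10, 11)

# HCUDB2: 20 speakers, 3 takes, 5 lines, 11 emotions, 2 acting techniques
hcudb2_total = utterance_count(20, 3, 5, 11, techniques=2)

print(hcudb1_total)  # 4620
print(hcudb2_total)  # 6600
```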


HCUDB1 : 14 professional speakers (6 males and 8 females), 20s to 60s

HCUDB2 : 20 professional speakers (7 males and 13 females), 20s to 60s

Speech file format

WAV format (48 kHz, 16 bit, Mono)
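
As a minimal sketch, a downloaded file can be checked against the stated format (48 kHz, 16 bit, mono) with Python's standard `wave` module. The function name `matches_hcudb_format` is hypothetical, and the check assumes the files are plain PCM WAV files that the `wave` module can read.

```python
import wave

# Values taken from the format stated above: 48 kHz, 16 bit, mono.
EXPECTED_RATE_HZ = 48000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1


def matches_hcudb_format(path):
    """Return True if the WAV file at `path` is 48 kHz, 16 bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

A call such as `matches_hcudb_format("sample.wav")` (the file name is a placeholder, not an actual corpus path) returns `True` only when all three properties match.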

Distribution media



For research purposes only


No fee


All documents are written in Japanese.

Speech sample for test listening


50s male announcer, "Sounan desu ka (Is that so?)", technical performance

50s male announcer, "Sounan desu ka (Is that so?)", affective performance
