17. Priority Areas "Prosody and Speech Processing" Japanese MULTEXT Prosodic Corpus (MULTEXT-J)

Data DOI

Producer, Project

Grant-in-Aid for Scientific Research on Priority Areas, funded by MEXT, Japan, 2000-2004: "Realization of advanced spoken language information processing from prosodic features"

Principal Investigator: Keikichi Hirose (The University of Tokyo)

Corpus Group: Shigeyoshi Kitazawa (Shizuoka University)

The Japanese version of Multilingual Text Tools and Corpora (MULTEXT).

The speakers were asked to read aloud 40 passages (each consisting of 5-6 sentences) in the following two speaking styles:

Reading-style
Spontaneous-style (speakers were instructed to act out different emotional attitudes according to the situation described in each text)

Speaker

6 speakers (3 males and 3 females)

Recording environment

Anechoic room

Speech file format

WAV format (16 kHz, 16 bit, Mono)
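As a quick sanity check after obtaining the data, the stated format can be verified with Python's standard `wave` module. This is a minimal sketch, not part of the corpus distribution: the function names `matches_catalog_format` and `write_silence` are hypothetical, and it assumes the files are plain PCM WAV files readable by `wave`.

```python
import wave

# Format stated in this catalog entry: 16 kHz, 16 bit, mono.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1            # mono


def matches_catalog_format(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def write_silence(path, rate_hz, channels, seconds=0.1):
    """Write a short silent 16-bit WAV file (hypothetical helper for trying the check)."""
    n_frames = int(rate_hz * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        # Each frame holds one 2-byte sample per channel.
        wav.writeframes(b"\x00\x00" * n_frames * channels)
```

Calling `matches_catalog_format` on each file copied from the DVD would flag any file that was resampled or converted along the way.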

Distribution media

1 DVD

Licensing

For research purposes only

Price

No fee

Supplements

Phoneme labels
Prosodic unit labels
Accent type and accent kernel labels
J-ToBI prosodic labels
F0 data
Spline-approximated F0 data
EGG signal data
Analyzed EGG data

Speech sample for test listening

Japanese texts translated from the MULTEXT corpus

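「今ロンドンに着いたところですが、私の荷物はローマに行ってしまいました。私は糖尿病なので、明日にはどうしてもその荷物が必要です。早速に、荷物の所在を調べてもらえるよう、責任者の方に頼んで下さい。その間、当座の薬が必要です。どこか病院に連絡を取ってもらえませんか。」

(English translation: "I have just arrived in London, but my luggage has gone to Rome. I am diabetic, so I absolutely need that luggage by tomorrow. Please ask the person in charge to find out where the luggage is right away. In the meantime, I need medicine to tide me over. Could you get in touch with a hospital somewhere?")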

Reading-style / Spontaneous-style

Speech Resources Consortium (NII-SRC)