29. Japanese Multi-speaker Audiobook Corpus (J-MAC)
Data DOI
https://doi.org/10.32130/src.J-MAC
Producer, Project
Assist. Prof. Shinnosuke Takamichi, The University of Tokyo
Contents
This corpus consists of time-mapped texts for commercial audiobooks.
A total of 74 audiobooks (24 novels) were selected from a large number of commercial audiobooks for speech synthesis research.
*NOTE* Audio data is not included in this corpus; users must purchase the audiobooks separately.
Speaker
39 professional speakers
Speech file format
(Speech files are not included)
Distribution media
1 CD
Licensing
For research purposes only
Price
No fee
Further information
https://sites.google.com/site/shinnosuketakamichi/research-topics/j-mac_corpus
Sample data
「セロ弾きのゴーシュ」(作・宮沢賢治):
  chapt000:
    parag016:
      style000:
        - character: narrative
          sent: ゴーシュの畑からとった、半分熟したトマトを、さも重そうに持って来て、ゴーシュの前におろして云いました。
          time: [383.12, 391.795]
          to whom: narrative
      style001:
        - character: 猫
          sent: 「ああくたびれた。
          time: [392.25, 394.395]
          to whom: ゴーシュ
        - character: 猫
          sent: なかなか、[運搬|うんぱん]はひどいやな。」
          time: [394.42, 397.575]
          to whom: ゴーシュ
      style002:
        - character: ゴーシュ
          sent: 「[何|なん]だと」
          time: [397.6, 398.715]
          to whom: 猫
      style003:
        - character: narrative
          sent: ゴーシュがききました。
          time: [398.74, 400.575]
          to whom: narrative
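
The sample above pairs each sentence with a time span and marks some words inline, as in "[運搬|うんぱん]". The following Python sketch shows one way such entries might be processed. It rests on assumptions not stated in this entry: that the text before "|" is the written form and the text after it is the reading, and that the two "time" values are start and end offsets in seconds. The helper names and the dict layout are hypothetical, and the two entries are copied by hand from the sample rather than loaded from the corpus files.

```python
import re

# Matches inline annotations such as "[運搬|うんぱん]" seen in the sample.
# Assumption: "[written form|reading]".
ANNOTATION = re.compile(r"\[([^\[\]|]+)\|([^\[\]|]+)\]")


def surface_text(sent):
    """Replace each annotation with its written form."""
    return ANNOTATION.sub(r"\1", sent)


def reading_text(sent):
    """Replace each annotation with its reading."""
    return ANNOTATION.sub(r"\2", sent)


def duration(time_span):
    """Length of a [start, end] span (assumed to be in seconds)."""
    start, end = time_span
    return end - start


# Two entries copied from the sample; this dict layout is hypothetical.
entries = [
    {"character": "猫", "sent": "なかなか、[運搬|うんぱん]はひどいやな。」",
     "time": [394.42, 397.575]},
    {"character": "ゴーシュ", "sent": "「[何|なん]だと」",
     "time": [397.6, 398.715]},
]

for entry in entries:
    print(entry["character"],
          surface_text(entry["sent"]),
          reading_text(entry["sent"]),
          round(duration(entry["time"]), 3))
```

Keeping the written form and the reading as separate views can be useful in speech synthesis work, where the reading resolves ambiguous pronunciations while the written form matches the published text.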