18. Chinese MULTEXT Corpus (MULTEXT-C)

Producer, Project

Assoc. Prof. Masahiko Komatsu, Health Sciences University of Hokkaido (now at Kanagawa University)


The Chinese version of Multilingual Text Tools and Corpora (MULTEXT).

The speakers were asked to read the 40 passages aloud as naturally as possible. Each passage consists of 5 to 6 sentences.


10 native speakers of Chinese (5 males and 5 females)

One speaker read all 40 passages, and each of the other 9 speakers read 15 passages, so each passage was read by 4 or 5 speakers.
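
The reading assignment above can be checked with a little arithmetic. The sketch below assumes the stated counts are exact and that every passage was read by exactly 4 or 5 speakers. The resulting split between 4-reader and 5-reader passages is derived here and is not stated in the corpus description.

```python
# Counts taken from the description above.
PASSAGES = 40
FULL_READERS = 1            # speakers who read every passage
PARTIAL_READERS = 9         # speakers who read a subset
PASSAGES_PER_PARTIAL = 15   # passages read by each of those speakers

# Total number of recorded passage readings.
total_readings = FULL_READERS * PASSAGES + PARTIAL_READERS * PASSAGES_PER_PARTIAL

# If each passage has 4 or 5 readers, then
# 4 * PASSAGES + n_five = total_readings.
n_five = total_readings - 4 * PASSAGES   # passages with 5 readers
n_four = PASSAGES - n_five               # passages with 4 readers

print(total_readings, n_five, n_four)  # 175 15 25
```

Under these assumptions the corpus holds 175 passage readings, which is consistent with the stated 4 or 5 speakers per passage.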

Recording environment

Soundproof room

Speech file format

WAV format (22 050 Hz, 16 bit, Mono)
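
As a minimal sketch, a downloaded file could be checked against the listed format with Python's standard `wave` module. The function name `matches_corpus_format` and the file path passed to it are hypothetical and not part of the corpus distribution.

```python
import wave

# Expected parameters, taken from the format listed above.
EXPECTED_RATE = 22050      # sampling rate in Hz
EXPECTED_WIDTH = 2         # bytes per sample (16 bit)
EXPECTED_CHANNELS = 1      # mono


def matches_corpus_format(path):
    """Return True if the WAV file at `path` has the listed format."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE
            and wav.getsampwidth() == EXPECTED_WIDTH
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

A call such as `matches_corpus_format("sample.wav")`, with a placeholder file name, returns `False` for a file with a different sampling rate, bit depth, or channel count.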

Distribution media



For research purposes only


No fee

Speech sample for test listening

Chinese texts translated from the MULTEXT corpus and the Japanese MULTEXT corpus

