Automatic Speech Recognition for Chinese
Red Hen has a preliminary Automatic Speech Recognition (ASR) pipeline for Chinese. Would you like to help improve it?
If so, write to us and we will try to connect you with a mentor.
Related Scrolls
Related Links
- Red Hen Lab github repository: ASR for Chinese Pipeline (master)
- Suwei Xu's github repository (development of the ASR for Chinese Pipeline, Google Summer of Code 2018) -- blog
- Zhaoqing Xu's github repository (a fork of the master) -- blog
- A PaddlePaddle implementation of DeepSpeech2 architecture for ASR
- THCHS-30 (a free Chinese speech corpus released by CSLT@Tsinghua University)
- DeepSpeech (A TensorFlow implementation of Baidu's DeepSpeech architecture)
- Battenberg et al. (2017). Exploring Neural Transducers for End-to-End Speech Recognition (arXiv; written up as DeepSpeech3)
- Music removal by convolutional denoising autoencoder in speech recognition
- VAD (voice activity detection, used to cut audio between sentences) -- Python interface to the WebRTC VAD
- Diarization: https://arxiv.org/pdf/1810.04719.pdf
- Kur: Descriptive Deep Learning (blog)
- wav2letter++
More Information
Red Hen has a pipeline in production at the Case HPC that runs Chinese ASR using Baidu's DeepSpeech2 with PaddlePaddle inside a Singularity container built on Singularity Hub from a recipe. It starts with this command:
singularity exec -e --nv ../Chinese_Pipeline.simg bash infer.sh $DAY
In the Slurm job submission, the pipeline requests two GPUs on a K40 node:
#SBATCH -p gpu -C gpuk40 --mem=100gb --gres=gpu:2
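Putting these pieces together, a full submission script might look roughly like the sketch below. Only the #SBATCH resource line and the singularity command come from the pipeline above; the wall-clock limit, log file name, module load, and working directory are assumptions.

#!/bin/bash
#SBATCH -p gpu -C gpuk40 --mem=100gb --gres=gpu:2
#SBATCH --time=04:00:00        # assumed wall-clock limit
#SBATCH -o chinese_asr_%j.log  # assumed log file name

# DAY identifies the day of recordings to process; its exact format is an assumption.
module load singularity        # assumption: Singularity is available as an environment module

cd "$HOME/cp"                  # assumed working directory containing infer.sh
singularity exec -e --nv ../Chinese_Pipeline.simg bash infer.sh "$DAY"

The day would then be passed in at submission time, for example with sbatch --export=ALL,DAY=2019-01-15 chinese_asr.slurm (the script name and date format are hypothetical).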
Submitted jobs can then be monitored with squeue:
abc123@server:~/cp$ squeue -u abc123
     JOBID PARTITION     NAME   USER ST   TIME NODES NODELIST(REASON)
  12267389       gpu work.slu abc123 PD   0:00     1 (Priority)
  12267373       gpu work.slu abc123  R  27:51     1 gput025
  12267379       gpu work.slu abc123  R  15:41     1 gput026
It takes about four minutes to run ASR on a standard one-hour recording.
To Do
Chinese Red Hens report that the output makes sense but contains copious errors and disfluencies; to improve it, the audio should be cut at pauses or word breaks rather than mechanically at ten-second intervals, as sketched below. A training dataset of news content would also help.
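One lightweight way to experiment with pause-based cutting from the shell is sketched here. It uses ffmpeg's silencedetect filter as a simple stand-in for the WebRTC VAD interface linked above; the noise threshold, minimum pause length, and file names are assumptions that would need tuning on real broadcast audio.

#!/bin/bash
# Sketch: split a recording at detected pauses instead of fixed ten-second chunks.
IN="$1"                          # e.g. broadcast.wav (hypothetical file name)
OUTDIR="${2:-segments}"
mkdir -p "$OUTDIR"

# 1. Find pauses: at least 0.4 s quieter than -35 dB (both thresholds are guesses).
ffmpeg -hide_banner -i "$IN" -af silencedetect=noise=-35dB:d=0.4 -f null - 2>&1 \
  | grep -oE 'silence_(start|end): [0-9.]+' > silences.txt

# 2. Use the midpoint of each pause as a cut point.
awk '/silence_start/ {s=$2} /silence_end/ {print (s + $2) / 2}' silences.txt > cuts.txt

# 3. Extract each speech segment between consecutive cut points.
prev=0; i=0
while read -r cut; do
  ffmpeg -y -hide_banner -loglevel error -i "$IN" -ss "$prev" -to "$cut" \
    "$OUTDIR/$(printf 'seg_%04d.wav' "$i")"
  prev="$cut"; i=$((i + 1))
done < cuts.txt
# The tail after the last pause becomes the final segment.
ffmpeg -y -hide_banner -loglevel error -i "$IN" -ss "$prev" \
  "$OUTDIR/$(printf 'seg_%04d.wav' "$i")"

The resulting segments could then be fed to the recognizer in place of fixed ten-second chunks; the WebRTC VAD linked above is purpose-built for speech and should cope better with background music and noise than a plain energy threshold.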
Thoughts
Other approaches are also worth exploring, notably Baidu DeepSpeech3.