Red Hen has a preliminary Automatic Speech Recognition (ASR) pipeline for Chinese. Would you like to help improve it? If so, get in touch and we will try to connect you with a mentor.
More Information

Red Hen has a pipeline in production at the Case HPC that runs Chinese ASR using Baidu's DeepSpeech2 with PaddlePaddle, inside a Singularity container built on Singularity Hub from a recipe. It starts with this command:
In the Slurm job submission, it requests a GPU:
It takes about four minutes to run ASR on a standard one-hour recording.

To Do

Chinese Red Hens report that the output makes sense but has copious errors and disfluencies. To improve it, the audio should be cut at pauses or at word breaks rather than mechanically at ten-second intervals. A news content training dataset would also help.
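One way to cut at pauses rather than at fixed ten-second marks can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes the audio is already available as a list of mono amplitude values, and the function names (`frame_energies`, `cut_points`) and the parameters `search_s` and `frame_ms` are hypothetical. For each segment it looks back over the last two seconds before the ten-second limit and cuts at the quietest 20 ms frame.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each complete frame of frame_len samples."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def cut_points(samples, sample_rate, max_len_s=10.0, search_s=2.0, frame_ms=20):
    """Return sample indices at which to cut the recording.

    Each cut falls at the start of the lowest-energy frame found in the
    last search_s seconds before the max_len_s limit, so no segment is
    longer than max_len_s. All parameter names are illustrative.
    """
    if not 0 < search_s < max_len_s:
        raise ValueError("search_s must be positive and smaller than max_len_s")
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    energies = frame_energies(samples, frame_len)
    max_frames = int(max_len_s * 1000 / frame_ms)
    search_frames = max(1, int(search_s * 1000 / frame_ms))

    cuts = []
    start = 0  # index of the first frame of the current segment
    while len(energies) - start > max_frames:
        lo = start + max_frames - search_frames
        hi = start + max_frames
        # Quietest frame in the search window; ties go to the earliest frame.
        quietest = min(range(lo, hi), key=lambda i: energies[i])
        cuts.append(quietest * frame_len)
        start = quietest
    return cuts


if __name__ == "__main__":
    # Synthetic 25-second signal at 1 kHz with two short pauses.
    rate = 1000
    signal = [
        0.0 if 9000 <= i < 9200 or 18500 <= i < 18700 else 1.0
        for i in range(25 * rate)
    ]
    print(cut_points(signal, rate))  # [9000, 18500]
```

On real recordings the samples could be read with Python's standard `wave` module. Frame energy is only a rough proxy for a pause, so a voice activity detector or forced alignment may find word breaks more reliably.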