Tag

#automatic-speech-recognition

Every item tagged automatic-speech-recognition, newest first.

3 items

Making automatic speech recognition work on large files with Wav2Vec2 in 🤗 Transformers

The Hugging Face Transformers library now supports chunking large audio files for automatic speech recognition (ASR) with Wav2Vec2. Splitting a long recording into smaller chunks lets you transcribe files over 1 minute long, so you can use the feature in ASR workflows that handle longer audio inputs.

Key takeaways

Wav2Vec2 in Transformers supports chunking for large audio files.
Chunking enables processing audio over 1 minute long.
Chunking makes ASR more usable for longer audio inputs.
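The chunking idea can be sketched in a few lines. This is a conceptual illustration only, not the library's implementation: the function name `chunk_spans` and its parameters are hypothetical, and the overlap between neighbouring chunks is an assumption about how chunk boundaries are usually smoothed.

```python
from typing import List, Tuple


def chunk_spans(n_samples: int, chunk_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of overlapping chunks covering the audio.

    Hypothetical helper: each chunk holds at most `chunk_len` samples and
    shares `overlap` samples with the chunk before it.
    """
    if chunk_len <= 0 or not 0 <= overlap < chunk_len:
        raise ValueError("need chunk_len > 0 and 0 <= overlap < chunk_len")

    step = chunk_len - overlap  # how far the window advances each time
    spans = []
    start = 0
    while start < n_samples:
        end = min(start + chunk_len, n_samples)
        spans.append((start, end))
        if end == n_samples:  # the last chunk reached the end of the audio
            break
        start += step
    return spans


# Example: 90 s of 16 kHz audio, 10 s chunks, 2 s overlap (illustrative values).
sample_rate = 16_000
spans = chunk_spans(90 * sample_rate, 10 * sample_rate, 2 * sample_rate)
```

Each span would be transcribed separately and the overlapping regions reconciled when the pieces are joined. In Transformers itself, the ASR pipeline accepts `chunk_length_s` and `stride_length_s` arguments for this purpose; the sketch above only mirrors the idea.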

Hugging Face Blog · #automatic-speech-recognition #large-audio-files #chunking

models · Nov 15

Fine-Tune XLSR-Wav2Vec2 for low-resource ASR with 🤗 Transformers

You can fine-tune the pre-trained XLSR-Wav2Vec2 model for automatic speech recognition (ASR) on low-resource languages using the Hugging Face Transformers library. Fine-tuning reuses the model's pre-trained knowledge, so it can be adapted to a specific language even when training data is limited.

Key takeaways

XLSR-Wav2Vec2 can be fine-tuned for low-resource ASR.
The Hugging Face Transformers library supports this fine-tuning process.
Fine-tuning adapts the model to specific languages with limited training data.
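One early step when adapting a speech model to a new language is deriving an output vocabulary from the available transcripts. The following is a minimal sketch, assuming character-level CTC training (the summary above does not state the training objective). The function name `build_char_vocab` and the special tokens `|`, `[UNK]`, and `[PAD]` are illustrative choices, not taken from the summary.

```python
from typing import Dict, Iterable


def build_char_vocab(transcripts: Iterable[str]) -> Dict[str, int]:
    """Map every character seen in the transcripts to an integer id.

    Hypothetical helper: spaces are represented by a visible word
    delimiter "|", and two special tokens are appended at the end.
    """
    chars = set()
    for text in transcripts:
        chars.update(text.lower())

    # Spaces become the "|" delimiter; drop both so neither is counted twice.
    chars.discard(" ")
    chars.discard("|")

    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    vocab["|"] = len(vocab)      # word delimiter
    vocab["[UNK]"] = len(vocab)  # characters unseen at training time
    vocab["[PAD]"] = len(vocab)  # padding id (assumed to double as the CTC blank)
    return vocab


vocab = build_char_vocab(["ab a", "Ba c"])
```

With limited training data the vocabulary stays small, which keeps the output layer that has to be learned from scratch small as well.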

Hugging Face Blog · #fine-tuning #low-resource-languages #automatic-speech-recognition

models · Mar 12

Fine-Tune Wav2Vec2 for English ASR in Hugging Face with 🤗 Transformers

You can fine-tune Wav2Vec2 for English automatic speech recognition (ASR) using Hugging Face's Transformers library. The model achieves state-of-the-art results on the LibriSpeech dataset. Fine-tuning adapts it to your own use case or dataset, which can improve performance and reduce the amount of labeled training data you need.

Key takeaways

Wav2Vec2 can be fine-tuned for English ASR with Hugging Face's Transformers.
Wav2Vec2 achieves state-of-the-art results on the LibriSpeech dataset.
Fine-tuning adapts the model to specific use cases or datasets.

Hugging Face Blog · #fine-tuning #automatic-speech-recognition #open-source