Speech-to-Text Using OpenAI Whisper

Created using ChatSlide

This presentation introduces OpenAI Whisper, focusing on its Automatic Speech Recognition (ASR) technology and applications. It explores scalable transcription pipelines, including speaker diarisation with pyannote.audio, and emphasises optimising real-time processing through quantisation and speculative decoding. The discussion extends to Whisper’s multilingual capabilities, searchable audio features, and future innovations. Additionally, it highlights methods for...
