Whisper Lesson 0 – Introduction: OpenAI Whisper Speech-to-Text

Hi and welcome to this tutorial series on the OpenAI Whisper speech-to-text model. Whisper is a powerful automatic speech recognition system, and in this series we’re going to learn how to use it and build some cool projects along the way.

In part 1, we’ll cover the basics of setting up the Whisper library and using it to transcribe audio files on your local computer.
In part 2, we’ll build a podcast application with a user interface: the user enters a Google Podcasts link and gets back a transcript and summary of the podcast, plus subtitle files for good measure.
In part 3, we’ll move on to transcribing video files by building an application that takes any video file as input and outputs the same video with subtitles embedded in it.
Finally, in the last part, we’ll look at alternatives: first faster-whisper to speed things up, and then the Web-API version, which runs in the cloud. To show how the Web-API version works, we’ll build a final video-to-quiz application.

I hope you’re excited to learn about Whisper. Let’s get started!