Transcribe audio with OpenAI Whisper API!


Introduction
Recording Audio in Bubble
Saving Audio to Bubble Storage
Retrieving Audio from Bubble Storage
Displaying Audio Files in a Repeating Group
Generating Transcripts with the OpenAI Whisper API
Troubleshooting File Formats
Separate Workflow Actions for Saving and Generating Transcripts
Potential Delay Issues with File Accessibility
Conclusion

Introduction

In this article, we will learn how to record audio in Bubble and use the OpenAI Whisper API to generate transcripts. We will walk through recording audio, saving it to Bubble storage, retrieving the audio files, displaying them in a repeating group, and generating transcripts with the Whisper API. We will also cover two common troubleshooting issues: file formats and delays in file accessibility during the workflow.

Recording Audio in Bubble

To record audio in Bubble, we will use Bubble's own Audio Recorder and Visualizer plugin. Other audio recording plugins are available, but this one is free and sufficient for a demonstration. Note that this recorder saves audio in WAV format, which produces larger files than MP3. That is acceptable for our purposes.

Saving Audio to Bubble Storage

To save recorded audio to Bubble storage, we need to configure the workflow. First, we set up the "start stop audio recorder" action and add a second action for saving the audio. The save button, provided by the plugin, lets us upload the recorded audio to Bubble storage, which is backed by AWS S3. This step ensures that the audio file is saved within our Bubble app and can be accessed later.

Retrieving Audio from Bubble Storage

To retrieve the saved audio file from Bubble storage, we create a data type called "audio recording." Within this data type, we add a field of type file and set it to the result of the previous step. This stores a reference to the uploaded file in our database, so the app can retrieve the file when needed.

Displaying Audio Files in a Repeating Group

To display all entries of the "audio recording" data type, we use Bubble's repeating group element. Within each cell, we can print the audio recording's file URL. Note, however, that the stored URL may not start with "https:". Bubble typically returns a protocol-relative URL beginning with "//", so "https:" has to be prepended before the URL is handed to an external service.
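The URL fix described above can be sketched outside Bubble as a small helper. This is a minimal illustration, assuming the stored value is either a protocol-relative URL ("//...") or already absolute; the function name `normalize_file_url` is hypothetical and not part of Bubble.

```python
def normalize_file_url(url: str) -> str:
    """Return an absolute https URL for a stored file URL.

    Assumption: the only malformed case is a protocol-relative URL
    that starts with "//". Anything else is returned unchanged.
    """
    url = url.strip()
    if url.startswith("//"):
        # Protocol-relative URL: prepend the scheme.
        return "https:" + url
    return url
```

Inside Bubble itself, the same effect is usually achieved by typing "https:" in front of the dynamic file URL expression.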

Generating Transcripts with the OpenAI Whisper API

To generate transcripts using the OpenAI Whisper API, we add a button labeled "get transcript" to each cell of the repeating group. Clicking the button triggers a workflow that calls the Whisper API and saves the response as text on the corresponding audio recording entry. The transcript can then be retrieved and displayed alongside the recording.
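For readers who want to see what the underlying request looks like outside Bubble, here is a rough sketch using only the Python standard library. It assumes the OpenAI transcription endpoint at `/v1/audio/transcriptions`, the `whisper-1` model, and a JSON response with a `text` field. The helper names, the placeholder API key, and the generic `application/octet-stream` content type are illustrative choices, not something the article specifies.

```python
import json
import urllib.request
import uuid

WHISPER_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(
    audio: bytes, filename: str, api_key: str
) -> urllib.request.Request:
    """Build a multipart/form-data POST with the audio file and model name."""
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        "whisper-1\r\n"
    ).encode()
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        WHISPER_URL,
        data=model_part + file_header + audio + closing,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(audio: bytes, filename: str, api_key: str) -> str:
    """Send the request and return the 'text' field of the JSON response."""
    request = build_transcription_request(audio, filename, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["text"]
```

In the Bubble app, the equivalent step is the API call triggered by the "get transcript" button, with the returned text saved on the audio recording entry.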

Troubleshooting File Formats

One issue that may arise during setup is providing the file in a form the Whisper API accepts. The Whisper API call needs a publicly accessible audio or video file in one of its supported formats, which include MP3, MP4, M4A, WAV, and WebM. The WAV files produced by the recorder are therefore acceptable, provided the file URL is complete and reachable from our Bubble app's workflow.
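A pre-flight check along these lines can catch format problems before the API is called. The sketch below assumes the extension list and the 25 MB upload limit documented by OpenAI at the time of writing; both should be verified against the current API reference. The function name and the exact byte count used for the limit are illustrative.

```python
import os

# Assumption: formats and size limit as documented by OpenAI; verify before relying on them.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
    return problems
```

Because WAV recordings grow quickly, the size check matters more here than it would with a compressed format such as MP3.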

Separate Workflow Actions for Saving and Generating Transcripts

Initially, we attempted to set up a single workflow that started the recording, stopped it, saved it to the database, and sent it directly to the Whisper API. This approach produced errors suggesting that the file provided to the Whisper API was inaccessible or in the wrong format. To overcome this, we split the workflow in two: one workflow saves the audio, and a separate one generates the transcript.

Potential Delay Issues with File Accessibility

Delays in file accessibility can occur, primarily when the file URL is submitted to the Whisper API. Bubble may pass the file URL on slightly before the uploaded file can actually be fetched. To mitigate this, we break the workflow into a save step and a separate generate-transcript step. By the time the user requests a transcript, the file is fully accessible, and transcript generation succeeds.
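The article's fix is to split the workflow. As a different, complementary approach for code running outside Bubble, one could poll the file URL until it responds before calling the API. The sketch below is an assumption-laden illustration: the function names, the HEAD-request probe, and the retry counts are all hypothetical choices, not part of Bubble or the Whisper API.

```python
import time
import urllib.request
from typing import Callable


def url_is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request to the URL succeeds with a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers URLError, HTTPError, and socket timeouts.
        return False


def wait_until_accessible(
    url: str,
    probe: Callable[[str], bool] = url_is_reachable,
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe the URL up to `attempts` times, pausing between failed tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

The `probe` and `sleep` parameters are injectable so the retry logic can be exercised without network access or real waiting.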

Conclusion

This article has walked through recording audio in Bubble and integrating the OpenAI Whisper API to generate transcripts. We covered saving audio to Bubble storage, retrieving the audio files, displaying them in a repeating group, and troubleshooting file format and file accessibility issues. By following these steps, users can add audio recording and transcription to their Bubble applications, improving the accessibility of audio content.
