Speech-to-Text Playground

Overview

The SipPulse AI speech-to-text playground lets you test and compare the audio transcription models available on the platform. In this interactive environment you can convert audio files to text, choose among several output formats, and have the transcription translated automatically.

Key Features

Audio Upload

In the playground, you can upload audio files in two ways:

  • Drag and Drop: Drag the audio file to the designated area.
  • Select File: Click on the designated area to choose an audio file from your device.

Parameter Settings

After you select a model, the playground shows the parameters you can adjust. These may include:

  • Model: Choose the desired audio transcription model. The available models are optimized for different tasks and languages.

  • Format: Choose the desired output format. The supported formats are:

    • Text: Simple transcription in text format.
    • JSON: Output in JSON format.
    • VTT: WebVTT format, used for video subtitles.
    • SRT: SubRip Subtitle format, also used for subtitles.
    • Verbose JSON: Detailed JSON, including additional information about the transcription.
  • Language: Select the language of the audio to be transcribed.

TIP

If the audio is in a different language from the one selected, the response is a translation of the audio into the selected language.
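The two subtitle formats listed above differ mainly in framing: SRT numbers each cue and puts a comma before the milliseconds, while WebVTT begins with a WEBVTT header line and uses a period. The Python sketch below illustrates that difference. The helper names (format_timestamp, to_srt, to_vtt) and the sample segments are hypothetical and are not part of SipPulse AI; the playground produces these formats for you.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Render a time offset in seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments):
    """Build SubRip text: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"


def to_vtt(segments):
    """Build WebVTT text: WEBVTT header, period before the milliseconds."""
    blocks = ["WEBVTT"]
    for start, end, text in segments:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"


# Made-up sample data: (start seconds, end seconds, text)
segments = [(0.0, 2.5, "Hello"), (2.5, 5.0, "World")]
print(to_srt(segments))
print(to_vtt(segments))
```

Choose SRT or VTT according to what your video player or subtitle tooling accepts; the transcribed text is the same in both.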

Instructions

You can add instructions that guide how the model transcribes the audio. This field is optional, but it can help produce results that better match your needs.

Test Execution

After adjusting the parameters and uploading the audio, you can start the test by clicking the Transcribe button. The model will process the audio and display the transcription in the selected format.

Code Visualization

The playground includes a View Code button, which shows how to integrate the tested model and parameters into your own applications. The integration code can be viewed in different languages, including Curl, Python, and JavaScript.

Usage Example

Suppose you want to transcribe an audio file using the whisper-1 model with a specific configuration:

  1. Select the Model: Choose whisper-1 from the model menu.
  2. Upload the Audio: Drag and drop the audio file into the designated area or click to select the file.
  3. Adjust the Parameters:
    • Format: Text
    • Language: Portuguese
    • Instruction: (Optional) "Transcribe with the highest accuracy possible."
  4. Execute the Test: Click Transcribe to see the audio transcription.
  5. View Code: Get the integration code by clicking View Code and choose your preferred language (Curl, Python, or JavaScript).
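If you keep the settings from the steps above in your own application, it can help to validate them before sending a request. The Python sketch below is one possible way to do that. The class name, the field names (model, response_format, language, prompt), and the format identifiers are assumptions chosen for illustration, not the confirmed SipPulse AI API; use the exact endpoint and parameter names shown by View Code.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed identifiers for the five output formats described in this guide.
ALLOWED_FORMATS = {"text", "json", "vtt", "srt", "verbose_json"}


@dataclass
class TranscriptionSettings:
    """Hypothetical container for the playground parameters."""
    model: str
    response_format: str = "text"
    language: Optional[str] = None   # e.g. "pt" for Portuguese (ISO 639-1)
    prompt: Optional[str] = None     # optional instructions for the model

    def to_form_fields(self) -> dict:
        """Return the non-empty settings as form fields (names are assumed)."""
        if self.response_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {self.response_format!r}")
        fields = {"model": self.model, "response_format": self.response_format}
        if self.language:
            fields["language"] = self.language
        if self.prompt:
            fields["prompt"] = self.prompt
        return fields


settings = TranscriptionSettings(
    model="whisper-1",
    response_format="text",
    language="pt",
    prompt="Transcribe with the highest accuracy possible.",
)
print(settings.to_form_fields())
```

The resulting dictionary could be sent as form fields alongside the audio file by whichever HTTP client you use; the request itself is left out here because the endpoint is not documented in this guide.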

Conclusion

The speech-to-text playground lets you test and compare audio transcription models, configure them in detail, and see the results in real time. Use it to tune your audio transcription setup and to integrate the tested models into your applications.