ASR

The ASR endpoint allows you to transcribe audio files into text using models available on the SipPulse AI platform. You can submit an audio file and get the corresponding transcription.

Endpoints

GET /asr/models

This endpoint returns a list of all available ASR models.

Query Parameters

  • status (optional): The status of the models. Can be active or inactive. The default is active.
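
The examples in this section call the endpoint without the status parameter. As a minimal sketch of how the filter could be added to the URL, the helper below builds the query string with the standard library only. The names `BASE_URL` and `build_models_url` are illustrative assumptions, not part of the SipPulse AI API.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.sippulse.ai/v1"


def build_models_url(status=None):
    """Build the GET /asr/models URL, adding ?status=... only when given."""
    if status is not None and status not in ("active", "inactive"):
        raise ValueError("status must be 'active' or 'inactive'")
    url = f"{BASE_URL}/asr/models"
    if status is None:
        # No parameter sent: the documented default (active) applies.
        return url
    return f"{url}?{urlencode({'status': status})}"


print(build_models_url("inactive"))
# prints https://api.sippulse.ai/v1/asr/models?status=inactive
```

The resulting URL can be passed to any HTTP client together with the api-key header.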

Example Request

bash
curl -X 'GET' \
  'https://api.sippulse.ai/v1/asr/models' \
  -H 'Content-Type: application/json' \
  -H "api-key: $SIPPULSE_API_KEY"
python
import os
import requests

url = "https://api.sippulse.ai/v1/asr/models"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment instead of hard-coding it.
    "api-key": os.environ["SIPPULSE_API_KEY"]
}

response = requests.get(url, headers=headers)
print(response.json())
javascript
const url = "https://api.sippulse.ai/v1/asr/models";
const headers = {
  "Content-Type": "application/json",
  "api-key": "SIPPULSE_API_KEY",
};

fetch(url, {
  method: "GET",
  headers: headers,
})
  .then((response) => response.json())
  .then((data) => console.log(data))
  .catch((error) => console.error("Error:", error));

Response Example

json
[
  {
    "name": "whisper-1",
    "status": "active"
    // ...
  }
]
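
Since the response is a list of model objects, a client will often want only the names it can pass as the model parameter of the transcription endpoint. The sketch below assumes each entry carries at least the `name` and `status` fields shown above; the helper name `active_model_names` and the second sample entry are hypothetical.

```python
def active_model_names(models):
    """Return the names of the models whose status is "active"."""
    return [m["name"] for m in models if m.get("status") == "active"]


# Sample data shaped like the response above; the second entry is made up.
sample_models = [
    {"name": "whisper-1", "status": "active"},
    {"name": "example-retired-model", "status": "inactive"},
]

print(active_model_names(sample_models))
# prints ['whisper-1']
```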

POST /asr/transcribe

This endpoint transcribes an audio file into text using the specified model.

Query Parameters

  • model (required): The name of the model to be used for transcription.
  • language (optional): The language of the audio.
  • prompt (optional): A text prompt to guide the transcription.
  • temperature (optional): Controls the randomness of the transcription.
  • response_format (optional): The response format (text, json, vtt, srt, verbose_json).
  • input_sample_rate (optional): If the input file is a PCM file, specify the sample rate. The default is 8000.
  • instance (optional): If you want to use a specific instance, specify it here. Applicable only for instance models.
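
Because only model is required, it can help to assemble the query parameters in one place and leave out anything that was not set. The following is a minimal sketch under that assumption; the function name `build_transcribe_params` and the client-side check on response_format are illustrative, and the server remains the authority on which values are valid.

```python
# Response formats listed in the parameter description above.
ALLOWED_FORMATS = {"text", "json", "vtt", "srt", "verbose_json"}


def build_transcribe_params(model, language=None, prompt=None, temperature=None,
                            response_format=None, input_sample_rate=None,
                            instance=None):
    """Return a dict of query parameters, omitting every unset optional one."""
    if not model:
        raise ValueError("model is required")
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    candidates = {
        "model": model,
        "language": language,
        "prompt": prompt,
        "temperature": temperature,
        "response_format": response_format,
        "input_sample_rate": input_sample_rate,
        "instance": instance,
    }
    # Drop unset values so the server applies its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}


print(build_transcribe_params("whisper-1", language="en", response_format="json"))
# prints {'model': 'whisper-1', 'language': 'en', 'response_format': 'json'}
```

The returned dict can be passed as the `params` argument of `requests.post`.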

Request Body

The request body should be sent as multipart/form-data and should include the audio file to be transcribed.

json
{
  "file": "binary" // The audio file to be transcribed
}
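
Note that the body is not sent as JSON on the wire; the snippet above only names the form field. To illustrate what a multipart/form-data body with a single file field looks like, here is a hand-rolled sketch using only the standard library. HTTP clients such as curl or requests normally build this for you, so the helper `encode_multipart_file` is purely illustrative.

```python
import uuid


def encode_multipart_file(field_name, filename, data,
                          content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


# Placeholder bytes stand in for real audio data.
body, header_value = encode_multipart_file("file", "audiofile.wav", b"RIFF", "audio/wav")
print(header_value.split(";")[0])
# prints multipart/form-data
```

The field name must be `file`, matching the request body description above.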

Request Example

bash
curl -X 'POST' \
  'https://api.sippulse.ai/v1/asr/transcribe?model=whisper-1&language=en&response_format=json' \
  -H "api-key: $SIPPULSE_API_KEY" \
  -F 'file=@path/to/audiofile.wav'
python
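import os
import requests

url = "https://api.sippulse.ai/v1/asr/transcribe"
params = {
    "model": "whisper-1",
    "language": "en",
    "response_format": "json"
}
headers = {
    # Read the key from the environment instead of hard-coding it.
    "api-key": os.environ["SIPPULSE_API_KEY"]
}

# Open the file in a context manager so it is closed after the upload.
with open("path/to/audiofile.wav", "rb") as audio_file:
    files = {"file": audio_file}
    response = requests.post(url, headers=headers, params=params, files=files)

print(response.json())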
javascript
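const fs = require("fs");

const url =
  "https://api.sippulse.ai/v1/asr/transcribe?model=whisper-1&language=en&response_format=json";
const headers = {
  // Read the key from the environment instead of hard-coding it.
  "api-key": process.env.SIPPULSE_API_KEY,
};

// The built-in FormData (Node 18+) expects a Blob, not a read stream,
// so load the file into memory and wrap it in a Blob.
const fileBuffer = fs.readFileSync("path/to/audiofile.wav");

const formData = new FormData();
formData.append("file", new Blob([fileBuffer]), "audiofile.wav");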

fetch(url, {
  method: "POST",
  headers: headers,
  body: formData,
})
  .then((response) => response.json())
  .then((data) => console.log(data))
  .catch((error) => console.error("Error:", error));

Response Example

json
{
  "text": "This is the transcribed text of the audio file."
}
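
The example above corresponds to response_format=json. As a hedged sketch of handling the other formats, the helper below assumes that json and verbose_json return a JSON object with a `text` field, and that text, vtt and srt return the content directly as a plain-text body. That second assumption is not confirmed by this page, and the name `extract_transcript` is hypothetical.

```python
import json


def extract_transcript(body, response_format):
    """Return the transcript (or subtitle text) from a raw response body."""
    if response_format in ("json", "verbose_json"):
        # JSON formats: the transcript is in the "text" field.
        return json.loads(body)["text"]
    # Assumed: text, vtt and srt bodies are already the final content.
    return body


raw = '{"text": "This is the transcribed text of the audio file."}'
print(extract_transcript(raw, "json"))
# prints This is the transcribed text of the audio file.
```

With requests, `response.text` can be passed as `body`.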

Conclusion

The ASR endpoint of SipPulse AI offers an efficient way to transcribe audio files into text, supporting various parameters to customize the transcription. Use the provided information and examples to integrate this functionality into your applications.