Audio

On this page, we’ll dive into the different audio tasks you can use.

These tasks are all powered by Hugging Face's Transformers.js library. You can use any pre-trained ONNX model from the Hugging Face Hub or supply your own.

Headers

Name: x-api-group (optional)
Type: string, default 'main'
Description: The ID of the PeerAI compute group that should run this request.

Name: x-api-key
Type: string
Description: The API key for your PeerAI account.
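
As a minimal sketch of how a client might assemble these headers, the Python helper below builds them as a dictionary. The name `build_headers` is hypothetical and not part of any PeerAI SDK. The sketch assumes that leaving out `x-api-group` makes the server fall back to the documented default of 'main'.

```python
def build_headers(api_key, group=None):
    """Return request headers for the PeerAI API (illustrative helper, not an official SDK call)."""
    if not api_key:
        raise ValueError("an API key is required")
    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }
    # x-api-group is optional; when it is omitted we assume the server uses 'main'.
    if group is not None:
        headers["x-api-group"] = group
    return headers
```

HTTP header names are case-insensitive, so `x-api-key` and `X-API-Key` are equivalent.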

Automatic Speech Recognition

Automatic Speech Recognition (ASR) is the technology that converts spoken language into written text. It is commonly used in applications such as transcription services and voice assistants.

Body

Name: task
Type: string
Description: The pipeline task to run, e.g. 'automatic-speech-recognition' or 'text-to-speech'.

Name: model (optional)
Type: string, default null
Description: The name of the pre-trained model to use. If not specified, the default model for the task is used.

Name: inputs.0
Type: string
Description: The URL or path to the audio file to transcribe.
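
To make the body layout concrete, here is a small Python sketch that serializes these fields to JSON. The function name `build_asr_body` is hypothetical. It reads `inputs.0` as the first element of an `inputs` array, and it assumes that omitting `model` is equivalent to sending null.

```python
import json


def build_asr_body(audio, model=None):
    """Serialize an automatic-speech-recognition request body (illustrative helper)."""
    body = {
        "task": "automatic-speech-recognition",
        # inputs.0: the audio URL or path is the first element of the inputs array.
        "inputs": [audio],
    }
    # model is optional; leaving it out is assumed to select the task's default model.
    if model is not None:
        body["model"] = model
    return json.dumps(body)
```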

Request

curl -X POST https://api.peer-ai.com/v1/pipeline \
  -H "X-API-Group: {YOUR_COMPUTE_GROUP}" \
  -H "X-API-Key: {YOUR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"task": "automatic-speech-recognition", "model": "Xenova/whisper-small.en", "inputs": ["https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"]}'

Response

{
  "text": "I have a dream that one day this nation will rise up and live out the true meaning of its creed."
}
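
The same request can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not an official client: the function name `transcribe` and its `send` parameter are hypothetical, and the code assumes that a successful response is a JSON object with a `text` field, as in the sample response. Error handling for non-2xx responses is left out.

```python
import json
import urllib.request

API_URL = "https://api.peer-ai.com/v1/pipeline"  # endpoint taken from the curl example


def transcribe(audio_url, api_key, group="main", model=None,
               send=urllib.request.urlopen):
    """POST an ASR request and return the transcribed text (illustrative sketch)."""
    payload = {"task": "automatic-speech-recognition", "inputs": [audio_url]}
    if model is not None:
        payload["model"] = model

    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Group": group,
            "X-API-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # `send` is injectable so the function can be exercised without network access.
    with send(request) as response:
        result = json.loads(response.read().decode("utf-8"))
    # Assumption: the response body is a JSON object with a "text" field.
    return result["text"]
```

Calling `transcribe` with the audio URL from the curl example, your own API key, and `model="Xenova/whisper-small.en"` would send the same request as the curl command.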

Text-to-Speech

Coming Soon

Audio Classification

Coming Soon

Audio-to-Audio

Coming Soon