Audio

On this page, we’ll dive into the different audio tasks you can use.

These tasks are all powered by Hugging Face's Transformers.js library. You can use any of the pre-trained ONNX models from the Hugging Face model hub, or use your own.

Headers

  • Name
    x-api-group (optional)
    Type
    string, default 'main'
    Description

    The ID of the PeerAI compute group to run this request on.

  • Name
    x-api-key
    Type
    string
    Description

    The API key for your PeerAI account.
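
As a minimal sketch of how these two headers might be assembled in client code, the Python helper below builds the header dictionary. The function name build_headers is hypothetical (it is not part of any PeerAI SDK), and the sketch assumes that leaving out x-api-group makes the server fall back to the documented default of 'main'.

```python
from typing import Dict, Optional


def build_headers(api_key: str, group: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for the pipeline endpoint (hypothetical helper)."""
    headers = {
        "x-api-key": api_key,                # required: your PeerAI API key
        "Content-Type": "application/json",  # the request body is JSON
    }
    # x-api-group is optional. Assumption: when it is omitted, the server
    # applies the documented default group, 'main'.
    if group is not None:
        headers["x-api-group"] = group
    return headers
```

Calling build_headers("my-key") returns only the API key and content type, while build_headers("my-key", "my-group") also sets x-api-group.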


Automatic Speech Recognition

Automatic Speech Recognition (ASR) is the technology that converts spoken language into written text. It is commonly used in applications like transcription services, voice assistants, and more.

Body

  • Name
    task
    Type
    string
    Description

    The pipeline task to run, e.g., 'automatic-speech-recognition' or 'text-to-speech'. For speech recognition, use 'automatic-speech-recognition'.

  • Name
    model (optional)
    Type
    string, default null
    Description

    The name of the pre-trained model to use. If not specified, the default model for the task will be used.

  • Name
    inputs.0
    Type
    string
    Description

    The URL or path to the audio file to transcribe, given as the first element of the inputs array.
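
The body fields above could be put together as in the following Python sketch. The helper name build_asr_body is hypothetical. The sketch assumes that an omitted model key is treated the same as null, so that the task's default model is used.

```python
import json
from typing import Any, Dict, Optional


def build_asr_body(audio_url: str, model: Optional[str] = None) -> Dict[str, Any]:
    """Build the JSON body for a speech recognition request (hypothetical helper)."""
    body: Dict[str, Any] = {
        "task": "automatic-speech-recognition",
        "inputs": [audio_url],  # inputs.0 is the audio file to transcribe
    }
    # model is optional. Assumption: leaving the key out is equivalent to
    # null, so the default model for the task is used.
    if model is not None:
        body["model"] = model
    return body


# Serialize the body with an explicit model, ready to send as the POST payload.
payload = json.dumps(
    build_asr_body(
        "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac",
        model="Xenova/whisper-small.en",
    )
)
```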

Request

curl -X POST https://api.peer-ai.com/v1/pipeline \
  -H "X-API-Group: {YOUR_COMPUTE_GROUP}" \
  -H "X-API-Key: {YOUR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d "{\"task\": \"automatic-speech-recognition\", \"model\": \"Xenova/whisper-small.en\", \"inputs\": [\"https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac\"]}"

Response

{
  "text": "I have a dream that one day this nation will rise up and live out the true meaning of its creed."
}
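
For readers calling the endpoint from Python rather than curl, the sketch below sends the same request with the standard library and extracts the text field from the response. The function names transcribe and parse_transcript are hypothetical. This page does not document the error response format, so the sketch only checks that a text field is present and lets urllib raise on non-2xx status codes.

```python
import json
import urllib.request
from typing import Optional

# Endpoint taken from the curl request on this page.
API_URL = "https://api.peer-ai.com/v1/pipeline"


def parse_transcript(raw: bytes) -> str:
    """Extract the transcript from a response body shaped like {"text": "..."}."""
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict) or "text" not in data:
        # Assumption: any response without a 'text' field is treated as an error.
        raise ValueError("response does not contain a 'text' field")
    return data["text"]


def transcribe(
    audio_url: str,
    api_key: str,
    group: str = "main",
    model: Optional[str] = None,
    timeout: float = 60.0,
) -> str:
    """POST a speech recognition request and return the transcript."""
    body = {"task": "automatic-speech-recognition", "inputs": [audio_url]}
    if model is not None:
        body["model"] = model
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-group": group,
            "x-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_transcript(response.read())
```

With a valid key, transcribe("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac", api_key="{YOUR_API_KEY}", model="Xenova/whisper-small.en") would be expected to return the transcript string shown in the response above.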

Text-to-Speech

Coming Soon


Audio Classification

Coming Soon


Audio-to-Audio

Coming Soon