Multimodal
On this page, we’ll dive into the different types of multimodal tasks you can run on PeerAI.
These tasks are all powered by Hugging Face's Transformers.js library. You can use any of the pre-trained ONNX models published on the Hugging Face model hub, or supply your own.
Headers
- Name: x-api-group (optional)
  Type: string, default 'main'
  Description: The ID of the PeerAI compute group you want to run this compute on.
- Name: x-api-key
  Type: string
  Description: The API key for your PeerAI account.
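The two headers above can be assembled in a small helper. This is a minimal Python sketch, not an official client: `build_headers` is a hypothetical name, and it assumes the service applies the documented default group ('main') when `X-API-Group` is omitted.

```python
from typing import Dict, Optional


def build_headers(api_key: str, group: Optional[str] = None) -> Dict[str, str]:
    """Build the HTTP headers for a pipeline request (hypothetical helper)."""
    if not api_key:
        raise ValueError("api_key is required")
    headers = {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
    }
    # X-API-Group is optional. Assumption: when it is left out, the service
    # falls back to the documented default group, 'main'.
    if group is not None:
        headers["X-API-Group"] = group
    return headers
```

Passing the group explicitly, as the curl examples on this page do, avoids relying on that assumption.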
Image Captioning
The Image Captioning pipeline uses a pre-trained model to generate a text caption for an image.
Body
- Name: task
  Type: string
  Description: The task of the pipeline. Use 'image-to-text' for this pipeline.
- Name: model (optional)
  Type: string, default null
  Description: The name of the pre-trained model to use. If not specified, the default model for the task will be used.
- Name: inputs.0
  Type: string
  Description: The URL of the image to analyze.
Request
curl -X POST https://api.peer-ai.com/v1/pipeline \
-H "X-API-Group: {YOUR_COMPUTE_GROUP}" \
-H "X-API-Key: {YOUR_API_KEY}" \
-H "Content-Type: application/json" \
-d "{\"task\": \"image-to-text\", \"inputs\": [\"https://example.com/image.jpg\"]}"
Response
[
{
"generated_text": "a brown and white striped zebra laying on a tree stump"
}
]
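The same request can be sent from Python using only the standard library. This is a sketch under stated assumptions rather than an official client: `build_caption_request` and `caption_image` are hypothetical helper names, the endpoint and headers are copied from the curl example, and the optional `model` field is assumed to sit at the top level of the JSON body alongside `task` and `inputs`.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

# Endpoint taken from the curl example above.
PIPELINE_URL = "https://api.peer-ai.com/v1/pipeline"


def build_caption_request(image_url: str, model: Optional[str] = None) -> Dict[str, Any]:
    """Build the JSON body for an image-to-text request (hypothetical helper)."""
    body: Dict[str, Any] = {"task": "image-to-text", "inputs": [image_url]}
    # "model" is optional; leaving it out lets the service use its default model.
    if model is not None:
        body["model"] = model
    return body


def caption_image(image_url: str, api_key: str, group: str = "main",
                  model: Optional[str] = None) -> List[Dict[str, Any]]:
    """POST the request and return the parsed JSON response (hypothetical helper)."""
    request = urllib.request.Request(
        PIPELINE_URL,
        data=json.dumps(build_caption_request(image_url, model)).encode("utf-8"),
        headers={
            "X-API-Group": group,
            "X-API-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

If the response has the shape shown above, the caption is available as `caption_image(url, api_key="YOUR_API_KEY")[0]["generated_text"]`.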
Zero-Shot Image Classification
Zero-shot image classification assigns an image to one of a set of target labels without training the model on labeled examples of those labels. You supply the labels at request time, and they do not need to have been part of the model's training data.
Body
- Name: task
  Type: string
  Description: The task of the pipeline. Use 'zero-shot-image-classification' for this pipeline.
- Name: model (optional)
  Type: string, default null
  Description: The name of the pre-trained model to use. If not specified, the default model for the task will be used.
- Name: inputs.0
  Type: string
  Description: The URL of the image to classify.
- Name: inputs.1
  Type: array
  Description: An array of target labels to classify the image into.
Request
curl -X POST https://api.peer-ai.com/v1/pipeline \
-H "X-API-Group: {YOUR_COMPUTE_GROUP}" \
-H "X-API-Key: {YOUR_API_KEY}" \
-H "Content-Type: application/json" \
-d "{\"task\": \"zero-shot-image-classification\", \"inputs\": [\"https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg\", [\"tiger\", \"cat\", \"dog\"]]}"
Response
[
{
"score": 0.995894193649292,
"label": "tiger"
},
{
"score": 0.003875702852383256,
"label": "cat"
},
{
"score": 0.00023012972087599337,
"label": "dog"
}
]
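The request body and response above can be handled in Python as follows. This is a minimal sketch: `build_zero_shot_request` and `top_label` are hypothetical helper names, and the response is assumed to be a list of `{"score", "label"}` objects as in the example. The code only builds and parses JSON; sending the request is left to whichever HTTP client you use.

```python
from typing import Any, Dict, List


def build_zero_shot_request(image_url: str, labels: List[str]) -> Dict[str, Any]:
    """Build the JSON body for a zero-shot image classification request."""
    if not labels:
        raise ValueError("at least one target label is required")
    # inputs.0 is the image URL, inputs.1 is the array of target labels.
    return {
        "task": "zero-shot-image-classification",
        "inputs": [image_url, list(labels)],
    }


def top_label(results: List[Dict[str, Any]]) -> str:
    """Return the label with the highest score from a parsed response."""
    # Taking the maximum explicitly avoids relying on the response order.
    return max(results, key=lambda item: item["score"])["label"]


# Response copied from the example above.
sample_response = [
    {"score": 0.995894193649292, "label": "tiger"},
    {"score": 0.003875702852383256, "label": "cat"},
    {"score": 0.00023012972087599337, "label": "dog"},
]
best = top_label(sample_response)  # "tiger"
```

In the example response the results appear in descending score order, but this page does not state that as a guarantee, so the sketch does not depend on it.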
Feature Extraction
Transforming raw data into numerical features that can be processed while preserving the information in the original dataset.
Coming Soon
Document Question Answering
Answering questions on document images.
Coming Soon
Visual Question Answering
Answering questions on images.
Coming Soon