Commit 9fd64b7

docs(ai): add price
1 parent 026c8e3 commit 9fd64b7

1 file changed, +40 -23 lines changed

ai/pipelines/image-to-text.mdx

+40 -23
@@ -4,7 +4,10 @@ title: Image-to-Text
 
 ## Overview
 
-The `image-to-text` pipeline converts images into text captions. This pipeline is powered by the latest models in the HuggingFace [text-to-image](https://huggingface.co/models?pipeline_tag=text-to-image) pipeline.
+The `image-to-text` pipeline converts images into text captions. This pipeline
+is powered by the latest models in the HuggingFace
+[text-to-image](https://huggingface.co/models?pipeline_tag=text-to-image)
+pipeline.
 
 <div align="center">
 
@@ -19,10 +22,10 @@ The current warm model requested for the `image-to-text` pipeline is:
 - [Salesforce/blip-image-captioning-large](https://huggingface.co/Salesforce/blip-image-captioning-large)
 
 <Tip>
-For faster responses with different
-[image-to-text](https://huggingface.co/models?pipeline_tag=text-to-image)
-diffusion models, ask Orchestrators to load it on their GPU via the `ai-video`
-channel in [Discord Server](https://discord.gg/livepeer).
+For faster responses with different
+[image-to-text](https://huggingface.co/models?pipeline_tag=text-to-image)
+diffusion models, ask Orchestrators to load it on their GPU via the `ai-video`
+channel in [Discord Server](https://discord.gg/livepeer).
 </Tip>
 
 ### On-Demand Models
@@ -31,9 +34,9 @@ The following models have been tested and verified for the `image-to-text`
 pipeline:
 
 <Note>
-If a specific model you wish to use is not listed, please submit a [feature
-request](https://github.com/livepeer/ai-worker/issues/new?assignees=&labels=enhancement%2Cmodel&projects=&template=model_request.yml)
-on GitHub to get the model verified and added to the list.
+If a specific model you wish to use is not listed, please submit a [feature
+request](https://github.com/livepeer/ai-worker/issues/new?assignees=&labels=enhancement%2Cmodel&projects=&template=model_request.yml)
+on GitHub to get the model verified and added to the list.
 </Note>
 
 {/* prettier-ignore */}
@@ -44,13 +47,13 @@ pipeline:
 ## Basic Usage Instructions
 
 <Tip>
-For a detailed understanding of the `image-to-text` endpoint and to experiment
-with the API, see the [Livepeer AI API
-Reference](/ai/api-reference/image-to-text).
+For a detailed understanding of the `image-to-text` endpoint and to experiment
+with the API, see the [Livepeer AI API
+Reference](/ai/api-reference/image-to-text).
 </Tip>
 
-To create an image caption using the `image-to-text` pipeline, submit a
-`POST` request to the Gateway's `image-to-text` API endpoint:
+To create an image caption using the `image-to-text` pipeline, submit a `POST`
+request to the Gateway's `image-to-text` API endpoint:
 
 ```bash
 curl -X POST "https://<GATEWAY_IP>/image-to-text" \
@@ -64,9 +67,7 @@ In this command:
 - `model_id` is the diffusion model to use.
 - `image` is the path to the image file to be captioned.
 
-<Note>
-Maximum request size: 50 MB
-</Note>
+<Note>Maximum request size: 50 MB</Note>
 
 For additional optional parameters, refer to the
 [Livepeer AI API Reference](/ai/api-reference/image-to-text).
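
The hunk above names two request fields (`model_id` and `image`) and a 50 MB request limit, but the diff view cuts the `curl` command short. The following is a minimal Python sketch of how such a request might be assembled. It assumes the Gateway accepts `multipart/form-data` (not confirmed by the visible lines) and reads "50 MB" as 50,000,000 bytes. The helper name `build_caption_request` and the constant `MAX_REQUEST_BYTES` are hypothetical and are not part of any Livepeer client library.

```python
import uuid
from urllib import request

# Assumption: "50 MB" is interpreted as 50,000,000 bytes; the docs do not say
# whether the limit is decimal or binary.
MAX_REQUEST_BYTES = 50_000_000


def build_caption_request(gateway_url, model_id, image_name, image_bytes):
    """Build (but do not send) a POST request for the image-to-text endpoint.

    Assumption: the Gateway accepts multipart/form-data with the two fields
    named in the docs, `model_id` and `image`.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model_id"\r\n\r\n'
        f"{model_id}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{image_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + image_bytes + tail

    # Reject oversized payloads locally instead of waiting for the Gateway.
    if len(body) > MAX_REQUEST_BYTES:
        raise ValueError(f"request is {len(body)} bytes, above the 50 MB limit")

    return request.Request(
        f"{gateway_url.rstrip('/')}/image-to-text",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

The returned object could then be passed to `urllib.request.urlopen`. The response format is not shown in this diff, so the sketch stops before parsing it.
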
@@ -80,16 +81,32 @@ the [Orchestrator Configuration](/ai/orchestrators/get-started) guide.
 
 The following system requirements are recommended for optimal performance:
 
-- [NVIDIA GPU](https://developer.nvidia.com/cuda-gpus) with **at least 12GB** of
-VRAM.
+- [NVIDIA GPU](https://developer.nvidia.com/cuda-gpus) with **at least 4GB** of
+VRAM.
+
+
+## Recommended Pipeline Pricing
+
+<Note>
+We are planning to simplify the pricing in the future so orchestrators can set
+one AI price per compute unit and have the system automatically scale based on
+the model's compute requirements.
+</Note>
+
+The pricing for the `image-to-text` pipeline is based on competitor pricing.
+However, we strongly encourage orchestrators to set their own pricing based on
+their costs and requirements. Setting a competitive price will help attract more
+jobs, as Gateways can set their maximum price for a job. The current recommended
+pricing for this pipeline is `2.5e-10 USD` per **input pixel**
+(`height * width`).
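
As a worked example of the per-pixel price added in this change, here is a minimal Python sketch. It assumes the recommended `2.5e-10 USD` is applied linearly to `height * width` of the input image, with no minimum fee or rounding (the docs state neither). The function name `estimate_job_cost_usd` is hypothetical and not part of any Livepeer API.

```python
# Recommended price from the docs change: 2.5e-10 USD per input pixel.
PRICE_PER_PIXEL_USD = 2.5e-10


def estimate_job_cost_usd(height, width, price_per_pixel=PRICE_PER_PIXEL_USD):
    """Estimate the cost of one captioning job as height * width * price.

    Assumption: pricing is strictly linear in the input pixel count.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    return height * width * price_per_pixel


if __name__ == "__main__":
    # Print estimates for a few common input sizes (height, width).
    for h, w in [(512, 512), (1024, 1024), (2160, 3840)]:
        print(f"{w}x{h}: {estimate_job_cost_usd(h, w):.8f} USD")
```

At this rate a 1024 x 1024 input (1,048,576 pixels) works out to roughly 0.00026 USD. An orchestrator who sets a different price can pass it as `price_per_pixel`.
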
 
 ## API Reference
 
 <Card
-title="API Reference"
-icon="rectangle-terminal"
-href="/ai/api-reference/image-to-text"
+title="API Reference"
+icon="rectangle-terminal"
+href="/ai/api-reference/image-to-text"
 >
-Explore the `image-to-text` endpoint and experiment with the API in the
-Livepeer AI API Reference.
+Explore the `image-to-text` endpoint and experiment with the API in the
+Livepeer AI API Reference.
 </Card>
