
latent-to-image #469


Merged

merged 6 commits into huggingface:main on Feb 12, 2025

Conversation

hlky
Contributor

@hlky commented Feb 7, 2025

@@ -196,8 +198,12 @@ def check_inputs(inputs, tag):
IMAGE_OUTPUTS = {
    "image-to-image",
    "text-to-image",
    "latent-to-image",
Contributor

Not sure about the new tag, as it maps to existing tasks, and I think we probably don't want to create this as an official task while it's experimental. But maybe it's harmless for this to exist only in the inference API.

Contributor Author

Was added for

elif task in IMAGE_OUTPUTS:
    image = outputs
    image_format = parse_accept(accept, IMAGE)
    buffer = io.BytesIO()
    image.save(buffer, format=image_format.upper())
    buffer.seek(0)
    img_bytes = buffer.read()
    return Response(
        img_bytes,
        headers=headers,
        status_code=200,
        media_type=f"image/{image_format}",
    )

Seems to be the only usage of IMAGE_OUTPUTS

We also need the task in KNOWN_TASKS (via TENSOR_INPUTS)

if task not in KNOWN_TASKS:
    msg = f"The task `{task}` is not recognized by api-inference-community"
    logger.error(msg)
    # Special case: despite the fact that the task comes from environment (which could be considered a service
    # config error, thus triggering a 500), this var indirectly comes from the user
    # so we choose to have a 400 here
    return JSONResponse({"error": msg}, status_code=400)

Should be harmless for this to exist but I agree we don't want it to be an official task while this is experimental.
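
For context, the registration itself is small. A rough sketch of the idea, using hypothetical stand-ins for the existing task sets (only TENSOR_INPUTS and KNOWN_TASKS are names that actually appear in this discussion, and how KNOWN_TASKS is really composed may differ):

# hypothetical stand-ins for the existing task sets in api-inference-community
IMAGE_INPUTS = {"image-to-image"}
TEXT_INPUTS = {"text-to-image"}

# the new set referenced above: tasks whose inputs arrive as raw tensor data
TENSOR_INPUTS = {"latent-to-image"}

# KNOWN_TASKS has to include the new tag so the 400 check above passes
KNOWN_TASKS = IMAGE_INPUTS | TEXT_INPUTS | TENSOR_INPUTS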

Contributor

We can maybe find ways to "override" another task for the time being (without introducing a new task).

@apolinario requested a review from Narsil on February 7, 2025, 15:13
@@ -128,7 +129,9 @@ async def pipeline_route(request: Request) -> Response:
         return JSONResponse({"error": str(e)}, status_code=500)

     try:
-        inputs, params = normalize_payload(payload, task, sampling_rate=sampling_rate)
+        inputs, params = normalize_payload(
+            payload, task, sampling_rate=sampling_rate, headers=headers
Contributor

Probably not a good place to put the tensor shape in the headers. Why not just put it in the payload?

I know base64 isn't perfect, but I'd rather go with multipart upload than do something like this.
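
To make the multipart suggestion concrete, a client-side sketch of what such a request could look like (purely illustrative; this is not a format the API implements, and the part names are made up):

import json

import requests
import torch
from safetensors.torch import _tobytes

inputs = torch.randn([1, 4, 64, 64])
files = {
    # raw tensor bytes as a binary part, instead of headers or base64
    "inputs": ("inputs.bin", _tobytes(inputs, "inputs"), "application/octet-stream"),
}
data = {"parameters": json.dumps({"shape": list(inputs.shape), "dtype": "float32"})}
response = requests.post("http://127.0.0.1:8000/", files=files, data=data)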

Contributor

Also, changing this requires a release of the api-inference-community package, which I think will slow the whole process down (I will be much pickier about what actually gets merged, since it's published as a package and therefore harder to take down).

Can we start with base64 and switch to something more optimal afterwards?

Contributor Author

Changed to use base64.

Minimal example

import io
import base64
import requests
import torch
from PIL import Image
from safetensors.torch import _tobytes

# random latents with the SD1.5 latent shape [batch, channels, height/8, width/8]
inputs = torch.randn([1, 4, 64, 64], generator=torch.Generator().manual_seed(0))
shape = list(inputs.shape)
dtype = str(inputs.dtype).split(".")[-1]
# raw tensor bytes, base64-encoded so they can travel inside the JSON payload
tensor_data = base64.b64encode(_tobytes(inputs, "inputs")).decode("utf-8")
parameters = {"shape": shape, "dtype": dtype}
data = {"inputs": tensor_data, "parameters": parameters}

response = requests.post("http://127.0.0.1:8000/", json=data)
Image.open(io.BytesIO(response.content))

[attached: generated image]
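
For reference, reconstructing the tensor from such a payload on the server side could look roughly like this (a sketch only; the helper name is made up and the actual decoding inside normalize_payload may differ):

import base64

import numpy as np
import torch

def tensor_from_payload(inputs: str, shape: list, dtype: str) -> torch.Tensor:
    # invert the client-side encoding: base64 -> raw bytes -> typed array -> tensor
    raw = base64.b64decode(inputs)
    arr = np.frombuffer(raw, dtype=np.dtype(dtype)).reshape(shape)
    # copy because np.frombuffer returns a read-only view
    return torch.from_numpy(arr.copy())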

Generation example

Using the Lykon/dreamshaper-8 finetune here for generation; stable-diffusion-v1-5/stable-diffusion-v1-5 can be used for the VAE.

import io
import base64
import requests
import torch
from PIL import Image
from safetensors.torch import _tobytes

from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "Lykon/dreamshaper-8",
    safety_checker=None,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

prompt = "a cute cat sitting beside the sea beach."
inputs = pipeline(
    prompt=prompt,
    negative_prompt="bad quality, worse quality, degenerate quality",
    output_type="latent",
).images
shape = list(inputs.shape)
dtype = str(inputs.dtype).split(".")[-1]
tensor_data = base64.b64encode(_tobytes(inputs, "inputs")).decode("utf-8")
parameters = {"shape": shape, "dtype": dtype}
data = {"inputs": tensor_data, "parameters": parameters}

response = requests.post("http://127.0.0.1:8000/", json=data)
Image.open(io.BytesIO(response.content))

[attached: generated image]
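
Since the route picks the output format from the Accept header via parse_accept (see the IMAGE_OUTPUTS excerpt above), the client can presumably ask for a specific format; a small sketch, continuing from the snippet above and assuming image/png is honored:

# continuing from the generation example; assumption: parse_accept honors an explicit Accept header
response = requests.post(
    "http://127.0.0.1:8000/",
    json=data,
    headers={"Accept": "image/png"},
)
Image.open(io.BytesIO(response.content)).save("decoded.png")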

Comment on lines 432 to 433
if sys.byteorder == "big":
    arr = torch.from_numpy(arr.numpy().byteswap(inplace=False))
Contributor

We can definitely assume byteorder is little.
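
If that assumption is made explicit, the byteswap branch can be dropped in favor of a guard; a possible sketch (not necessarily what was merged):

import sys

# assumption: only little-endian hosts are supported, so fail loudly instead of byteswapping
if sys.byteorder != "little":
    raise RuntimeError("big-endian hosts are not supported")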

hlky and others added 3 commits February 10, 2025 15:37
@Narsil merged commit 79e3e17 into huggingface:main Feb 12, 2025
2 checks passed