
Universal Reader having issues with anything but image paths #77

Open
@mintary

Description

Hi there, forewarning that I have very little experience with image processing and Torch. I have not touched the configuration files so far. I am trying to pass URLs to the analyzer, but I keep running into the following error:

"File \"C:\\Users\\win1\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\torchvision\\transforms\\_functional_tensor.py\", line 926, in normalize 
if std.ndim == 1:        
std = std.view(-1, 1, 1)
return tensor.sub_(mean).div_(std)
~~~~~~~~~~~ <--- HERE
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 1"

With the configuration variables batch_size, fix_img_size, return_img_data, and include_tensors replaced by their literal values, this is my current endpoint:

@app.post("/predict/")
async def predict(url: str):
    try: 
        response = analyzer.run(
            image_source=url,
            batch_size=8,
            fix_img_size=True,
            return_img_data=False,
            include_tensors=True,
            path_output=None,
        )

Here's an example output:

{"asctime": "2024-08-16 14:57:30,765", "levelname": "INFO", "message": "Running FaceAnalyzer", "taskName": "Task-12"}
{"asctime": "2024-08-16 14:57:30,766", "levelname": "INFO", "message": "Reading image", "taskName": "Task-12", "input": "https://image-cdn.essentiallysports.com/wp-content/uploads/20200606234527/the-rock-dwayne-johnson-muscles-740x662.png"}
{"asctime": "2024-08-16 14:57:33,463", "levelname": "INFO", "message": "Detecting faces", "taskName": "Task-12"}
INFO:     127.0.0.1:63490 - "POST /predict/?url=https%3A%2F%2Fimage-cdn.essentiallysports.com%2Fwp-content%2Fuploads%2F20200606234527%2Fthe-rock-dwayne-johnson-muscles-740x662.png HTTP/1.1" 500 Internal Server Error
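
The "4 versus 3" mismatch in the traceback is consistent with an image that carries a fourth (alpha) channel being normalized with 3-channel mean and std values, which PNG files often trigger. Whether the PNG at the URL above actually has an alpha channel has not been checked here. A minimal sketch, assuming Pillow is installed, of forcing downloaded bytes into 3-channel RGB before they reach the analyzer; `to_rgb` and `make_rgba_png` are hypothetical helper names, not part of the library:

```python
from io import BytesIO

from PIL import Image


def to_rgb(image_bytes: bytes) -> Image.Image:
    """Decode raw image bytes and return a 3-channel RGB image.

    Hypothetical helper for illustration; not part of the analyzer's API.
    """
    image = Image.open(BytesIO(image_bytes))
    # convert("RGB") drops the alpha channel of RGBA images and also
    # expands palette ("P") and grayscale ("L") images to three channels.
    return image.convert("RGB")


def make_rgba_png(size=(4, 4)) -> bytes:
    """Build a tiny in-memory RGBA PNG, standing in for a downloaded file."""
    buffer = BytesIO()
    Image.new("RGBA", size, (255, 0, 0, 128)).save(buffer, format="PNG")
    return buffer.getvalue()


if __name__ == "__main__":
    rgb_image = to_rgb(make_rgba_png())
    print(rgb_image.mode, rgb_image.getbands())
```

In the endpoint, the bytes would come from downloading the URL first (for example with an HTTP client of your choice, which is an assumption outside this sketch) rather than from `make_rgba_png`.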

I also tried passing a PIL Image.Image object directly. I first have to convert it to RGB, which makes sense: the same dimension-mismatch error pops up if I do not. However, even though the type matches, the detector does not find any faces, i.e.:

@app.post("/predict/")
async def predict(file: UploadFile = File(...)):
    try: 
        image = Image.open(BytesIO(await file.read())).convert('RGB')

        response = analyzer.run(
            image_source=image,
            batch_size=8,
            fix_img_size=True,
            return_img_data=False,
            include_tensors=True,
            path_output=None,
        )

With output:

{"asctime": "2024-08-16 15:21:51,393", "levelname": "INFO", "message": "Running FaceAnalyzer", "taskName": "Task-7"}
{"asctime": "2024-08-16 15:21:51,393", "levelname": "INFO", "message": "Reading image", "taskName": "Task-7", "input": "<PIL.Image.Image image mode=RGB size=978x605 at 0x2446A6248C0>"}
{"asctime": "2024-08-16 15:21:51,678", "levelname": "INFO", "message": "Detecting faces", "taskName": "Task-7"}
{"asctime": "2024-08-16 15:22:00,407", "levelname": "INFO", "message": "Number of faces: 0", "taskName": "Task-7"}
Response(faces=[], version='0.5.0')

Would appreciate any help! The reader works perfectly fine if an image path is given.
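
Since the reader does work with image paths, one possible stopgap is to write the uploaded or downloaded bytes to a temporary file and pass that path as `image_source`. This is only a workaround sketch under that assumption, not a fix for the reader itself; `bytes_to_temp_path` is a hypothetical helper name:

```python
import os
import tempfile


def bytes_to_temp_path(data: bytes, suffix: str = ".png") -> str:
    """Write raw image bytes to a temporary file and return its path.

    Hypothetical helper for illustration. The caller is responsible for
    deleting the file once the analyzer has finished with it.
    """
    # mkstemp returns an already-open file descriptor plus the path.
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


if __name__ == "__main__":
    path = bytes_to_temp_path(b"not a real image, just demo bytes")
    try:
        print(path)  # this string is what would be passed as image_source
    finally:
        os.remove(path)
```

`tempfile.mkstemp` is used instead of `NamedTemporaryFile` because the file is closed before anything else opens it, which sidesteps the Windows restriction on reopening a temporary file that is still open.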
