Add a public from_tensor builder to OcrInput#159

Closed
Henauxg wants to merge 1 commit intorobertknight:mainfrom
Henauxg:main

Conversation

@Henauxg Henauxg commented Feb 5, 2025

Thank you for the library,

I have a situation where I want to prepare the image (as in prepare_image) from a custom data source that is not backed by an ImageSource or NdTensorView. Currently, the only way to prepare the input and create an OcrInput is from an ImageSource via prepare_input.

This PR adds a from_tensor builder to OcrInput, which allows a custom preparation step to build a tensor and then convert it into an OcrInput.

@robertknight
Owner

If you have an NdTensor you can get an NdTensorView from it using the view method, like this:

// CHW layout: channels first, then rows (height), then columns (width)
let mut tensor = NdTensor::zeros([chans, height, width]);
// Fill tensor

let ocr_input = ocr_engine.prepare_input(ImageSource::from_tensor(tensor.view(), DimOrder::Chw)?)?;
// Pass ocr_input to OCR engine methods

The one downside of this is that ImageSource::from_tensor will create a copy of the tensor in order to normalize the input range to whatever the model expects (currently this means going from a [0, 1] input to [-0.5, 0.5]).
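
To illustrate why this step needs its own buffer: shifting every value changes the data, so the result cannot alias the caller's tensor. The following is a minimal sketch of that range shift only. The helper name normalize_pixels is hypothetical and this is not the library's actual implementation.

```rust
/// Hypothetical helper: copy `pixels` (assumed to lie in [0, 1]) into a new
/// buffer whose values lie in [-0.5, 0.5]. The input is left untouched.
fn normalize_pixels(pixels: &[f32]) -> Vec<f32> {
    pixels.iter().map(|&p| p - 0.5).collect()
}
```

Calling normalize_pixels(&[0.0, 0.5, 1.0]) allocates a new Vec, which is the copy mentioned above.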

Is the issue here that it wasn't clear how to get an NdTensorView from a tensor, or was the performance of the copy, or the normalization, an issue?

@Henauxg
Author

Henauxg commented Feb 6, 2025

I was too hasty in discarding the NdTensorView type. My original data is stored in a big byte buffer (frame), and only a region (rect) of it is supposed to be processed.
I have successfully implemented what I needed with the following snippet with, I believe, zero data copy from frame into img_src (the only copy occurs during prepare_input, to create the grey image):

    // Index of the first byte of the rect's top-left pixel within the frame
    let offset = CHANNELS_COUNT * (rect.origin.y * frame.width + rect.origin.x);
    let storage = frame.data[offset as usize..].into_storage();
    let layout = NdLayout::from_shape_and_strides(
        // Shape in HWC order
        [
            rect.size.height as usize,
            rect.size.width as usize,
            CHANNELS_COUNT as usize,
        ],
        // Strides: a row step spans the full frame width, not the rect width
        [
            (frame.width * CHANNELS_COUNT) as usize,
            CHANNELS_COUNT as usize,
            1,
        ],
        OverlapPolicy::AllowOverlap,
    )?;
    let tensor_view = NdTensorView::from_storage_and_layout(storage, layout);
    let img_src = ImageSource::from_tensor(tensor_view, DimOrder::Hwc)?;
    let ocr_input = engine.prepare_input(img_src)?;
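
The offset and stride arithmetic in the snippet above can be checked in isolation. The sketch below uses only the standard library; StridedView, sub_region and get are hypothetical stand-ins for illustration and are not part of rten-tensor or ocrs. It assumes the rect lies entirely inside the frame and does not validate that.

```rust
/// Hypothetical stand-in for a strided 3D view (HWC order) over a byte buffer.
struct StridedView<'a> {
    data: &'a [u8],
    shape: [usize; 3],
    strides: [usize; 3],
}

impl<'a> StridedView<'a> {
    /// View the `rect_w` x `rect_h` region whose top-left pixel is (`x`, `y`)
    /// in a frame that is `frame_w` pixels wide. No bytes are copied.
    fn sub_region(
        frame: &'a [u8],
        frame_w: usize,
        channels: usize,
        x: usize,
        y: usize,
        rect_w: usize,
        rect_h: usize,
    ) -> Self {
        // First byte of the region's top-left pixel
        let offset = channels * (y * frame_w + x);
        StridedView {
            data: &frame[offset..],
            shape: [rect_h, rect_w, channels],
            // Moving down one row skips a full frame row, not a rect row
            strides: [frame_w * channels, channels, 1],
        }
    }

    /// Read one channel value at (`row`, `col`) relative to the region.
    fn get(&self, row: usize, col: usize, chan: usize) -> u8 {
        assert!(row < self.shape[0] && col < self.shape[1] && chan < self.shape[2]);
        self.data[row * self.strides[0] + col * self.strides[1] + chan * self.strides[2]]
    }
}
```

For a 4-pixel-wide, 3-channel frame, the region at (1, 1) starts at byte 3 * (1 * 4 + 1) = 15, and each step down a row advances by 12 bytes.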

Thank you for your time and sorry for the inconvenience,

@Henauxg Henauxg closed this Feb 6, 2025
@robertknight
Owner

> I have successfully implemented what I needed with the following snippet with, I believe, zero data copy from frame into img_src (the only copy occurs during prepare_input to create the grey image):

You are correct that there is no copy when creating img_src. You can also create an NdTensorView directly from a slice of elements using:

let tensor_view = NdTensorView::from_data(shape, slice);

Where slice.len() must match the product of shape. This works because slices implement the IntoStorage trait used by the second argument to from_data. The limitation is that this always uses default strides (the same as in your code snippet above).
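
As a rough illustration of the two constraints just described, the sketch below computes row-major ("default") strides for a 3D shape and checks the length requirement. Both function names are hypothetical and only mirror the rule stated above; they are not rten-tensor APIs.

```rust
/// Hypothetical helper: row-major (contiguous) strides for a 3D `shape`.
fn contiguous_strides(shape: [usize; 3]) -> [usize; 3] {
    [shape[1] * shape[2], shape[2], 1]
}

/// Hypothetical helper: true if a slice of `len` elements has exactly as many
/// elements as a tensor of `shape`.
fn len_matches_shape(len: usize, shape: [usize; 3]) -> bool {
    len == shape.iter().product::<usize>()
}
```

With an HWC shape of [2, 3, 4], the contiguous strides are [12, 4, 1] and the slice must hold exactly 24 elements.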

> Thank you for your time and sorry for the inconvenience,

There is no problem at all. I'm sure this issue will be a useful reference for someone else in future :)

@Henauxg
Author

Henauxg commented Feb 6, 2025

For reference, I don't think this last builder can be used in my case, as I am not using the default strides: the stride along the Y axis must span the full frame width rather than rect.width, because the rect is a sub-region of the whole frame.
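
If an extra copy were acceptable, one alternative would be to pack the sub-region into its own contiguous buffer, at which point default strides would describe it correctly. This is only a sketch of that trade-off: copy_region is a hypothetical helper, and it assumes the rect lies entirely inside the frame.

```rust
/// Hypothetical helper: copy the `rect_w` x `rect_h` region at (`x`, `y`) of an
/// HWC byte frame (`frame_w` pixels wide) into a new contiguous buffer.
fn copy_region(
    frame: &[u8],
    frame_w: usize,
    channels: usize,
    x: usize,
    y: usize,
    rect_w: usize,
    rect_h: usize,
) -> Vec<u8> {
    let row_len = rect_w * channels;
    let mut out = Vec::with_capacity(rect_h * row_len);
    for row in 0..rect_h {
        // Start of this region row within the full frame
        let start = channels * ((y + row) * frame_w + x);
        out.extend_from_slice(&frame[start..start + row_len]);
    }
    out
}
```

The strided view avoids this copy, which is why it is the better fit when the frame is large and only a small region is processed.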
