
Conversation

@YToleubay (Contributor)

Added enable_optimization and compare_optimization flags: the first switches to the optimized code path, the second compares the results of the optimized code against the original code.
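For illustration only, a minimal sketch of how two such flags could be exposed on the command line with argparse; the parser setup and flag handling below are assumptions, not the PR's actual wiring.

```python
# Hypothetical sketch: exposing enable_optimization / compare_optimization as
# CLI flags. The PR's actual argument handling may differ.
import argparse

parser = argparse.ArgumentParser(description='PaDiM inference')
parser.add_argument('--enable_optimization', action='store_true',
                    help='use the optimized inference code path')
parser.add_argument('--compare_optimization', action='store_true',
                    help='run both paths and compare optimized results with the original ones')
args = parser.parse_args()

if args.compare_optimization:
    pass  # run both code paths and report the difference (assumed behavior)
elif args.enable_optimization:
    pass  # run only the optimized path
```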

Added embedding_concat_optimized, which uses PyTorch to perform checkerboard-pattern downscaling and concatenation of the feature embeddings.
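As an illustration of the idea (not necessarily the PR's exact code), here is a loop-free PyTorch sketch: the finer feature map is unfolded into s x s "checkerboard" sub-grids, the coarser map is broadcast over them, and the result is folded back to full resolution. The tensor shapes are assumptions.

```python
# Sketch with assumed shapes: concatenate a fine feature map x (B, C1, H1, W1)
# with a coarse one y (B, C2, H2, W2), where H1 = s * H2, without a Python loop.
import torch
import torch.nn.functional as F

def embedding_concat_optimized(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    B, C1, H1, W1 = x.shape
    _, C2, H2, W2 = y.shape
    s = H1 // H2
    # Unfold splits every s x s block of x into s*s "checkerboard" sub-grids,
    # each with the same H2 x W2 resolution as y.
    x = F.unfold(x, kernel_size=s, stride=s).view(B, C1, s * s, H2, W2)
    # Broadcast y over the s*s sub-grids instead of looping over them.
    y = y.unsqueeze(2).expand(B, C2, s * s, H2, W2)
    z = torch.cat([x, y], dim=1).view(B, (C1 + C2) * s * s, H2 * W2)
    # Fold reassembles the sub-grids into a (B, C1 + C2, H1, W1) map.
    return F.fold(z, output_size=(H1, W1), kernel_size=s, stride=s)
```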

Added embedding_concat_numpy, which performs the same checkerboard-pattern downscaling and concatenation using NumPy.
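A corresponding NumPy-only sketch of the same checkerboard decomposition, again under assumed shapes, using reshape/transpose in place of unfold/fold:

```python
# Sketch with assumed shapes, mirroring the PyTorch version above in pure NumPy.
import numpy as np

def embedding_concat_numpy(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    B, C1, H1, W1 = x.shape
    _, C2, H2, W2 = y.shape
    s = H1 // H2
    # Split every s x s block of x into s*s checkerboard sub-grids of size H2 x W2.
    x = x.reshape(B, C1, H2, s, W2, s).transpose(0, 1, 3, 5, 2, 4)
    # Broadcast y over the s*s sub-grids.
    y = np.broadcast_to(y[:, :, None, None], (B, C2, s, s, H2, W2))
    z = np.concatenate([x, y], axis=1)  # (B, C1 + C2, s, s, H2, W2)
    # Reassemble the sub-grids back into the full H1 x W1 resolution.
    return z.transpose(0, 1, 4, 2, 5, 3).reshape(B, C1 + C2, H1, W1)
```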

Added postprocess_optimized.

Added infer_optimized, which speeds up inference through vectorization.
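To make the idea concrete, here is a hedged sketch of the kind of vectorization meant here, assuming the bottleneck is the per-position Mahalanobis distance as in the reference PaDiM pipeline; the shapes and the helper name are illustrative, not the PR's exact code.

```python
# Illustrative sketch (assumed shapes): compute the Mahalanobis distance for
# every spatial position with batched matmuls instead of a Python loop over H*W.
import numpy as np

def mahalanobis_vectorized(emb, mean, cov_inv):
    """emb, mean: (C, H*W); cov_inv: (H*W, C, C) inverse covariance per position."""
    delta = (emb - mean).T[:, :, None]                       # (H*W, C, 1)
    # d^2 = delta^T @ cov_inv @ delta, evaluated for all positions at once.
    d2 = np.matmul(np.matmul(delta.transpose(0, 2, 1), cov_inv), delta)
    return np.sqrt(d2).reshape(-1)                           # (H*W,)
```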

Added infer_init_run, which runs PaDiM once on dummy data so that later inference runs are hot (warmed up).
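For reference, a minimal sketch of what such a warm-up run could look like; the input shape and the callable passed in are assumptions, not the PR's actual interface.

```python
# Hypothetical warm-up sketch: run the model once on dummy input so that later
# timed runs measure "hot" inference (allocations done, kernels initialized).
import numpy as np

def infer_init_run(infer_fn, input_shape=(1, 3, 224, 224)):
    # infer_fn is whatever callable performs a full PaDiM forward pass;
    # the 224x224 shape is an assumption based on the usual PaDiM input size.
    dummy = np.zeros(input_shape, dtype=np.float32)
    infer_fn(dummy)  # the output is discarded; only the warm-up side effect matters
```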

On an RTX 4060 Laptop GPU, the optimized code infers the default ./bottle_000.png in 10.25 ms on average, while the original code takes 203.0 ms (roughly a 20x speedup).

@YToleubay requested a review from kyakuno on May 12, 2024 at 16:44