As of now, I typically use SAHI with a model that was trained on full-frame images.
Current models usually scale down and pad the input images to a fixed target size and train on these scaled images.
When applying SAHI, the input image is split into individual cutouts according to the chosen slicing parameters, and the model is run on each cutout. The model therefore resizes these cutouts again to the target size it was trained on, for example as in the sketch below.
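For reference, this is roughly the inference setup I mean, using the SAHI API. A minimal sketch; the model path, thresholds, slice size, and overlap ratios are placeholder values, not a recommendation:

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Load a detector that was trained on full-frame images.
# "yolov8n.pt" and the thresholds here are placeholders.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov8",
    model_path="yolov8n.pt",
    confidence_threshold=0.3,
    device="cpu",
)

# SAHI splits the image into 512x512 cutouts with 20% overlap;
# each cutout is then resized internally to the model's own
# training input size, which is where the scale mismatch arises.
result = get_sliced_prediction(
    "large_image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
```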
There is an apparent scale mismatch in this scenario. However, SAHI still performs quite well in my experience, which is probably due to the various augmentations applied during model training, e.g. mosaicking and random cutouts in newer YOLO versions.
To me, it seems the overall detection performance could benefit from a detection model that was trained on the cutouts SAHI produces in the first place. However, I haven't experimented with this myself and was wondering whether anyone has experience with this setup and whether a considerable performance increase could be observed.
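In case it helps the discussion: SAHI itself ships a slicing utility that could be used to build such a cutout training set from a COCO-format dataset. A minimal sketch, assuming the paths below are placeholders and that the slice parameters are kept identical to those used at inference time:

```python
from sahi.slicing import slice_coco

# Pre-slice a COCO dataset into cutouts so the detector is trained on
# the same crop geometry that SAHI produces at inference time.
# All paths below are placeholders for your own dataset layout.
coco_dict, coco_path = slice_coco(
    coco_annotation_file_path="annotations/train.json",
    image_dir="images/train/",
    output_coco_annotation_file_name="train_sliced",
    output_dir="sliced/train/",
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
```

The resulting sliced annotations could then be converted to whatever format the training framework expects, so the model only ever sees inputs with the same scale distribution it will encounter under SAHI.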