Releases: huggingface/pytorch-image-models
Release v0.9.16
Feb 19, 2024
- Next-ViT models added. Adapted from https://github.com/bytedance/Next-ViT
- HGNet and PP-HGNetV2 models added. Adapted from https://github.com/PaddlePaddle/PaddleClas by SeeFun
- Removed setup.py, moved to pyproject.toml based build supported by PDM
- Add updated model EMA impl using torch `_foreach` ops for less overhead
- Support device args in train script for non GPU devices
- Other misc fixes and small additions
- Min supported Python version increased to 3.8
- Release 0.9.16
Jan 8, 2024
Datasets & transform refactoring
- HuggingFace streaming (iterable) dataset support (`--dataset hfids:org/dataset`)
- Webdataset wrapper tweaks for improved split info fetching, can auto fetch splits from supported HF hub webdataset
- Tested HF `datasets` and webdataset wrapper streaming from HF hub with recent `timm` ImageNet uploads to https://huggingface.co/timm
- Make input & target column/field keys consistent across datasets and pass via args
- Full monochrome support when using e.g. `--input-size 1 224 224` or `--in-chans 1`, sets PIL image conversion appropriately in dataset
- Improved several alternate crop & resize transforms (ResizeKeepRatio, RandomCropOrPad, etc.) for use in PixParse document AI project
- Add SimCLR style color jitter prob along with grayscale and gaussian blur options to augmentations and args
- Allow train without validation set (`--val-split ''`) in train script
- Add `--bce-sum` (sum over class dim) and `--bce-pos-weight` (positive weighting) args for training as they're common BCE loss tweaks I was often hard coding
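
The two BCE options in the last item can be illustrated with a small, dependency-free sketch. This is not the train script's implementation: the helper name `bce_with_logits` and its parameters are hypothetical, and the sketch assumes the usual convention where the positive-target term of the loss is scaled by the weight.

```python
import math


def bce_with_logits(logits, targets, pos_weight=1.0, sum_classes=False):
    """Binary cross-entropy for one sample over its class dimension.

    Hypothetical helper for illustration only (not timm's code).
    """
    losses = []
    for x, t in zip(logits, targets):
        # Numerically stable log(sigmoid(x)) and log(1 - sigmoid(x)).
        if x >= 0:
            log_p = -math.log1p(math.exp(-x))
            log_not_p = -x - math.log1p(math.exp(-x))
        else:
            log_p = x - math.log1p(math.exp(x))
            log_not_p = -math.log1p(math.exp(x))
        # pos_weight scales only the positive-target term.
        losses.append(-(pos_weight * t * log_p + (1.0 - t) * log_not_p))
    total = sum(losses)
    # Sum over the class dim, or average over it (the default here).
    return total if sum_classes else total / len(losses)


logits, targets = [0.0, 0.0], [1.0, 0.0]
mean_loss = bce_with_logits(logits, targets)
summed_loss = bce_with_logits(logits, targets, sum_classes=True)
weighted_loss = bce_with_logits(logits, targets, pos_weight=2.0, sum_classes=True)
```

With zero logits every class contributes `log(2)`, so summing over two classes doubles the mean, and a positive weight of 2 adds one more `log(2)` for the single positive target.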
Release v0.9.12
Nov 23, 2023
- Added EfficientViT-Large models, thanks SeeFun
- Fix Python 3.7 compat, will be dropping support for it soon
- Other misc fixes
- Release 0.9.12
Release v0.9.11
Nov 20, 2023
- Added significant flexibility for Hugging Face Hub based timm models via `model_args` config entry. `model_args` will be passed as kwargs through to models on creation.
- Updated imagenet eval and test set csv files with latest models
- `vision_transformer.py` typing and doc cleanup by Laureηt
- 0.9.11 release
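
As a rough illustration of the `model_args` item above, the sketch below merges a config's `model_args` entry into the kwargs used at model creation. The helper `build_model_kwargs`, the config values, and the precedence (call-site kwargs override config values) are all assumptions made for this example, not timm's actual code.

```python
def build_model_kwargs(hub_config, **user_kwargs):
    """Merge a config's `model_args` entry with call-site kwargs.

    Hypothetical helper: letting call-site kwargs win over config
    values is an assumption of this sketch.
    """
    kwargs = dict(hub_config.get("model_args", {}))
    kwargs.update(user_kwargs)
    return kwargs


# Illustrative config contents; the values are made up for this example.
config = {"architecture": "vit_base_patch16_224", "model_args": {"img_size": 384}}

merged = build_model_kwargs(config, drop_path_rate=0.1)
print(merged)  # {'img_size': 384, 'drop_path_rate': 0.1}
```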
Release v0.9.10
Nov 4, 2023
- Patch fix for 0.9.9 to fix `FrozenBatchNorm2d` import path for old torchvision (~2 years)
Nov 3, 2023
- DFN (Data Filtering Networks) and MetaCLIP ViT weights added
- DINOv2 'register' ViT model weights added
- Add `quickgelu` ViT variants for OpenAI, DFN, MetaCLIP weights that use it (less efficient)
- Improved typing added to ResNet, MobileNet-v3 thanks to Aryan
- ImageNet-12k fine-tuned (from LAION-2B CLIP) `convnext_xxlarge`
- 0.9.9 release
Release v0.9.9
Nov 3, 2023
- DFN (Data Filtering Networks) and MetaCLIP ViT weights added
- DINOv2 'register' ViT model weights added
- Add `quickgelu` ViT variants for OpenAI, DFN, MetaCLIP weights that use it (less efficient)
- Improved typing added to ResNet, MobileNet-v3 thanks to Aryan
- ImageNet-12k fine-tuned (from LAION-2B CLIP) `convnext_xxlarge`
- 0.9.9 release
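
For context on the `quickgelu` variants: QuickGELU is the sigmoid-based approximation `x * sigmoid(1.702 * x)` used by the original OpenAI CLIP models. The sketch below compares it with exact GELU in plain Python; it only shows that the two activations differ slightly, which is why weights trained with one are best run with the same one. It is not the library's implementation.

```python
import math


def gelu(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def quick_gelu(x):
    # Sigmoid approximation: x * sigmoid(1.702 * x).
    return x / (1.0 + math.exp(-1.702 * x))


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"{x:+.1f}  gelu={gelu(x):+.4f}  quick_gelu={quick_gelu(x):+.4f}")
```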
Release v0.9.8
Oct 20, 2023
- SigLIP image tower weights supported in `vision_transformer.py`.
  - Great potential for fine-tune and downstream feature use.
- Experimental 'register' support in vit models as per Vision Transformers Need Registers
- Updated RepViT with new weight release. Thanks wangao
- Add patch resizing support (on pretrained weight load) to Swin models
- 0.9.8 release
Release v0.9.7
Release v0.9.6
Aug 28, 2023
- Add dynamic img size support to models in `vision_transformer.py`, `vision_transformer_hybrid.py`, `deit.py`, and `eva.py` w/o breaking backward compat.
  - Add `dynamic_img_size=True` to args at model creation time to allow changing the grid size (interpolate abs and/or ROPE pos embed each forward pass).
  - Add `dynamic_img_pad=True` to allow image sizes that aren't divisible by patch size (pad bottom right to patch size each forward pass).
  - Enabling either dynamic mode will break FX tracing unless PatchEmbed module added as leaf.
  - Existing method of resizing position embedding by passing different `img_size` (interpolate pretrained embed weights once) on creation still works.
  - Existing method of changing `patch_size` (resize pretrained patch_embed weights once) on creation still works.
  - Example validation cmd: `python validate.py /imagenet --model vit_base_patch16_224 --amp --amp-dtype bfloat16 --img-size 255 --crop-pct 1.0 --model-kwargs dynamic_img_size=True dynamic_img_pad=True`
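
The padding arithmetic behind `dynamic_img_pad=True` can be sketched as follows. The function `pad_to_patch_multiple` is a hypothetical helper written for this note, not part of the library; it only computes how much bottom/right padding brings an input up to a multiple of the patch size and what token grid results.

```python
def pad_to_patch_multiple(height, width, patch_size=16):
    """Return (pad_bottom, pad_right, grid_h, grid_w) for a given input size.

    Hypothetical helper for illustration only.
    """
    pad_h = (patch_size - height % patch_size) % patch_size
    pad_w = (patch_size - width % patch_size) % patch_size
    grid_h = (height + pad_h) // patch_size
    grid_w = (width + pad_w) // patch_size
    return pad_h, pad_w, grid_h, grid_w


# The example command validates at 255x255 with a patch-16 model.
print(pad_to_patch_multiple(255, 255))  # (1, 1, 16, 16)
# The pretrained 224x224 resolution, for comparison.
print(pad_to_patch_multiple(224, 224))  # (0, 0, 14, 14)
```

In the example command, a 255x255 input is padded by one pixel on the bottom and right, giving a 16x16 grid instead of the pretrained 14x14 grid, so the position embedding has to be interpolated on each forward pass.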
Aug 25, 2023
- Many new models since last release
  - FastViT - https://arxiv.org/abs/2303.14189
  - MobileOne - https://arxiv.org/abs/2206.04040
  - InceptionNeXt - https://arxiv.org/abs/2303.16900
  - RepGhostNet - https://arxiv.org/abs/2211.06088 (thanks https://github.com/ChengpengChen)
  - GhostNetV2 - https://arxiv.org/abs/2211.12905 (thanks https://github.com/yehuitang)
  - EfficientViT (MSRA) - https://arxiv.org/abs/2305.07027 (thanks https://github.com/seefun)
  - EfficientViT (MIT) - https://arxiv.org/abs/2205.14756 (thanks https://github.com/seefun)
- Add `--reparam` arg to `benchmark.py`, `onnx_export.py`, and `validate.py` to trigger layer reparameterization / fusion for models with any one of `reparameterize()`, `switch_to_deploy()` or `fuse()`
  - Including FastViT, MobileOne, RepGhostNet, EfficientViT (MSRA), RepViT, RepVGG, and LeViT
- Preparing 0.9.6 'back to school' release
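
The `--reparam` item above dispatches on whichever of the three method names a layer exposes. A minimal, framework-free sketch of that idea follows. Everything here is hypothetical (the `reparameterize_tree` helper, the toy classes, and the `children_of` callback standing in for a framework's module traversal), and it ignores details such as a fusion method returning a replacement module.

```python
REPARAM_METHODS = ("reparameterize", "switch_to_deploy", "fuse")


def reparameterize_tree(module, children_of):
    """Call the first available fusion method on each module in a tree.

    Hypothetical sketch: `children_of` returns a module's direct children.
    Returns the class names of the modules that were converted.
    """
    converted = []
    for child in children_of(module):
        converted.extend(reparameterize_tree(child, children_of))
    for name in REPARAM_METHODS:
        method = getattr(module, name, None)
        if callable(method):
            method()
            converted.append(type(module).__name__)
            break
    return converted


# Toy stand-ins for layers exposing different fusion method names.
class Block:
    def __init__(self, *children):
        self.children = list(children)


class RepBranch(Block):
    deployed = False

    def switch_to_deploy(self):
        self.deployed = True


class FusableConv(Block):
    fused = False

    def fuse(self):
        self.fused = True


model = Block(RepBranch(), Block(FusableConv()))
converted = reparameterize_tree(model, lambda m: m.children)
print(converted)  # ['RepBranch', 'FusableConv']
```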
Aug 11, 2023
- Swin, MaxViT, CoAtNet, and BEiT models support resizing of image/window size on creation with adaptation of pretrained weights
  - Example validation cmd to test w/ non-square resize: `python validate.py /imagenet --model swin_base_patch4_window7_224.ms_in22k_ft_in1k --amp --amp-dtype bfloat16 --input-size 3 256 320 --model-kwargs window_size=8,10 img_size=256,320`
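
One plausible reading of why the example pairs `img_size=256,320` with `window_size=8,10` is that the feature grid at every stage then splits evenly into windows. The sketch below checks this, assuming the standard Swin layout of a patch size of 4 and four stages with 2x downsampling between them. The helper names are hypothetical, and the even-division requirement is an assumption of this sketch, not a documented constraint.

```python
def stage_grids(img_size, patch_size=4, num_stages=4):
    """Feature-map size at each stage, assuming 2x downsampling between stages."""
    height, width = img_size
    grids = []
    for stage in range(num_stages):
        stride = patch_size * 2 ** stage
        grids.append((height // stride, width // stride))
    return grids


def windows_fit(img_size, window_size, patch_size=4, num_stages=4):
    """True if every stage's grid divides evenly into windows."""
    win_h, win_w = window_size
    return all(
        grid_h % win_h == 0 and grid_w % win_w == 0
        for grid_h, grid_w in stage_grids(img_size, patch_size, num_stages)
    )


print(stage_grids((256, 320)))           # [(64, 80), (32, 40), (16, 20), (8, 10)]
print(windows_fit((256, 320), (8, 10)))  # True
print(windows_fit((256, 320), (7, 7)))   # False
```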
Release v0.9.5
Minor updates and bug fixes. New ResNeXT w/ highest ImageNet eval I'm aware of in the ResNe(X)t family (`seresnextaa201d_32x8d.sw_in12k_ft_in1k_384`)
Aug 3, 2023
- Add GluonCV weights for HRNet w18_small and w18_small_v2. Converted by SeeFun
- Fix `selecsls*` model naming regression
- Patch and position embedding for ViT/EVA works for bfloat16/float16 weights on load (or activations for on-the-fly resize)
- v0.9.5 release prep
July 27, 2023
- Added timm trained `seresnextaa201d_32x8d.sw_in12k_ft_in1k_384` weights (and `.sw_in12k` pretrain) with 87.3% top-1 on ImageNet-1k, best ImageNet ResNet family model I'm aware of.
- RepViT model and weights (https://arxiv.org/abs/2307.09283) added by wangao
- I-JEPA ViT feature weights (no classifier) added by SeeFun
- SAM-ViT (segment anything) feature weights (no classifier) added by SeeFun
- Add support for alternative feat extraction methods and negative indices to EfficientNet
- Add NAdamW optimizer
- Misc fixes
Release v0.9.2
- Fix _hub deprecation pass through import