
Commit cfc2ff5

Use HF Papers

1 parent edc37be commit cfc2ff5

123 files changed, +656 -656 lines changed

README.md

Lines changed: 156 additions & 156 deletions
Large diffs are not rendered by default.

hfdocs/source/changes.mdx

Lines changed: 28 additions & 28 deletions
@@ -33,9 +33,9 @@
 
 ## Nov 28, 2024
 * More optimizers
-  * Add MARS optimizer (https://arxiv.org/abs/2411.10438, https://github.com/AGI-Arena/MARS)
-  * Add LaProp optimizer (https://arxiv.org/abs/2002.04839, https://github.com/Z-T-WANG/LaProp-Optimizer)
-  * Add masking from 'Cautious Optimizers' (https://arxiv.org/abs/2411.16085, https://github.com/kyleliang919/C-Optim) to Adafactor, Adafactor Big Vision, AdamW (legacy), Adopt, Lamb, LaProp, Lion, NadamW, RMSPropTF, SGDW
+  * Add MARS optimizer (https://huggingface.co/papers/2411.10438, https://github.com/AGI-Arena/MARS)
+  * Add LaProp optimizer (https://huggingface.co/papers/2002.04839, https://github.com/Z-T-WANG/LaProp-Optimizer)
+  * Add masking from 'Cautious Optimizers' (https://huggingface.co/papers/2411.16085, https://github.com/kyleliang919/C-Optim) to Adafactor, Adafactor Big Vision, AdamW (legacy), Adopt, Lamb, LaProp, Lion, NadamW, RMSPropTF, SGDW
 * Cleanup some docstrings and type annotations re optimizers and factory
 * Add MobileNet-V4 Conv Medium models pretrained on in12k and fine-tuned in1k @ 384x384
   * https://huggingface.co/timm/mobilenetv4_conv_medium.e250_r384_in12k_ft_in1k
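The optimizer additions in the hunk above are consumed through `timm`'s optimizer factory. A minimal sketch, assuming the new optimizers register under the names `'mars'` and `'laprop'` and that the cautious-masking variants are exposed via `'c'`-prefixed names (these names are inferred from the notes, not taken from this commit):

```python
import timm
from timm.optim import create_optimizer_v2

model = timm.create_model('resnet50')

# MARS and LaProp by (assumed) factory name
opt_mars = create_optimizer_v2(model, opt='mars', lr=3e-3, weight_decay=0.02)
opt_laprop = create_optimizer_v2(model, opt='laprop', lr=4e-4)

# Cautious masking applied to AdamW (the 'c' prefix is an assumption)
opt_cadamw = create_optimizer_v2(model, opt='cadamw', lr=1e-3)
```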
@@ -142,7 +142,7 @@ Add a set of new very well trained ResNet & ResNet-V2 18/34 (basic block) weight
 |hiera_small_abswin_256.sbb2_pd_e200_in12k_ft_in1k |84.560|97.106|35.01 |
 
 ### Aug 8, 2024
-* Add RDNet ('DenseNets Reloaded', https://arxiv.org/abs/2403.19588), thanks [Donghyun Kim](https://github.com/dhkim0225)
+* Add RDNet ('DenseNets Reloaded', https://huggingface.co/papers/2403.19588), thanks [Donghyun Kim](https://github.com/dhkim0225)
 
 ### July 28, 2024
 * Add `mobilenet_edgetpu_v2_m` weights w/ `ra4` mnv4-small based recipe. 80.1% top-1 @ 224 and 80.7 @ 256.
@@ -227,8 +227,8 @@ Add a set of new very well trained ResNet & ResNet-V2 18/34 (basic block) weight
 | [mobilenetv4_conv_small.e2400_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small.e2400_r224_in1k) |73.756|26.244 |91.422|8.578 |3.77 |224 |
 | [mobilenetv4_conv_small.e1200_r224_in1k](http://hf.co/timm/mobilenetv4_conv_small.e1200_r224_in1k) |73.454|26.546 |91.34 |8.66 |3.77 |224 |
 
-* Apple MobileCLIP (https://arxiv.org/pdf/2311.17049, FastViT and ViT-B) image tower model support & weights added (part of OpenCLIP support).
-* ViTamin (https://arxiv.org/abs/2404.02132) CLIP image tower model & weights added (part of OpenCLIP support).
+* Apple MobileCLIP (https://huggingface.co/papers/2311.17049, FastViT and ViT-B) image tower model support & weights added (part of OpenCLIP support).
+* ViTamin (https://huggingface.co/papers/2404.02132) CLIP image tower model & weights added (part of OpenCLIP support).
 * OpenAI CLIP Modified ResNet image tower modelling & weight support (via ByobNet). Refactor AttentionPool2d.
 
 ### May 14, 2024
@@ -373,13 +373,13 @@ Datasets & transform refactoring
 
 ### Aug 25, 2023
 * Many new models since last release
-  * FastViT - https://arxiv.org/abs/2303.14189
-  * MobileOne - https://arxiv.org/abs/2206.04040
-  * InceptionNeXt - https://arxiv.org/abs/2303.16900
-  * RepGhostNet - https://arxiv.org/abs/2211.06088 (thanks https://github.com/ChengpengChen)
-  * GhostNetV2 - https://arxiv.org/abs/2211.12905 (thanks https://github.com/yehuitang)
-  * EfficientViT (MSRA) - https://arxiv.org/abs/2305.07027 (thanks https://github.com/seefun)
-  * EfficientViT (MIT) - https://arxiv.org/abs/2205.14756 (thanks https://github.com/seefun)
+  * FastViT - https://huggingface.co/papers/2303.14189
+  * MobileOne - https://huggingface.co/papers/2206.04040
+  * InceptionNeXt - https://huggingface.co/papers/2303.16900
+  * RepGhostNet - https://huggingface.co/papers/2211.06088 (thanks https://github.com/ChengpengChen)
+  * GhostNetV2 - https://huggingface.co/papers/2211.12905 (thanks https://github.com/yehuitang)
+  * EfficientViT (MSRA) - https://huggingface.co/papers/2305.07027 (thanks https://github.com/seefun)
+  * EfficientViT (MIT) - https://huggingface.co/papers/2205.14756 (thanks https://github.com/seefun)
 * Add `--reparam` arg to `benchmark.py`, `onnx_export.py`, and `validate.py` to trigger layer reparameterization / fusion for models with any one of `reparameterize()`, `switch_to_deploy()` or `fuse()`
   * Including FastViT, MobileOne, RepGhostNet, EfficientViT (MSRA), RepViT, RepVGG, and LeViT
 * Preparing 0.9.6 'back to school' release
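For reference, a minimal sketch of the fusion pass a `--reparam` flag could trigger, probing for the three hook names listed in the hunk above. This is an illustrative helper under stated assumptions, not the repository's exact implementation:

```python
import torch.nn as nn

def fuse_for_deploy(model: nn.Module) -> nn.Module:
    # Illustrative only: call the first available deploy-time hook on each module.
    # Assumes hooks fuse in place; some real impls instead return a replacement module.
    for module in list(model.modules()):  # snapshot, since fusion may mutate the tree
        for hook in ('reparameterize', 'switch_to_deploy', 'fuse'):
            fn = getattr(module, hook, None)
            if callable(fn):
                fn()
                break
    return model
```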
@@ -396,7 +396,7 @@ Datasets & transform refactoring
 
 ### July 27, 2023
 * Added timm trained `seresnextaa201d_32x8d.sw_in12k_ft_in1k_384` weights (and `.sw_in12k` pretrain) with 87.3% top-1 on ImageNet-1k, best ImageNet ResNet family model I'm aware of.
-* RepViT model and weights (https://arxiv.org/abs/2307.09283) added by [wangao](https://github.com/jameslahm)
+* RepViT model and weights (https://huggingface.co/papers/2307.09283) added by [wangao](https://github.com/jameslahm)
 * I-JEPA ViT feature weights (no classifier) added by [SeeFun](https://github.com/seefun)
 * SAM-ViT (segment anything) feature weights (no classifier) added by [SeeFun](https://github.com/seefun)
 * Add support for alternative feat extraction methods and -ve indices to EfficientNet
@@ -506,9 +506,9 @@ Datasets & transform refactoring
 
 ### Feb 16, 2023
 * `safetensor` checkpoint support added
-* Add ideas from 'Scaling Vision Transformers to 22 B. Params' (https://arxiv.org/abs/2302.05442) -- qk norm, RmsNorm, parallel block
+* Add ideas from 'Scaling Vision Transformers to 22 B. Params' (https://huggingface.co/papers/2302.05442) -- qk norm, RmsNorm, parallel block
 * Add F.scaled_dot_product_attention support (PyTorch 2.0 only) to `vit_*`, `vit_relpos*`, `coatnet` / `maxxvit` (to start)
-* Lion optimizer (w/ multi-tensor option) added (https://arxiv.org/abs/2302.06675)
+* Lion optimizer (w/ multi-tensor option) added (https://huggingface.co/papers/2302.06675)
 * gradient checkpointing works with `features_only=True`
 
 ### Feb 7, 2023
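Two of the Feb 16 items above are directly usage-facing; a short sketch follows. The fused attention call is a real PyTorch 2.0 API, while the assumption here is that the `features_only` wrapper forwards timm's `set_grad_checkpointing()` toggle:

```python
import timm
import torch
import torch.nn.functional as F

# Fused attention (PyTorch 2.0+), replacing the manual softmax(q @ k^T / sqrt(d)) @ v path
q = k = v = torch.randn(1, 8, 197, 64)         # (batch, heads, tokens, head_dim)
out = F.scaled_dot_product_attention(q, k, v)

# Gradient checkpointing on a features-only model
model = timm.create_model('resnet50', features_only=True)
model.set_grad_checkpointing(True)             # assumed forwarded by the feature wrapper
```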
@@ -596,11 +596,11 @@ Datasets & transform refactoring
 
 ### Jan 5, 2023
 * ConvNeXt-V2 models and weights added to existing `convnext.py`
-  * Paper: [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](http://arxiv.org/abs/2301.00808)
+  * Paper: [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://huggingface.co/papers/2301.00808)
   * Reference impl: https://github.com/facebookresearch/ConvNeXt-V2 (NOTE: weights currently CC-BY-NC)
 
 ### Dec 23, 2022 🎄☃
-* Add FlexiViT models and weights from https://github.com/google-research/big_vision (check out paper at https://arxiv.org/abs/2212.08013)
+* Add FlexiViT models and weights from https://github.com/google-research/big_vision (check out paper at https://huggingface.co/papers/2212.08013)
 * NOTE currently resizing is static on model creation, on-the-fly dynamic / train patch size sampling is a WIP
 * Many more models updated to multi-weight and downloadable via HF hub now (convnext, efficientnet, mobilenet, vision_transformer*, beit)
 * More model pretrained tag and adjustments, some model names changed (working on deprecation translations, consider main branch DEV branch right now, use 0.6.x for stable use)
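The multi-weight convention referenced above resolves `'<model>.<pretrained_tag>'` names against weights hosted on the HF hub. A minimal sketch, reusing the MobileNet-V4 tag that appears earlier in this diff:

```python
import timm

# 'model_name.pretrained_tag' selects one of several weight sets for an architecture;
# pretrained=True pulls the weights from the HF hub (https://huggingface.co/timm/...)
model = timm.create_model(
    'mobilenetv4_conv_medium.e250_r384_in12k_ft_in1k',
    pretrained=True,
).eval()
```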
@@ -624,7 +624,7 @@ Datasets & transform refactoring
 ### Dec 6, 2022
 * Add 'EVA g', BEiT style ViT-g/14 model weights w/ both MIM pretrain and CLIP pretrain to `beit.py`.
   * original source: https://github.com/baaivision/EVA
-  * paper: https://arxiv.org/abs/2211.07636
+  * paper: https://huggingface.co/papers/2211.07636
 
 | model | top1 | param_count | gmac | macts | hub |
 |:-----------------------------------------|-------:|--------------:|-------:|--------:|:----------------------------------------|
@@ -738,7 +738,7 @@ Datasets & transform refactoring
 * `maxvit_rmlp_nano_rw_256` - 83.0 @ 256, 83.6 @ 320 (T)
 
 ### Aug 26, 2022
-* CoAtNet (https://arxiv.org/abs/2106.04803) and MaxVit (https://arxiv.org/abs/2204.01697) `timm` original models
+* CoAtNet (https://huggingface.co/papers/2106.04803) and MaxVit (https://huggingface.co/papers/2204.01697) `timm` original models
   * both found in [`maxxvit.py`](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/maxxvit.py) model def, contains numerous experiments outside scope of original papers
   * an unfinished Tensorflow version from MaxVit authors can be found https://github.com/google-research/maxvit
 * Initial CoAtNet and MaxVit timm pretrained weights (working on more):
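Both families are exposed as ordinary `timm` models; a minimal sketch using the `maxvit_rmlp_nano_rw_256` weights listed just above (256x256 input per the model name; the 1000-class output is an assumption for ImageNet-1k weights):

```python
import timm
import torch

model = timm.create_model('maxvit_rmlp_nano_rw_256', pretrained=True).eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 256, 256))  # 256x256 input per the model name
print(logits.shape)                              # expected: torch.Size([1, 1000])
```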
@@ -834,7 +834,7 @@ More models, more fixes
 * `vit_relpos_base_patch16_gapcls_224` - 82.8 @ 224, 83.9 @ 320 -- rel pos, layer scale, class token, avg pool (by mistake)
 * Bring 512 dim, 8-head 'medium' ViT model variant back to life (after using in a pre DeiT 'small' model for first ViT impl back in 2020)
 * Add ViT relative position support for switching btw existing impl and some additions in official Swin-V2 impl for future trials
-* Sequencer2D impl (https://arxiv.org/abs/2205.01972), added via PR from author (https://github.com/okojoalg)
+* Sequencer2D impl (https://huggingface.co/papers/2205.01972), added via PR from author (https://github.com/okojoalg)
 
 ### May 2, 2022
 * Vision Transformer experiments adding Relative Position (Swin-V2 log-coord) (`vision_transformer_relpos.py`) and Residual Post-Norm branches (from Swin-V2) (`vision_transformer*.py`)
@@ -851,7 +851,7 @@ More models, more fixes
 * `seresnextaa101d_32x8d` (anti-aliased w/ AvgPool2d) - 83.85 @ 224, 84.57 @ 288
 
 ### March 23, 2022
-* Add `ParallelBlock` and `LayerScale` option to base vit models to support model configs in [Three things everyone should know about ViT](https://arxiv.org/abs/2203.09795)
+* Add `ParallelBlock` and `LayerScale` option to base vit models to support model configs in [Three things everyone should know about ViT](https://huggingface.co/papers/2203.09795)
 * `convnext_tiny_hnf` (head norm first) weights trained with (close to) A2 recipe, 82.2% top-1, could do better with more epochs.
 
 ### March 21, 2022
@@ -908,11 +908,11 @@ More models, more fixes
 
 ### Jan 5, 2023
 * ConvNeXt-V2 models and weights added to existing `convnext.py`
-  * Paper: [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](http://arxiv.org/abs/2301.00808)
+  * Paper: [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://huggingface.co/papers/2301.00808)
   * Reference impl: https://github.com/facebookresearch/ConvNeXt-V2 (NOTE: weights currently CC-BY-NC)
 
 ### Dec 23, 2022 🎄☃
-* Add FlexiViT models and weights from https://github.com/google-research/big_vision (check out paper at https://arxiv.org/abs/2212.08013)
+* Add FlexiViT models and weights from https://github.com/google-research/big_vision (check out paper at https://huggingface.co/papers/2212.08013)
 * NOTE currently resizing is static on model creation, on-the-fly dynamic / train patch size sampling is a WIP
 * Many more models updated to multi-weight and downloadable via HF hub now (convnext, efficientnet, mobilenet, vision_transformer*, beit)
 * More model pretrained tag and adjustments, some model names changed (working on deprecation translations, consider main branch DEV branch right now, use 0.6.x for stable use)
@@ -936,7 +936,7 @@ More models, more fixes
 ### Dec 6, 2022
 * Add 'EVA g', BEiT style ViT-g/14 model weights w/ both MIM pretrain and CLIP pretrain to `beit.py`.
   * original source: https://github.com/baaivision/EVA
-  * paper: https://arxiv.org/abs/2211.07636
+  * paper: https://huggingface.co/papers/2211.07636
 
 | model | top1 | param_count | gmac | macts | hub |
 |:-----------------------------------------|-------:|--------------:|-------:|--------:|:----------------------------------------|
@@ -1050,7 +1050,7 @@ More models, more fixes
 * `maxvit_rmlp_nano_rw_256` - 83.0 @ 256, 83.6 @ 320 (T)
 
 ### Aug 26, 2022
-* CoAtNet (https://arxiv.org/abs/2106.04803) and MaxVit (https://arxiv.org/abs/2204.01697) `timm` original models
+* CoAtNet (https://huggingface.co/papers/2106.04803) and MaxVit (https://huggingface.co/papers/2204.01697) `timm` original models
   * both found in [`maxxvit.py`](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/maxxvit.py) model def, contains numerous experiments outside scope of original papers
   * an unfinished Tensorflow version from MaxVit authors can be found https://github.com/google-research/maxvit
 * Initial CoAtNet and MaxVit timm pretrained weights (working on more):
@@ -1147,7 +1147,7 @@ More models, more fixes
 * `vit_relpos_base_patch16_gapcls_224` - 82.8 @ 224, 83.9 @ 320 -- rel pos, layer scale, class token, avg pool (by mistake)
 * Bring 512 dim, 8-head 'medium' ViT model variant back to life (after using in a pre DeiT 'small' model for first ViT impl back in 2020)
 * Add ViT relative position support for switching btw existing impl and some additions in official Swin-V2 impl for future trials
-* Sequencer2D impl (https://arxiv.org/abs/2205.01972), added via PR from author (https://github.com/okojoalg)
+* Sequencer2D impl (https://huggingface.co/papers/2205.01972), added via PR from author (https://github.com/okojoalg)
 
 ### May 2, 2022
 * Vision Transformer experiments adding Relative Position (Swin-V2 log-coord) (`vision_transformer_relpos.py`) and Residual Post-Norm branches (from Swin-V2) (`vision_transformer*.py`)
@@ -1164,7 +1164,7 @@ More models, more fixes
 * `seresnextaa101d_32x8d` (anti-aliased w/ AvgPool2d) - 83.85 @ 224, 84.57 @ 288
 
 ### March 23, 2022
-* Add `ParallelBlock` and `LayerScale` option to base vit models to support model configs in [Three things everyone should know about ViT](https://arxiv.org/abs/2203.09795)
+* Add `ParallelBlock` and `LayerScale` option to base vit models to support model configs in [Three things everyone should know about ViT](https://huggingface.co/papers/2203.09795)
 * `convnext_tiny_hnf` (head norm first) weights trained with (close to) A2 recipe, 82.2% top-1, could do better with more epochs.
 
 ### March 21, 2022
