Should we use torch's SiLU to replace Swish when alpha=1
#2787
yiheng-wang-nv started this conversation in Ideas

Hi @wyli @Nic-Ma @rijobro @ericspod

When I checked our EfficientNet implementation, I found that it uses MONAI's implementation of Swish, which is more generic than PyTorch's implementation (see: SiLU) since we can change the parameter alpha. However, when using the default value 1 for alpha, as in cases such as EfficientNet, PyTorch's SiLU computes faster than Swish. Therefore, shall we modify the implementation and at least use SiLU when alpha=1? (A minimal sketch of the two definitions follows below.)
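For context, here is a minimal sketch of the two activations using the standard formulas (not MONAI's actual code): Swish is `x * sigmoid(alpha * x)`, and SiLU is the special case `alpha = 1`.

```python
import torch
import torch.nn.functional as F

def swish(x: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Generic Swish: x * sigmoid(alpha * x). SiLU is the alpha == 1 special case.
    return x * torch.sigmoid(alpha * x)

x = torch.randn(8)
# With the default alpha = 1 the two are numerically identical,
# but F.silu dispatches to a single fused kernel.
assert torch.allclose(swish(x), F.silu(x))
```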
Replies: 2 comments

- My local test with a Jupyter notebook, in GPU mode and in CPU mode: [timing screenshots not reproduced here]. A sketch of this kind of comparison follows below.
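The notebook code and its numbers are not shown in the thread; the following is only a sketch of this kind of timing comparison, with an illustrative tensor shape and iteration count.

```python
import time

import torch
import torch.nn.functional as F

def swish(x, alpha=1.0):
    return x * torch.sigmoid(alpha * x)

def time_fn(fn, x, iters=1000):
    # Synchronize before and after the loop so GPU timings measure the kernels,
    # not just the asynchronous launches.
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    return time.perf_counter() - start

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(32, 128, 56, 56, device=device)  # illustrative feature-map size
print(f"SiLU  ({device}): {time_fn(F.silu, x):.3f}s")
print(f"Swish ({device}): {time_fn(swish, x):.3f}s")
```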
- If the user wanted to use a non-default alpha, the generic Swish would still be needed. That said, I have no problem with reverting to SiLU for alpha=1. One way to implement that compromise is sketched below.
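A hedged sketch of what that compromise could look like (the class name mirrors MONAI's Swish, but this is an illustrative implementation, not the actual MONAI code):

```python
import torch
import torch.nn as nn

class Swish(nn.Module):
    """Generic Swish that delegates to torch's fused nn.SiLU when alpha == 1."""

    def __init__(self, alpha: float = 1.0) -> None:
        super().__init__()
        self.alpha = alpha
        # nn.SiLU computes x * sigmoid(x), i.e. Swish with alpha fixed at 1.
        self.silu = nn.SiLU() if alpha == 1.0 else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.silu is not None:
            return self.silu(x)
        return x * torch.sigmoid(self.alpha * x)
```

With this dispatch, `Swish()` gets the faster fused kernel while `Swish(alpha=0.5)` keeps the current generic behaviour, so the change would be invisible to existing users.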