Snake Activation for PyTorch
Snake is a simple activation function with three variants:
- SnakeA is: y = tanh(x) + relu(x)
- SnakeB is: y = tanh(x) + silu(x)
- SnakeC is: y = erf(x) + gelu(x)
(Plots of SnakeB and SnakeC.)
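For reference, here is a minimal PyTorch sketch of the three variants; the function names snake_a, snake_b, and snake_c are my own, not from any released package:

```python
import torch
import torch.nn.functional as F


def snake_a(x: torch.Tensor) -> torch.Tensor:
    # SnakeA: tanh plus the hard-gated ReLU branch
    return torch.tanh(x) + F.relu(x)


def snake_b(x: torch.Tensor) -> torch.Tensor:
    # SnakeB: tanh plus the self-gated SiLU branch
    return torch.tanh(x) + F.silu(x)


def snake_c(x: torch.Tensor) -> torch.Tensor:
    # SnakeC: erf plus GELU, both built from the Gaussian CDF
    return torch.erf(x) + F.gelu(x)
```

Any of these can be dropped in wherever you would normally apply ReLU, e.g. inside a conv-bn-act block.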
I've noticed that popular activations like SiLU, GELU, and ReLU, as well as Mish, are self-gated activations. They have one thing in common: their output is close to zero when the input is negative. According to the paper Searching for Activation Functions (which introduced the Swish activation), most of the inputs to the Swish activation end up in the negative part, which in my opinion shows that a well-trained net is "eager to learn something negative". On top of that, if we follow patterns like "conv-bn-act" or "linear-act", the output of the activation becomes the input of the next layer's linear weight, not its bias. If the network is "eager to learn something negative", that input will be close to zero, and gradients will have a hard time flowing through this part.

So Snake is an activation more like ELU: more gradient flows through the layer, and the linear layer that follows can get more information.
(Plots made in GeoGebra.)
SFReLU's full name is Soft maxout Funnel Rectified Linear Unit; it is a soft version of FReLU.
The original FReLU paper: https://arxiv.org/pdf/2007.11824.pdf
Official code: https://github.com/megvii-model/FunnelAct
SFReLU is a simple activation function with learnable parameters.
    f(x0, x1) --> x:
        temp0 = x0 - x1
        temp0 = silu(temp0)
        x = temp0 + x1
        return x

    sfrelu(x) --> x:
        y = dwconv(x)  # same shape as x
        x = f(x, y)
        return x
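Here is a sketch of SFReLU as a PyTorch module. Following the FReLU paper, I assume the funnel condition dwconv is a 3x3 depthwise convolution followed by BatchNorm; those layer choices and the constructor arguments are my assumptions, not necessarily the exact setup used here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SFReLU(nn.Module):
    """Soft maxout funnel activation: silu(x - t(x)) + t(x)."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # Funnel condition t(x): depthwise conv that keeps the spatial shape.
        self.dwconv = nn.Conv2d(
            channels, channels, kernel_size,
            padding=kernel_size // 2, groups=channels, bias=False,
        )
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.bn(self.dwconv(x))   # same shape as x
        return F.silu(x - y) + y      # soft maxout of x and y


# usage: act = SFReLU(64); out = act(torch.randn(2, 64, 32, 32))
```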
The original maxout can be described by the function below.
    f(x0, x1) --> x:
        temp0 = x0 - x1
        temp0 = relu(temp0)
        x = temp0 + x1
        return x
I simply replaced relu with silu, which can be considered a "soft" version of relu. You could also call it FSiLU.
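A quick sanity check (my own, not from any repo) that the ReLU form really is maxout, i.e. max(x0, x1), and that the SiLU form tracks it closely:

```python
import torch
import torch.nn.functional as F

x0 = torch.randn(1000)
x1 = torch.randn(1000)

hard = F.relu(x0 - x1) + x1   # original maxout: equals max(x0, x1)
soft = F.silu(x0 - x1) + x1   # the "soft" FSiLU version

assert torch.allclose(hard, torch.maximum(x0, x1))
print((soft - hard).abs().max())   # gap is bounded (at most about 0.28)
```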
As the name suggests, this is a soft version of "+", also known as the "reslink" (residual link), the operator that built the empire of truly DEEP learning.
And I somehow managed to create a soft version of it.
Just like everything else here, it can be described like this:
y = silu(x0 + x1) - silu(x0 - x1) + x1
That's all. Easy.
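A one-line PyTorch sketch of this soft add; the function name soft_add is my own:

```python
import torch
import torch.nn.functional as F


def soft_add(x0: torch.Tensor, x1: torch.Tensor) -> torch.Tensor:
    # Soft version of the residual "+": silu(x0 + x1) - silu(x0 - x1) + x1
    return F.silu(x0 + x1) - F.silu(x0 - x1) + x1


# e.g. as a soft residual connection: out = soft_add(block(x), x)
```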
A soft version of the operator "*":
- when x0 >> x1, it returns roughly x1
- when x0 << x1, it returns roughly x0
- when x0 ≈ x1, it returns roughly x0 * x1

The function:
y = silu(-x0) * silu(-x1) + x0 - silu(x0 - x1)
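And a sketch of the soft multiply in PyTorch (soft_mul is my own name), with a quick check of the first two limiting cases listed above:

```python
import torch
import torch.nn.functional as F


def soft_mul(x0: torch.Tensor, x1: torch.Tensor) -> torch.Tensor:
    # y = silu(-x0) * silu(-x1) + x0 - silu(x0 - x1)
    return F.silu(-x0) * F.silu(-x1) + x0 - F.silu(x0 - x1)


big, small = torch.tensor(10.0), torch.tensor(1.0)
print(soft_mul(big, small))   # x0 >> x1  ->  roughly x1 (about 1.0)
print(soft_mul(small, big))   # x0 << x1  ->  roughly x0 (about 1.0)
```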