
Conversation

@congyue1977

During KANLayer initialization, the B-spline knot vector is constructed from the grid_range parameter, so it is identical across all input dimensions (in_dim). The grid tensor therefore stores in_dim redundant copies of the same row, and setting the size of its first dimension to 1 suffices. Subsequent calculations pick up the single row automatically via tensor broadcasting, and the grid update process is unaffected.
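A minimal sketch of the idea in PyTorch (the variable names mirror pykan's KANLayer but are illustrative, not the actual implementation):

```python
import torch

G, k = 5, 3                  # number of grid intervals and spline order
grid_range = [-1.0, 1.0]
in_dim = 4

# Knot vector extended by k points on each side: G + 2k + 1 knots in total.
# It depends only on grid_range, so it is the same for every input dimension.
h = (grid_range[1] - grid_range[0]) / G
knots = grid_range[0] + h * torch.arange(-k, G + k + 1)  # shape: (G + 2k + 1,)

# Before: one redundant copy per input dimension.
grid_before = knots.repeat(in_dim, 1)   # shape: (in_dim, G + 2k + 1)

# After: a single row; broadcasting expands it wherever the grid is
# combined with per-dimension activations.
grid_after = knots.unsqueeze(0)         # shape: (1, G + 2k + 1)

# Any elementwise op against a (batch, in_dim, ...) tensor broadcasts the
# size-1 dimension automatically, so downstream results are identical.
x = torch.randn(8, in_dim, 1)           # dummy activations
assert torch.equal(x - grid_before.unsqueeze(0), x - grid_after.unsqueeze(0))
```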

This optimization significantly reduces CPU/GPU memory usage. Since the knot vector has G + 2k + 1 points, each KANLayer saves (in_dim - 1) * (G + 2k + 1) grid entries. For a network of depth N whose layers share the same input dimension, the total saving is N * (in_dim - 1) * (G + 2k + 1) entries.
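For concreteness, a quick back-of-the-envelope calculation for the test configuration reported below (this arithmetic is my own illustration, not from the PR):

```python
# Hypothetical worked example: width [4, 100, 100, 100, 1], G = 100, k = 3.
G, k = 100, 3
knots = G + 2 * k + 1                          # 107 knots per spline
widths = [4, 100, 100, 100, 1]
# Each layer's in_dim is the preceding width entry.
saved = sum((in_dim - 1) * knots for in_dim in widths[:-1])
print(saved)  # 32100 grid entries saved (~125 KB in float32)
```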

Furthermore, this optimization drastically reduces KANLayer initialization time, improving overall efficiency. In testing with a large G (e.g. G = 100), a KAN of width [4,100,100,100,1], and k = 3, training previously took nearly 30 s to start on an Intel i9-12900K; after the optimization, it starts in under 1 s.

KindXiaoming and others added 12 commits July 21, 2024 20:36
…duce memory/GPU usage significantly and greatly reduce the initialization time of KANLayer.
