Is your feature request related to a problem? Please describe.
CPU consumption versus modeling performance will always be a tricky balance for NAM. Some people find that the WaveNet models consume too many resources.
Describe the solution you'd like
A potential low-hanging fruit for the WaveNet and ConvNet models is to replace some of the convolutions with depthwise convolutions. These significantly reduce both the number of parameters and the number of operations per convolution.
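For reference, a depthwise convolution is just a grouped convolution with `groups` equal to the channel count, so each output channel reads from a single input channel and the weight tensor shrinks by a factor of `in_channels`. A minimal sketch with made-up sizes (not the actual NAM config), using plain `torch.nn.Conv1d`:

```python
import torch.nn as nn

# Illustrative sizes only, not the actual NAM config.
channels, kernel_size = 16, 3

dense = nn.Conv1d(channels, channels, kernel_size)                       # weight 16*16*3 + bias 16 = 784
depthwise = nn.Conv1d(channels, channels, kernel_size, groups=channels)  # weight 16*1*3  + bias 16 = 64

print(sum(p.numel() for p in dense.parameters()))      # 784
print(sum(p.numel() for p in depthwise.parameters()))  # 64
```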
Describe alternatives you've considered
None.
Additional context
I made some early tests on two of my captures: one of a clean amp and one of a high-gain setting.
The change in the training code is very small:
diff --git a/nam/models/wavenet.py b/nam/models/wavenet.py
index b48e41e..cd03b79 100644
--- a/nam/models/wavenet.py
+++ b/nam/models/wavenet.py
@@ -63,7 +63,7 @@ class _Layer(nn.Module):
super().__init__()
# Input mixer takes care of the bias
mid_channels = 2 * channels if gated else channels
- self._conv = Conv1d(channels, mid_channels, kernel_size, dilation=dilation)
+ self._conv = Conv1d(channels, mid_channels, kernel_size, dilation=dilation, groups=channels)
# Custom init: favors direct input-output
# self._conv.weight.data.zero_()
self._input_mixer = Conv1d(condition_size, mid_channels, 1, bias=False)
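One detail that makes this a drop-in change: PyTorch requires both `in_channels` and `out_channels` to be divisible by `groups`, which holds here because the layer maps `channels` to either `channels` or `2 * channels` (gated), and the output shape is unchanged, so nothing downstream needs to be touched. A quick check of that, again with made-up sizes rather than the real NAM config:

```python
import torch
from torch.nn import Conv1d

# Made-up sizes; the real values come from the WaveNet config.
channels, kernel_size, dilation, gated = 16, 3, 2, True
mid_channels = 2 * channels if gated else channels

dense = Conv1d(channels, mid_channels, kernel_size, dilation=dilation)
depthwise = Conv1d(channels, mid_channels, kernel_size, dilation=dilation, groups=channels)

# Same output shape, so the rest of the layer is untouched.
x = torch.randn(1, channels, 64)
assert dense(x).shape == depthwise(x).shape  # both (1, 32, 60)
```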
On the base WaveNet architecture, the number of parameters drops from 13.8k to 4.9k. After 1000 training epochs with my settings, I get the following results (ESR; see the sketch after the list for its definition):
- Clean, base WaveNet. ESR: 0.0007998017827048898
- Clean, depthwise WaveNet. ESR: 0.000936342345084995
- High-gain, base WaveNet. ESR: 0.002280422719195485
- High-gain, depthwise WaveNet. ESR: 0.003808232955634594
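For context, ESR here refers to the error-to-signal ratio; assuming the standard definition (squared error normalized by the energy of the target signal), it can be computed as in this minimal sketch:

```python
import torch

def esr(preds: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Error-to-signal ratio: lower is better, 0.0 is a perfect match."""
    return torch.sum((preds - targets) ** 2) / torch.sum(targets ** 2)
```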
As you can see, the drop in performance is quite small. However, I could not test or listen to the resulting models because I did not update the core C++ operations. If there is interest in this change, I can contribute it.