Why do we use MaskedLinear for the condition? #70
Description
It seems that zuko simply concatenates the features and the condition and then uses a MaskedLinear to handle them together (please correct me if I missed something). What if we used MaskedLinear only for the features and a plain Linear layer for the condition?
zuko/zuko/flows/autoregressive.py, lines 207 to 218 in 25fefe2
where the hyper-network is
zuko/zuko/flows/autoregressive.py, line 152 in 25fefe2
Implementation
The implementation would be like
Thanks in advance
Replies: 1 comment
Hi @yangysc, sorry for the delay, I was very busy with deadlines and my thesis.
The hyper-network `self.hyper` is a `MaskedMLP`. The goal of this network is to make the parameters $\phi_i$ of the transformation $y_i = f(x_i; \phi_i)$ depend only on the preceding features $x_{<i}$ and the context $c$. This is done with a series of masks that depend on the ordering of the variables.
If the hyper-network were a single `MaskedLinear` layer, then what you propose would have (almost) worked (it would be `MaskedLinear(features) + Linear(context)`). However, we want $\phi_i$ to be a non-linear combination of $x_{<i}$ and $c$. Therefore, after the first layer we have to use `MaskedLinear` layers only.
Please tell me if you have any other questions.
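To make the argument concrete, here is a minimal sketch in plain Python that tracks only which inputs each output can depend on. It assumes MADE-style degree masks, which may differ from how zuko actually builds its masks. The helper names (`mask`, `dense`, `compose`, `is_autoregressive`) and the degree assignments are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: MADE-style degree masks, not zuko's actual code.
# A mask is a boolean matrix: mask[o][i] is True if unit o may read unit i.

def mask(out_deg, in_deg, strict=False):
    """Masked layer: unit o reads unit i only if deg(i) <= deg(o) (or < if strict)."""
    if strict:
        return [[i < o for i in in_deg] for o in out_deg]
    return [[i <= o for i in in_deg] for o in out_deg]

def dense(n_out, n_in):
    """Plain Linear layer: every output reads every input."""
    return [[True] * n_in for _ in range(n_out)]

def compose(outer, inner):
    """Boolean matrix product: input dependencies of outer(inner(.))."""
    return [
        [any(a and b for a, b in zip(row, col)) for col in zip(*inner)]
        for row in outer
    ]

def is_autoregressive(dep, features):
    """phi_i must not depend on x_j for j >= i (context columns are ignored)."""
    return all(not dep[i][j] for i in range(features) for j in range(i, features))

features, context = 3, 2
feat_deg = [1, 2, 3]                  # degrees of x_1, x_2, x_3
in_deg = feat_deg + [0] * context     # context gets degree 0: visible to all
hid_deg = [0, 1, 2, 0, 1, 2]          # assumed hidden-unit degrees
out_deg = [1, 2, 3]                   # one parameter phi_i per feature

first = mask(hid_deg, in_deg)
hidden = mask(hid_deg, hid_deg)
last = mask(out_deg, hid_deg, strict=True)

# First layer written as MaskedLinear(features) + Linear(context):
# masked feature columns next to dense context columns.
split_first = [
    f_row + c_row
    for f_row, c_row in zip(mask(hid_deg, feat_deg), dense(len(hid_deg), context))
]

# All layers masked: the autoregressive property holds.
masked_dep = compose(last, compose(hidden, first))

# Plain Linear layer after the first one: hidden units get mixed, property lost.
broken_dep = compose(last, compose(dense(len(hid_deg), len(hid_deg)), first))

print(split_first == first)
print(is_autoregressive(masked_dep, features))
print(is_autoregressive(broken_dep, features))
```

Under these assumptions, splitting the first layer changes nothing, because the context columns of the mask are already fully connected. The property only breaks once an unmasked layer sits between masked ones: the hidden units already mix $x_{<i}$ and $c$, so a dense layer leaks later features into earlier parameters.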