Description
I’m using the latest commit and trying to reproduce the results reported in the paper for various models, but the results I get don't match. After reading the code, I also noticed that the scripts provided under scripts/ for PatchTST and TSMixer do not seem to match the official implementations from the original papers, at least in the default configs and in some architectural choices. I’m opening this issue mainly to confirm whether these deviations are intentional (i.e., a simplified/benchmarking variant) or unintended.
For PatchTST, the official implementation often exposes dropout separately for different components (e.g., head / FFN / attention), but in this repo it appears to be a single unified dropout value. More broadly, the PatchTST setup in scripts/ appears to differ from the official author repo's defaults in both structure and configuration.
For TSMixer, the scripts/configs do not match the paper/official defaults for some key hyperparameters (e.g., dropout rate, number of encoder layers). The learning-rate schedule settings (lradj) also seem different from what the official code/paper describes.
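To make a comparison like this concrete, here is a minimal sketch of a helper that lists the hyperparameters on which two configs disagree. The function name `diff_configs`, the key names other than `lradj`, and all values are hypothetical placeholders chosen for illustration; they are not the actual settings from this repo, the official repos, or the papers.

```python
MISSING = "<missing>"  # marker for a key that exists on only one side


def diff_configs(ours, reference):
    """Return {key: (our_value, reference_value)} for every key that differs."""
    differences = {}
    for key in sorted(set(ours) | set(reference)):
        our_value = ours.get(key, MISSING)
        ref_value = reference.get(key, MISSING)
        if our_value != ref_value:
            differences[key] = (our_value, ref_value)
    return differences


# Placeholder values only: NOT the real defaults of any repo or paper.
script_cfg = {"dropout": 0.1, "num_encoder_layers": 2, "lradj": "type1"}
reference_cfg = {
    "dropout": 0.1,
    "head_dropout": 0.0,  # hypothetical per-component dropout key
    "num_encoder_layers": 3,
    "lradj": "type1",
}

for key, (ours, ref) in diff_configs(script_cfg, reference_cfg).items():
    print(f"{key}: script={ours} reference={ref}")
```

Filling the two dictionaries with the real values from a script under scripts/ and from the corresponding official config would give an explicit list of the deviations to confirm.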
Could you confirm whether these deviations are intended variants? If they are unintended, would you be willing to update the scripts/configs to align with the official implementation/paper settings?
Thanks!