Description
Answering a discussion started in #205. We can also discuss in person, but I will try to explain ahead of that.
As it is now, the same variables (mostly `s_` sums) are listed several times: in multiple `keep`/`drop` statements, as well as in their "definition".
This repetition makes it easier to introduce mistakes accidentally, such as by forgetting to add the variable in all the necessary places. These are also quite long lists, which take up a big chunk of the model without affecting the behaviour much!
It should be possible to remove this duplication by using the `%include` statement to read them from a separate file.
This would mean that:
- we will be less likely to only add the variable in one place and forget it in the others
- when we catch a mistake like this, we only need to fix it in one place (the external file)
- we can enforce checks on the file to catch mistakes more easily, such as ensuring that variables are not mentioned twice (this is currently hard because we would need to extract the list of variables from the main file)
- we separate the main parts of the model from the "boilerplate" text (e.g. the `keep` statements), which simplifies the navigation of the model
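
As a rough illustration of the kind of check mentioned above, here is a minimal Python sketch that reports variables listed more than once. The file layout is an assumption (whitespace-separated names, with lines starting with `*` treated as comments), and the function and variable names are hypothetical, not taken from the model. Names are compared case-insensitively because SAS variable names are case-insensitive.

```python
from collections import Counter


def find_duplicate_variables(text):
    """Return names that appear more than once, in first-seen order.

    Assumed format (hypothetical): whitespace-separated variable names,
    with lines starting with '*' treated as comments.
    """
    names = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("*"):
            continue  # skip blank lines and comment lines
        # SAS variable names are case-insensitive, so normalise the case
        names.extend(name.lower() for name in stripped.split())
    counts = Counter(names)
    return [name for name in counts if counts[name] > 1]


# Hypothetical contents of a "summable variables" file
example = """
* summable variables (made-up names)
s_alive s_tested
s_onart s_alive
"""
print(find_duplicate_variables(example))
```

A check like this could run in CI against each of the external list files, so a duplicated name is caught before the model is run.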
The above is something of a simplification: the list of variables is not always the same. For example, the `keep` statements do differ, and they include more than just the `s_` variables. In reality, we may need two or three files (e.g. "initial variables to be kept", "final variables to be kept", "summable variables"), but the main idea would be the same.
Before doing this, we should have a way of checking that it doesn't change any output. So, if agreed, this wouldn't be attempted before completing the changes for running with a fixed seed (#194).
We should also make sure that the lists of variables are only read in once per simulation, to maintain performance (but that should be easy).