DEV_MANUAL.md (12 additions & 10 deletions)
@@ -28,7 +28,9 @@ Please use blacks & ruff to format any code contributions, we have a pre-commit
## Virtual environment
-
To create the virtual environment for AutoXAI4Omics, use an environment manager of your choice (conda, for example) with `python3.9` as your starting point. Then install the contents of both `requirements_dev.txt` and `requirements.txt`. Note that you may also need to set `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` within your environment.
+
To create the virtual environment for AutoXAI4Omics, use an environment manager of your choice (conda, for example) with `python3.9` as your starting point. Then install both the main and dev dependencies listed in `pyproject.toml`. Note that you may also need to set `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` within your environment.
+
+
*Note:* Since May '25 we have used [Poetry](https://python-poetry.org/) to manage our environment and dependencies, and we recommend using it too. An associated `poetry.lock` file ensures consistent versions between developers.
## Testing
@@ -40,31 +42,31 @@ We have pytests that can be executed to make sure the system works. The low level
## Adding a new data type
-
A new `data_type` option can be added to the code, along with its specific pre-processing steps. The source code for data-specific processing should be stored in its own file in the `src/omics` folder and then called in the `src/utils/load.py` file.
+
A new `data_type` option can be added to the code, along with its specific pre-processing steps. The source code for data-specific processing should be stored in its own file in the `autoxai4omics/omics` folder and then called in the `autoxai4omics/utils/load.py` file.
## Adding a new plot
-
In `src/plotting/plotting_utils.py`, the `define_plots()` function at the top specifies which plotting functions are available for regression and classification problems. Some plots are applicable to both, so add the alias (which is used in the config file) and the function object itself to the relevant dictionary (or dictionaries).
+
In `autoxai4omics/plotting/plotting_utils.py`, the `define_plots()` function at the top specifies which plotting functions are available for regression and classification problems. Some plots are applicable to both, so add the alias (which is used in the config file) and the function object itself to the relevant dictionary (or dictionaries).
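As a rough illustration of the registration step described above, the sketch below keeps one alias-to-function dictionary per problem type and adds a shared plot to both. Every name in it (`REGRESSION_PLOTS`, `CLASSIFICATION_PLOTS`, `define_plots_sketch`, the three plot functions and their aliases) is hypothetical; the real `define_plots()` may be organised differently.

```python
from typing import Callable, Dict


# Hypothetical plotting functions; the real ones live under the plotting folder.
def residuals_plot() -> str:
    return "residuals"


def confusion_plot() -> str:
    return "confusion"


def score_bar_plot() -> str:
    return "score_bar"


# Assumed layout: one registry per problem type, config alias -> function object.
REGRESSION_PLOTS: Dict[str, Callable[[], str]] = {"residuals": residuals_plot}
CLASSIFICATION_PLOTS: Dict[str, Callable[[], str]] = {"confusion": confusion_plot}

# A plot that applies to both problem types is added to both dictionaries.
for registry in (REGRESSION_PLOTS, CLASSIFICATION_PLOTS):
    registry["score_bar"] = score_bar_plot


def define_plots_sketch(problem_type: str) -> Dict[str, Callable[[], str]]:
    """Return the alias -> function mapping for the given problem type."""
    if problem_type == "regression":
        return REGRESSION_PLOTS
    if problem_type == "classification":
        return CLASSIFICATION_PLOTS
    raise ValueError(f"unknown problem type: {problem_type!r}")
```

Looking a plot up by its alias, for example `define_plots_sketch("regression")["score_bar"]`, then returns the function object to call.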
-
The code for plots that apply to both regression and classification problems is stored in `src/plotting/plots_both.py`; problem-specific plots are stored in `src/plotting/plots_reg.py` and `src/plotting/plots_clf.py` respectively. The exceptions are the SHAP plots and the permutation importance plots, whose code is contained in their own subfolders within `src/plotting`.
+
The code for plots that apply to both regression and classification problems is stored in `autoxai4omics/plotting/plots_both.py`; problem-specific plots are stored in `autoxai4omics/plotting/plots_reg.py` and `autoxai4omics/plotting/plots_clf.py` respectively. The exceptions are the SHAP plots and the permutation importance plots, whose code is contained in their own subfolders within `autoxai4omics/plotting`.
-
The function itself then needs to be added to the `plot_graphs()` function in `src/mode_plotting.py` with the relevant arguments. Some functions have been duplicated here with different arguments for easy access via the alias (allowing multiple calls to the same function from a single config file call).
+
The function itself then needs to be added to the `plot_graphs()` function in `autoxai4omics/mode_plotting.py` with the relevant arguments. Some functions have been duplicated here with different arguments for easy access via the alias (allowing multiple calls to the same function from a single config file call).
For plots that load a TensorFlow or Keras model, call `K.clear_session()` once the model has been used to ensure that no session or graph lingers. This is called after every plot function, but when a plotting function loads multiple TensorFlow models it will need to be called inside that function.
-
All plotting functions have a save argument that controls whether plots are saved or shown on screen; it defaults to `True`. For uniform parameters, save figures with the `save_fig()` function from `src/utils/save.py`, which calls matplotlib's usual `fig.savefig`. When loading models, use the `src.utils.load.load_model()` function. To define saving and loading for a _CustomModel_, see the section below about adding models.
+
All plotting functions have a save argument that controls whether plots are saved or shown on screen; it defaults to `True`. For uniform parameters, save figures with the `save_fig()` function from `autoxai4omics/utils/save.py`, which calls matplotlib's usual `fig.savefig`. When loading models, use the `autoxai4omics.utils.load.load_model()` function. To define saving and loading for a *CustomModel*, see the section below about adding models.
-
If the model has a useful hook to SHAP e.g. via the _TreeExplainer_, then make sure it is added in `src.plotting.shap.plots_shap.select_explainer()`.
+
If the model has a useful hook to SHAP e.g. via the *TreeExplainer*, then make sure it is added in `autoxai4omics.plotting.shap.plots_shap.select_explainer()`.
## Adding a new model
-
To add a new model, the parameter definitions need to be added to `src/models/model_params.py`, which has separate dictionaries of parameter definitions for grid or random search and for a single model. Similarly, new models need to be added to the `MODELS` dict in `src/models/model_defs.py`, which connects the input name to the model and the default parameters defined in `src/models/model_params.py`.
+
To add a new model, the parameter definitions need to be added to `autoxai4omics/models/model_params.py`, which has separate dictionaries of parameter definitions for grid or random search and for a single model. Similarly, new models need to be added to the `MODELS` dict in `autoxai4omics/models/model_defs.py`, which connects the input name to the model and the default parameters defined in `autoxai4omics/models/model_params.py`.
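The two registration steps above might look roughly like the following. This is only a sketch under stated assumptions: `ToyModel`, the `"toy"` input name, the parameter dictionary names and the `(class, defaults)` tuple layout of `MODELS` are all made up for illustration, and the actual files may structure these differently.

```python
from typing import Any, Dict, Tuple


class ToyModel:
    """Hypothetical estimator standing in for a real model class."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha


# Assumed layout for model_params.py: one dictionary per use,
# each keyed by the model's input name.
SINGLE_MODEL_PARAMS: Dict[str, Dict[str, Any]] = {"toy": {"alpha": 0.5}}
GRID_SEARCH_PARAMS: Dict[str, Dict[str, list]] = {"toy": {"alpha": [0.1, 0.5, 1.0]}}

# Assumed layout for model_defs.py:
# input name -> (model class, default parameters).
MODELS: Dict[str, Tuple[type, Dict[str, Any]]] = {
    "toy": (ToyModel, SINGLE_MODEL_PARAMS["toy"]),
}


def build_model(name: str) -> Any:
    """Instantiate the registered model with its default parameters."""
    model_cls, defaults = MODELS[name]
    return model_cls(**defaults)


model = build_model("toy")
```

Keeping the defaults in one place and referencing them from `MODELS` avoids the two files drifting apart.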
### CustomModel
-
In addition to the above, if the model is not part of scikit-learn, then it can be added as a subclass of the _CustomModel_ class (in `src/models/custom_model.py`). The methods of the base class show what needs to be defined in order for it to behave similarly to a sklearn model.
+
In addition to the above, if the model is not part of scikit-learn, then it can be added as a subclass of the *CustomModel* class (in `autoxai4omics/models/custom_model.py`). The methods of the base class show what needs to be defined in order for it to behave similarly to a sklearn model.
The key things to keep in mind are a way to save and load models, which may require temporarily deleting attributes that cannot be pickled e.g. a Tensorflow graph. Thus, when loading, these attributes will need to be added back in, e.g. by defining the graph again. If you encounter errors, first look at how the other subclasses (`MLPEnsemble` wrapping Tensorflow, and `MLPKeras` wrapping Keras) handled it.
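One common way to handle an attribute that cannot be pickled is sketched below using Python's `__getstate__` and `__setstate__` hooks. It is not necessarily how `MLPEnsemble` or `MLPKeras` do it: the class `PicklableSketch` and its `_build_graph()` helper are hypothetical, and a `threading.Lock` stands in for an unpicklable object such as a TensorFlow graph. Rather than deleting the attribute from the live object, this variant leaves it out of the saved state and rebuilds it on load.

```python
import pickle
import threading


class PicklableSketch:
    """Hypothetical model holding a resource that cannot be pickled."""

    def __init__(self, n_units: int = 8) -> None:
        self.n_units = n_units
        self.graph = self._build_graph()

    def _build_graph(self):
        # A lock stands in for an unpicklable object such as a graph or session.
        return threading.Lock()

    def __getstate__(self):
        # Leave the unpicklable attribute out of the saved state only;
        # the live object keeps its graph.
        state = self.__dict__.copy()
        del state["graph"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then define the graph again.
        self.__dict__.update(state)
        self.graph = self._build_graph()


model = PicklableSketch(n_units=16)
restored = pickle.loads(pickle.dumps(model))
```

After the round trip, `restored` carries the same plain attributes as `model` and a freshly built `graph`.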
@@ -73,7 +75,7 @@ Each subclass should have a `nickname` class attribute, which is the model's alia
## Adding a new scorer
-
To add a new measure, simply register the function in the dictionary in `src/metrics/metric_defs.py`.
+
To add a new measure, simply register the function in the dictionary in `autoxai4omics/metrics/metric_defs.py`.
The only caveat here is that the sklearn convention is that a higher value is better. This convention is used in the hyperparameter tuning, so when specifying a loss or an error you need to pass `greater_is_better=False` when calling `make_scorer()`. In this case the values become negative, so the absolute value needs to be taken when plotting (this can also be done for the .csv results if desired, but currently is not).
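The sign flip can be seen in a small sketch. `make_scorer`, `mean_absolute_error` and `DummyRegressor` are real scikit-learn names; the `METRICS` dictionary and the `"mean_ae"` key are assumptions made for illustration, and the actual dictionary in `metric_defs.py` may be shaped differently.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import make_scorer, mean_absolute_error

# Hypothetical registry: metric name -> scorer object.
METRICS = {
    # An error is "lower is better", so flip the sign for the tuning code.
    "mean_ae": make_scorer(mean_absolute_error, greater_is_better=False),
}

X = np.zeros((4, 1))
y = np.array([1.0, 2.0, 3.0, 4.0])
model = DummyRegressor(strategy="mean").fit(X, y)  # always predicts the mean of y

raw_error = mean_absolute_error(y, model.predict(X))  # plain, positive error
score = METRICS["mean_ae"](model, X, y)  # same magnitude, negated by the scorer
plot_value = abs(score)  # take the absolute value before plotting
```

Here `score` is the negative of `raw_error`, which is why plots take the absolute value.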
README.md (6 additions & 3 deletions)
@@ -41,7 +41,7 @@ AutoXAI4Omics is a command line automated explainable AI tool that easily enable
## How to install AutoXAI4Omics
-
1. Clone this repo however you choose (cli command: `git clone --single-branch --branch main git@github.ibm.com:BiomedSciAI-Innersource/AutoXAI4Omics.git`)
+
1. Clone this repo however you choose (cli command: `git clone --single-branch --branch main git@github.com:IBM/AutoXAI4Omics.git`)
2. Make sure `docker` is running (cli command: `docker version`, if installed the version information will be given)
3. Within the `AutoXAI4Omics` folder:
1. Run the following cli command to build the image: `./build.sh -r`
@@ -56,11 +56,14 @@ AutoXAI4Omics is a command line automated explainable AI tool that easily enable
For citation of this tool, please reference this article:
-
* James Strudwick, Laura-Jayne Gardiner, Kate Denning-James, Niina Haiminen, Ashley Evans, Jennifer Kelly, Matthew Madgwick, Filippo Utro, Ed Seabolt, Christopher Gibson, Bharat Bedi, Daniel Clayton, Ciaron Howell, Laxmi Parida, Anna Paola Carrieri. bioRxiv 2024.03.25.586460; doi: <https://doi.org/10.1101/2024.03.25.586460>
+
* James Strudwick, Laura-Jayne Gardiner, Kate Denning-James, Niina Haiminen, Ashley Evans, Jennifer Kelly, Matthew Madgwick, Filippo Utro, Ed Seabolt, Christopher Gibson, Bharat Bedi, Daniel Clayton, Ciaron Howell, Laxmi Parida, Anna Paola Carrieri. doi: <https://doi.org/10.1093/bib/bbae593>
+
<!-- bioRxiv 2024.03.25.586460; -->
+
+
**NOTE** The configs, data and results published with the paper were produced using version `1.0.0` of the tool. If you wish to reproduce the results, please make sure you pull the correct version. Otherwise you will need to update the configs to account for the improvements made in versions since the initial release.
## User manual
-
Everything is controlled through a config dictionary, examples of which can be found in the `configs/exmaples` folder. For an explanation of all parameters, please see the [***CONFIG MANUAL***](https://github.ibm.com/BiomedSciAI-Innersource/AutoXAI4Omics/blob/main/configs/CONFIG_MANUAL.md).
+
Everything is controlled through a config dictionary, examples of which can be found in the `configs/exmaples` folder. For an explanation of all parameters, please see the [***CONFIG MANUAL***](https://github.com/IBM/AutoXAI4Omics/blob/main/DEV_MANUAL.md).
The tool is launched from the cli using `autoxai4omics.sh`, which has multiple flags; examples are given below: