Commit e9f146f

v1.2.0 (#40) (#41)
* Bugfix: read batch input in autoxai4omics.sh
* 7 eli5 to shap (#28)
* Changed: Eli5 & scikit learn bump
* Changed: converted print statements to logger and send logging to stdout
* Fixed: missed loggers statement
* Changed: upgraded `xgboost` and `shap` to latest version
* Updated: Changelog
* Changed: upgraded eli5
* 29 change env manager to poetry (#34)
* Changed: updated gitignore
* Changed: tweaked config duplication naming
* Changed: renamed `src` to `autoxai4omics`
* Added: initial pyproject.toml and lock file
* Change: prelim docker image update
* Changed: added platform specific dependencies
* Changed: updated dockerfile
* Changed: updated dev manual
* Removed: removed depricated requirements txt
* Changed: updated changelog
* Changed: Author name
* 30 upgrade python version used (#35)
* Changed: updated python version in dockerfile
* Changed: required package changes for python 3.11
* 38 documentation update (#39)
* Fixed: correct pull command statement
* Fixed: corrected dev manual link
* Fixed: updated doi
* Added: warning about version to use with published data & configs
* Removed: biorxiv doi
* version bump
1 parent 9593f9e commit e9f146f

99 files changed: +5790 -251 lines changed

Some content is hidden

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -16,3 +16,4 @@ playground.ipynb
 coverage.xml
 .coverage
 ao-env
+dev_scripts

.python-version

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+3.9
+3.10
+3.11

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -18,6 +18,18 @@
 
 Change log for the codebase. Initialised from the developments following version `V0.11.3`
 
+## [unreleased] - 2025-05-09
+
+### Changed
+
+- Changed logger to also send to stdout and converted more print statements to loggers.
+- Upgraded:
+  - Eli5
+  - scikit-learn (also resolves dependabot alert)
+  - xgboost
+  - shap
+- Changed env manager to be poetry based
+
 ## [v1.1.1] - 2025-03-04
 
 ### Fixed

DEV_MANUAL.md

Lines changed: 12 additions & 10 deletions
@@ -28,7 +28,9 @@ Please use blacks & ruff to format any code contributions, we have a pre-commit
 
 ## Virtual enviroment
 
-To create the virtual enviroment for AutoXAI4Omics using an enviroment manager of your choice, like conda for example, using `python3.9` as your starting point. Then proceed to install the contents of both `requirements_dev.txt` and `requirements.txt`. Note that you may also need to set `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` within your enviroment.
+To create the virtual enviroment for AutoXAI4Omics using an enviroment manager of your choice, like conda for example, using `python3.9` as your starting point. Then proceed to install the contents of `pyproject.toml` for both the main and dev dependencies. Note that you may also need to set `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` within your enviroment.
+
+*Note:* Since May '25 we have switched to using [Poetry](https://python-poetry.org/) for managing our env and dependencies which we also recommend to use. There is also an associated `poetry.lock` file to ensure consistent versions between developers.
 
 ## Testing
 
@@ -40,31 +42,31 @@ We have pytests that can be excuted to make sure the system works. The low level
 
 ## Adding a new data type
 
-A new `data_type` option can be added to the code, with associated specific pre-procrssing steps. The source code for data-specific processing should be stored in its own file in the `src/omics` folder and then called in the `src/utils/load.py` file.
+A new `data_type` option can be added to the code, with associated specific pre-procrssing steps. The source code for data-specific processing should be stored in its own file in the `autoxai4omics/omics` folder and then called in the `autoxai4omics/utils/load.py` file.
 
 ## Adding a new plot
 
-In `src/plotting/plotting_utils.py`, the `define_plots()` function at the top specifies which plotting functions are available for regression and classification problems. Some plots are applicable to both, so add the alias (which is used in the config file) and the function object itself to the relevant dictionary (or -ies).
+In `autoxai4omics/plotting/plotting_utils.py`, the `define_plots()` function at the top specifies which plotting functions are available for regression and classification problems. Some plots are applicable to both, so add the alias (which is used in the config file) and the function object itself to the relevant dictionary (or -ies).
 
-The code for plots that are applicable to both Regression and Classification problems are stored in `src/plotting/plots_both.py`, problem specific plots are stored in the respective `src/plotting/plots_reg.py` and `src/plotting/plots_clf.py`. The exception being the code for the shap plots and the permutation importance plots, these are contained in their own subfolders within `src/plotting`
+The code for plots that are applicable to both Regression and Classification problems are stored in `autoxai4omics/plotting/plots_both.py`, problem specific plots are stored in the respective `autoxai4omics/plotting/plots_reg.py` and `autoxai4omics/plotting/plots_clf.py`. The exception being the code for the shap plots and the permutation importance plots, these are contained in their own subfolders within `autoxai4omics/plotting`
 
-The function itself then needs to be added to the `plot_graphs()` function in `src/mode_plotting.py` with the relevant arguments. Some functions have been duplicated here with different arguments for easy access via the alias (allowing multiple calls to the same function from a single config file call).
+The function itself then needs to be added to the `plot_graphs()` function in `autoxai4omics/mode_plotting.py` with the relevant arguments. Some functions have been duplicated here with different arguments for easy access via the alias (allowing multiple calls to the same function from a single config file call).
 
 For plots that load a Tensorflow or Keras model, after that model is used you will need to call `K.clear_session()` to ensure that there is no lingering session or graph. This is called after every plot function, but when loading multiple Tensorflow models this will need to be called inside the plotting function.
 
-All plotting functions have a save argument to allow plots to be shown on the screen or saved, though this defaults to `True`. For uniform parameters, when saving use the `save_fig()` function, from `src/utils/save.py`, that calls the usual `fig.savefig` function in matplotlib. When loading models, do this through the `src.utils.load.load_model()` function. For defining the saving and loading for a _CustomModel_, see the section below about adding models.
+All plotting functions have a save argument to allow plots to be shown on the screen or saved, though this defaults to `True`. For uniform parameters, when saving use the `save_fig()` function, from `autoxai4omics/utils/save.py`, that calls the usual `fig.savefig` function in matplotlib. When loading models, do this through the `autoxai4omics.utils.load.load_model()` function. For defining the saving and loading for a *CustomModel*, see the section below about adding models.
 
-If the model has a useful hook to SHAP e.g. via the _TreeExplainer_, then make sure it is added in `src.plotting.shap.plots_shap.select_explainer()`.
+If the model has a useful hook to SHAP e.g. via the *TreeExplainer*, then make sure it is added in `autoxai4omics.plotting.shap.plots_shap.select_explainer()`.
 
 
 ## Adding a new model
 
-To add a new model, the parameter definitions need to added to `src/models/model_params.py`, which has separate dictionaries for parameter definitions for grid or random search, as well as a single model. Similarly, new models need to be added to the `MODELS` dict in `src/models/model_defs.py`, which connects the input name to the model and the default paramters defined in `src/models/model_params.py`
+To add a new model, the parameter definitions need to added to `autoxai4omics/models/model_params.py`, which has separate dictionaries for parameter definitions for grid or random search, as well as a single model. Similarly, new models need to be added to the `MODELS` dict in `autoxai4omics/models/model_defs.py`, which connects the input name to the model and the default paramters defined in `autoxai4omics/models/model_params.py`
 
 
 ### CustomModel
 
-In addition to the above, if the model is not part of scikit-learn, then it can be added as a subclass of the _CustomModel_ class (in `src/models/custom_model.py`). The methods of the base class show what needs to be defined in order for it to behave similarly to a sklearn model.
+In addition to the above, if the model is not part of scikit-learn, then it can be added as a subclass of the *CustomModel* class (in `autoxai4omics/models/custom_model.py`). The methods of the base class show what needs to be defined in order for it to behave similarly to a sklearn model.
 
 The key things to keep in mind are a way to save and load models, which may require temporarily deleting attributes that cannot be pickled e.g. a Tensorflow graph. Thus, when loading, these attributes will need to be added back in, e.g. by defining the graph again. If you encounter errors, first look at how the other subclasses (`MLPEnsemble` wrapping Tensorflow, and `MLPKeras` wrapping Keras) handled it.
 
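The save/load caveat in the `CustomModel` paragraph of this hunk (drop attributes that cannot be pickled, then rebuild them on load) can be illustrated with a minimal standard-library sketch. This is not the project's `CustomModel` code: the class `PicklableWrapper` and its `_session` attribute are hypothetical, and a `threading.Lock` merely stands in for an unpicklable object such as a Tensorflow graph.

```python
import pickle
import threading


class PicklableWrapper:
    """Hypothetical stand-in for a CustomModel subclass (not the project's class)."""

    def __init__(self, weights):
        self.weights = weights
        # A lock cannot be pickled; here it plays the role of e.g. a Tensorflow graph.
        self._session = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and drop the attribute that cannot be pickled.
        state = self.__dict__.copy()
        state.pop("_session", None)
        return state

    def __setstate__(self, state):
        # Restore the picklable attributes, then rebuild the dropped one.
        self.__dict__.update(state)
        self._session = threading.Lock()


model = PicklableWrapper(weights=[0.1, 0.2])
restored = pickle.loads(pickle.dumps(model))
print(restored.weights)
```

Without the `__getstate__` hook, `pickle.dumps(model)` would raise a `TypeError` because of the lock; with it, the round trip succeeds and the restored object gets a fresh `_session`.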
@@ -73,7 +75,7 @@ Each subclass should has a `nickname` class attribute, which is the model's alia
 
 ## Adding a new scorer
 
-To add a new measure, simply register the function in the dictionary in `src/metrics/metric_defs.py`.
+To add a new measure, simply register the function in the dictionary in `autoxai4omics/metrics/metric_defs.py`.
 
 The only caveat here is that the sklearn convention is that a higher value is better. This convention is used in the hyperparameter tuning, and so when specifying a loss or an error, then when calling `make_scorer()` then you need to pass `greater_is_better=False`. In this case, the values become negative, so when plotting the absolute value needs to be taken (this can also be done for the .csv results if desired, but is not currently).
 
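The sign convention described in the scorer paragraph above can be sketched in plain Python. This is only an illustration of the idea, not scikit-learn's implementation and not the contents of `metric_defs.py`: `mean_absolute_error` and `negate_loss` are hypothetical helpers defined here, and the data is made up.

```python
def mean_absolute_error(y_true, y_pred):
    """A loss: lower is better."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def negate_loss(loss_fn):
    """Hypothetical helper: turn a loss into a 'higher is better' score by flipping its sign."""
    def score(y_true, y_pred):
        return -loss_fn(y_true, y_pred)
    return score


neg_mae = negate_loss(mean_absolute_error)

y_true = [3.0, 5.0, 2.0]
scores = {
    "good": neg_mae(y_true, [3.0, 5.0, 3.0]),  # small error, score close to zero
    "bad": neg_mae(y_true, [0.0, 0.0, 0.0]),   # large error, strongly negative score
}

# Model selection can now always maximise, whatever the underlying metric is.
best = max(scores, key=scores.get)

# For plotting, take the absolute value to recover the original error magnitude.
plotted = {name: abs(value) for name, value in scores.items()}
print(best, plotted)
```

Flipping the sign lets tuning code treat every metric the same way, at the cost of having to undo the flip wherever the raw error is reported.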

Dockerfile

Lines changed: 18 additions & 24 deletions
@@ -13,35 +13,29 @@
 # limitations under the License.
 
 # Set base image and key env vars
-FROM python:3.9.18
-# ENV DEBIAN_FRONTEND="noninteractive"
-
-# Default 1001 - non privileged uid
-ARG USER_ID=1001
-ENV TF_CPP_MIN_LOG_LEVEL '2'
-ENV PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION 'python'
-# Upgrade installed packages
+FROM python:3.11.12
 RUN apt-get update && apt-get upgrade -y && apt-get clean
-RUN apt-get install -y software-properties-common git
 
-# upgrade pip
-RUN python -m pip install --upgrade pip setuptools
-
-# Add omicuser and set env vars
-# Give omicsuser gid 0 so has root group permissions to read files,
-# and is the same gid as Openshift users. Compatible with Openshift and k8s
+ENV TF_CPP_MIN_LOG_LEVEL='2'
+ENV PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION='python'
+ARG USER_ID=1001
 RUN useradd -l -m -s /bin/bash --uid ${USER_ID} -g 0 omicsuser
 
+ENV POETRY_NO_INTERACTION=1 \
+    POETRY_VIRTUALENVS_CREATE=false \
+    POETRY_HOME='/usr/local' \
+    POETRY_VERSION=2.1.3
+
+RUN curl -sSL https://install.python-poetry.org | python3 -
 WORKDIR /home/omicsuser
 
-# Install required Python packages use block below if fixing other packages for the first time, use other
-COPY requirements.txt .
-RUN pip install -r requirements.txt
 
-# grant write permissions to these folders
-COPY --chown=omicsuser:0 src .
 
-# Use 'omicsuser' user - this is overruled in Openshift
-USER omicsuser
-# init run command
-CMD ["$@"]
+COPY --chown=omicsuser:0 poetry.lock pyproject.toml ./
+COPY --chown=omicsuser:0 autoxai4omics .
+
+RUN poetry env use system
+RUN poetry install --no-root
+USER omicsuser
+
+CMD ["$@"]

README.md

Lines changed: 6 additions & 3 deletions
@@ -41,7 +41,7 @@ AutoXAI4Omics is a command line automated explainable AI tool that easily enable
 
 ## How to install AutoXAI4Omics
 
-1. Clone this repo however you choose (cli command: `git clone --single-branch --branch main git@github.ibm.com:BiomedSciAI-Innersource/AutoXAI4Omics.git`)
+1. Clone this repo however you choose (cli command: `git clone --single-branch --branch main git@github.com:IBM/AutoXAI4Omics.git`)
 2. Make sure `docker` is running (cli command: `docker version`, if installed the version information will be given)
 3. Within the `AutoXAI4Omics` folder:
     1. Run the following cli command to build the image: `./build.sh -r`
@@ -56,11 +56,14 @@ AutoXAI4Omics is a command line automated explainable AI tool that easily enable
 
 For citation of this tool, please reference this article:
 
-* James Strudwick, Laura-Jayne Gardiner, Kate Denning-James, Niina Haiminen, Ashley Evans, Jennifer Kelly, Matthew Madgwick, Filippo Utro, Ed Seabolt, Christopher Gibson, Bharat Bedi, Daniel Clayton, Ciaron Howell, Laxmi Parida, Anna Paola Carrieri. bioRxiv 2024.03.25.586460; doi: <https://doi.org/10.1101/2024.03.25.586460>
+* James Strudwick, Laura-Jayne Gardiner, Kate Denning-James, Niina Haiminen, Ashley Evans, Jennifer Kelly, Matthew Madgwick, Filippo Utro, Ed Seabolt, Christopher Gibson, Bharat Bedi, Daniel Clayton, Ciaron Howell, Laxmi Parida, Anna Paola Carrieri. doi: <https://doi.org/10.1093/bib/bbae593>
+<!-- bioRxiv 2024.03.25.586460; -->
+
+**NOTE** The configs, data and results published with the paper were produced using version `1.0.0` of the tool. If you wish to reproduced the results please make sure you pull the correct version. Otherwise you will need to update the configs to account for the improvements that have been made in subsequent versions since the initial release.
 
 ## User manual
 
-Everything is controlled through a config dictionary, examples of which can be found in the `configs/exmaples` folder. For an explanation of all parameters, please see the [***CONFIG MANUAL***](https://github.ibm.com/BiomedSciAI-Innersource/AutoXAI4Omics/blob/main/configs/CONFIG_MANUAL.md).
+Everything is controlled through a config dictionary, examples of which can be found in the `configs/exmaples` folder. For an explanation of all parameters, please see the [***CONFIG MANUAL***](https://github.com/IBM/AutoXAI4Omics/blob/main/DEV_MANUAL.md).
 
 The tool is launched in the cli using `autoxai4omics.sh` which has multiple flags, examples will be given below:
 

_version.py

Lines changed: 1 addition & 1 deletion
@@ -13,4 +13,4 @@
 # limitations under the License.
 
 # current version of the tool
-__version__ = "1.1.1"
+__version__ = "1.2.0"

autoxai4omics.sh

Lines changed: 2 additions & 2 deletions
@@ -24,7 +24,7 @@ VOL_MAPS="-v ${PWD}/configs:/configs -v ${PWD}/data:/data -v ${PWD}/experiments:
 
 echo "Getting flags"
 #get variables from input
-while getopts 'm:c:rgdn' OPTION; do
+while getopts 'm:c:rgdn:' OPTION; do
     case "$OPTION" in
         m)
             case "${OPTARG}" in
@@ -148,4 +148,4 @@ else
         $VOL_MAPS \
         $IMAGE_FULL \
         bash
-fi
+fi
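The one-character change to the `getopts` option string above (`n` to `n:`) makes `-n` consume the following word as its argument instead of acting as a bare flag. Python's standard `getopt` module follows the same colon convention, so the difference can be sketched as below; the argument values `train` and `batch_01` are made up for illustration and are not taken from the script.

```python
import getopt

# Hypothetical command line: "-m train -n batch_01"
argv = ["-m", "train", "-n", "batch_01"]

# Old option string: "n" has no trailing colon, so -n is a bare flag and
# "batch_01" is left over as a positional argument.
old_opts, old_rest = getopt.getopt(argv, "m:c:rgdn")

# New option string: "n:" means -n requires an argument, so "batch_01"
# is captured as the value of -n.
new_opts, new_rest = getopt.getopt(argv, "m:c:rgdn:")

print(old_opts, old_rest)
print(new_opts, new_rest)
```

In the shell script the analogous effect is that `OPTARG` is only populated for `-n` once the colon is present.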
Lines changed: 12 additions & 7 deletions
@@ -1,11 +1,11 @@
 # Copyright 2024 IBM Corp.
-#
+#
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
-#
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-#
+#
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@@ -15,16 +15,21 @@
 version: 1
 formatters:
   simple:
-    format: '%(name)s - %(asctime)s - %(filename)s - %(funcName)s() - %(levelname)s : %(message)s'
+    format: "%(name)s - %(asctime)s - %(filename)s - %(funcName)s() - %(levelname)s : %(message)s"
 handlers:
   file:
     class: logging.FileHandler
     level: DEBUG
     formatter: simple
     filename:
-    mode: 'a'
+    mode: "a"
+  console:
+    class: logging.StreamHandler
+    formatter: simple
+    level: INFO
+    stream: ext://sys.stdout
 loggers:
   OmicLogger:
     level: DEBUG
-    handlers: [file]
-    propagate: no
+    handlers: [file, console]
+    propagate: no
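The effect of the added `console` handler can be tried out with the standard library's `logging.config.dictConfig`, using a dictionary that mirrors the YAML above. This is a sketch under assumptions: the diff leaves `filename:` empty, so the temporary log path used here is invented for the example, and how the project actually loads the YAML is not shown in this commit.

```python
import logging
import logging.config
import os
import tempfile

# Assumption: the YAML leaves `filename` blank, so a temporary path is used here.
log_path = os.path.join(tempfile.mkdtemp(), "example.log")

config = {
    "version": 1,
    "formatters": {
        "simple": {
            "format": "%(name)s - %(asctime)s - %(filename)s - "
                      "%(funcName)s() - %(levelname)s : %(message)s"
        }
    },
    "handlers": {
        "file": {
            "class": "logging.FileHandler",
            "level": "DEBUG",
            "formatter": "simple",
            "filename": log_path,
            "mode": "a",
        },
        # The handler added in this commit: INFO and above also go to stdout.
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        "OmicLogger": {
            "level": "DEBUG",
            "handlers": ["file", "console"],
            "propagate": False,
        }
    },
}

logging.config.dictConfig(config)
logger = logging.getLogger("OmicLogger")

logger.debug("written to the file only")           # below the console handler's INFO level
logger.info("written to the file and to stdout")   # passes both handlers
```

Because the console handler is set to `INFO`, debug messages still reach only the log file, while informational messages and above now appear in the terminal as well.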
