Merge pull request #65 from BiomedSciAI-Innersource/test

DrJStrudwick · GitHub Enterprise · commit 65b80f1bf6d1 · 2024-01-17T15:56:57.000Z
OS Release
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -18,3 +18,18 @@ repos:
       - id: black
         language_version: python3.9
         args: [--line-length=120]
+
+  - repo: https://github.com/ibm/detect-secrets
+    # If you desire to use a specific version of detect-secrets, you can replace `master` with other git revisions such as branch, tag or commit sha.
+    # You are encouraged to use static refs such as tags, instead of branch name
+    #
+    # Running "pre-commit autoupdate" automatically updates rev to latest tag
+    rev: 0.13.1+ibm.61.dss
+    hooks:
+      - id: detect-secrets # pragma: whitelist secret
+        # Add options for detect-secrets-hook binary. You can run `detect-secrets-hook --help` to list out all possible options.
+        # You may also run `pre-commit run detect-secrets` to preview the scan result.
+        # when "--baseline" without "--use-all-plugins", pre-commit scan with just plugins in baseline file
+        # when "--baseline" with "--use-all-plugins", pre-commit scan with all available plugins
+        # add "--fail-on-unaudited" to fail pre-commit for unaudited potential secrets
+        args: [--baseline, .secrets.baseline, --use-all-plugins]
diff --git a/.secrets.baseline b/.secrets.baseline
@@ -0,0 +1,85 @@
+{
+  "exclude": {
+    "files": "^.secrets.baseline$",
+    "lines": null
+  },
+  "generated_at": "2024-01-15T10:01:47Z",
+  "plugins_used": [
+    {
+      "name": "AWSKeyDetector"
+    },
+    {
+      "name": "ArtifactoryDetector"
+    },
+    {
+      "name": "AzureStorageKeyDetector"
+    },
+    {
+      "base64_limit": 4.5,
+      "name": "Base64HighEntropyString"
+    },
+    {
+      "name": "BasicAuthDetector"
+    },
+    {
+      "name": "BoxDetector"
+    },
+    {
+      "name": "CloudantDetector"
+    },
+    {
+      "ghe_instance": "github.ibm.com",
+      "name": "GheDetector"
+    },
+    {
+      "name": "GitHubTokenDetector"
+    },
+    {
+      "hex_limit": 3,
+      "name": "HexHighEntropyString"
+    },
+    {
+      "name": "IbmCloudIamDetector"
+    },
+    {
+      "name": "IbmCosHmacDetector"
+    },
+    {
+      "name": "JwtTokenDetector"
+    },
+    {
+      "keyword_exclude": null,
+      "name": "KeywordDetector"
+    },
+    {
+      "name": "MailchimpDetector"
+    },
+    {
+      "name": "NpmDetector"
+    },
+    {
+      "name": "PrivateKeyDetector"
+    },
+    {
+      "name": "SlackDetector"
+    },
+    {
+      "name": "SoftlayerDetector"
+    },
+    {
+      "name": "SquareOAuthDetector"
+    },
+    {
+      "name": "StripeDetector"
+    },
+    {
+      "name": "TwilioKeyDetector"
+    }
+  ],
+  "results": {},
+  "version": "0.13.1+ibm.62.dss",
+  "word_list": {
+    "file": null,
+    "hash": null
+  }
+}
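
Review note: the hook above is pinned to `rev: 0.13.1+ibm.61.dss`, while this baseline records `"version": "0.13.1+ibm.62.dss"`. This may well be harmless (for example, the baseline could have been generated with a newer local install), but it is easy to check mechanically. The snippet below is a minimal sketch, not part of this commit. The helper name `summarise_baseline` and the trimmed sample are hypothetical, and it assumes `results` maps file paths to lists of findings.

```python
import json


def summarise_baseline(baseline_text: str, hook_rev: str) -> dict:
    """Summarise a detect-secrets baseline supplied as JSON text.

    Hypothetical helper for illustration; not part of the repository.
    """
    baseline = json.loads(baseline_text)
    # Plugin names as recorded under "plugins_used"
    plugins = sorted(p["name"] for p in baseline.get("plugins_used", []))
    # Assumption: "results" maps file paths to lists of recorded findings
    finding_count = sum(len(v) for v in baseline.get("results", {}).values())
    return {
        "plugins": plugins,
        "finding_count": finding_count,
        # True only when the baseline was written by the same version as the hook rev
        "version_matches_hook": baseline.get("version") == hook_rev,
    }


# Trimmed sample mirroring the shape of the baseline in this diff
SAMPLE = json.dumps(
    {
        "plugins_used": [{"name": "KeywordDetector"}, {"name": "AWSKeyDetector"}],
        "results": {},
        "version": "0.13.1+ibm.62.dss",
    }
)

summary = summarise_baseline(SAMPLE, hook_rev="0.13.1+ibm.61.dss")
print(summary)
```

In practice the text would come from reading `.secrets.baseline` and the rev from `.pre-commit-config.yaml`; both are hard-coded here to keep the sketch self-contained.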
diff --git a/.travis.yml b/.travis.yml
@@ -22,7 +22,7 @@ services:
 
 env:
   global:
-    - IMAGE_NAME=omixai
+    - IMAGE_NAME=autoxai4omics
 
 # safelist - only work with these github branches
 branches:
diff --git a/.whitesource b/.whitesource
@@ -1,8 +1,6 @@
 {
     "settingsInheritedFrom": "whitesource-config/whitesource-config@issues_none",
     "scanSettings": {
-        "baseBranches": [
-            "DEV"
-        ]
+        "configMode": "LOCAL"
     }
-}
+}
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -80,7 +80,7 @@ Change log for the codebase. Initialised from the developments following version
     - tests_mode grabbing the wrong file at times when lots present
     - Arm fixes for previous security fix
     - test_modes bugfix
-    - name change within code base & repo (from `Auto-Omics` to `OmiXai`)
+    - name change within code base & repo (from `Auto-Omics` to `AutoXAI4Omics`)
     - provided example data & config users can run
     - corrected test_model_outputs
     - added skip conditions for test_omic_datasets
@@ -92,3 +92,5 @@ Change log for the codebase. Initialised from the developments following version
     - tests requiring a container marked and fixture added to build the container
     - `CONTRIBUTING.md` added and info added to `DEV_MANUAL.md`
     - `DEV_MANUAL.md` updated
+    - Detect secrets added
+    - Upgraded python base image from `3.9.14` to `3.9.18` for additional security fixes
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -15,7 +15,7 @@
 -->
 # Contributing
 
-This file provides general guidance for anyone contributing to IBM OmiXai. For technical details on improving the code base please see the `DEV_MANUAL.md`, this contains general information on how to contibute to this repositry.
+This file provides general guidance for anyone contributing to IBM AutoXAI4Omics. For technical details on improving the code base, please see `DEV_MANUAL.md`; this file contains general information on how to contribute to this repository.
 
 ## Branch managment
 
@@ -62,7 +62,7 @@ and "help wanted" is open to whoever wants to implement it.
 
 ### Write Documentation/tests
 
-OmiXai could always use more documentation, whether as part of the
+AutoXAI4Omics could always use more documentation, whether as part of the
 official docs, in docstrings, or more tests to increase the reliability and coverage.
 
 ## Developer's Certificate of Origin 1.1
diff --git a/DEV_MANUAL.md b/DEV_MANUAL.md
@@ -76,3 +76,7 @@ Each subclass should has a `nickname` class attribute, which is the model's alia
 To add a new measure, simply register the function in the dictionary in `src/metrics/metric_defs.py`.
 ​
 The only caveat here is that the sklearn convention is that a higher value is better. This convention is used in the hyperparameter tuning, and so when specifying a loss or an error, then when calling `make_scorer()` then you need to pass `greater_is_better=False`. In this case, the values become negative, so when plotting the absolute value needs to be taken (this can also be done for the .csv results if desired, but is not currently).
+
+## Container security
+
+If you need an image with fewer vulnerabilities, or have other requirements, you can change the base image in the `Dockerfile` to whatever suits your needs. The only requirement is that `python3.9` is installed.
diff --git a/Dockerfile b/Dockerfile
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 # Set base image and key env vars
-FROM python:3.9.14
+FROM python:3.9.18
 # ENV DEBIAN_FRONTEND="noninteractive"
 
 # Default 1001 - non privileged uid
@@ -25,7 +25,7 @@ RUN apt-get update && apt-get upgrade -y && apt-get clean
 RUN apt-get install -y software-properties-common git
 
 # upgrade pip
-RUN python -m pip install --upgrade pip
+RUN python -m pip install --upgrade pip setuptools
 
 # Add omicuser and set env vars
 # Give omicsuser gid 0 so has root group permissions to read files, 
diff --git a/Dockerfile.gpu b/Dockerfile.gpu
diff --git a/README.md b/README.md
@@ -14,9 +14,9 @@
  limitations under the License.
 -->
 
-# OmiXai: an Explainable Auto-AI tool for omics and tabular data
+# Automated Explainable AI for Omics (AutoXAI4Omics): an Explainable Auto-AI tool for omics and tabular data
 
-OmiXai is a command line automated explainable AI tool that easily enable researchers to perform phenotype prediction from omics data (e.g., gene expression; microbiome data; or any tabular data) and any tabular data (e.g., clinical) using a range of ML models.
+AutoXAI4Omics is a command line automated explainable AI tool that enables researchers to easily perform phenotype prediction from omics data (e.g., gene expression or microbiome data) and any other tabular data (e.g., clinical) using a range of ML models.
 
 *Key features include*:
 
@@ -31,13 +31,9 @@ OmiXai is a command line automated explainable AI tool that easily enable resear
 * prediction on new data using the best model
 * packaged as a Docker container
 
-## Important note
-
-This tool is for IBM internal use ONLY.
-
 ## Citation
 
-For general IBM internal use of the tool please cite this article:
+For citation of this tool, please reference this article:
 
 * Carrieri, A.P., Haiminen, N., Maudsley-Barton, S. et al. Explainable AI reveals changes in skin microbiome composition linked to phenotypic differences. Sci Rep 11, 4565 (2021). <https://doi.org/10.1038/s41598-021-83922-6>
 
@@ -49,49 +45,49 @@ For general IBM internal use of the tool please cite this article:
   * installation: `https://github.com/git-guides/install-git`
 * Python 3.9 (only required if the user is planning on contributing to the development of the tool)
 
-## How to install OmiXai
+## How to install AutoXAI4Omics
 
- 1. Clone this repo however you choose (cli command: `git clone --single-branch --branch main git@github.ibm.com:BiomedSciAI-Innersource/OmiXai.git`)
+ 1. Clone this repo however you choose (cli command: `git clone --single-branch --branch main git@github.ibm.com:BiomedSciAI-Innersource/AutoXAI4Omics.git`)
  2. Make sure `docker` is running (cli command: `docker version`, if installed the version information will be given)
- 3. Within the `OmiXai` folder:
+ 3. Within the `AutoXAI4Omics` folder:
        1. Run the following cli command to build the image: `./build.sh -r`
        2. Manually create a new folder called `experiments`
 
 NOTE: if training is run by mistake without first creating the `experiments` directory, and the directory is created while training, the directory needs to be removed and then created again before running training (has to do with access permissions)
 
 ## User manual
 
-Everything is controlled through a config dictionary, examples of which can be found in the `configs/exmaples` folder. For an explanation of all parameters, please see the [***CONFIG MANUAL***](https://github.ibm.com/BiomedSciAI-Innersource/OmiXai/blob/main/configs/CONFIG_MANUAL.md).
+Everything is controlled through a config dictionary, examples of which can be found in the `configs/examples` folder. For an explanation of all parameters, please see the [***CONFIG MANUAL***](https://github.ibm.com/BiomedSciAI-Innersource/AutoXAI4Omics/blob/main/configs/CONFIG_MANUAL.md).
 
-The tool is launched in the cli using `omixai.sh` which has multiple flags, examples will be given below:
+The tool is launched in the cli using `autoxai4omics.sh` which has multiple flags, examples will be given below:
 
-* `-m` this specifies what mode you want to run OmiXai in the options are:
+* `-m` this specifies what mode you want to run AutoXAI4Omics in the options are:
   * `feature` - Run feature selection on a input data set
   * `train` - Tune and train various machine learning models, generate plots and results
   * `test` - To test and evaluate the tuned and trained machine learning models on a completely different holdout dataset
   * `predict` - Use trained models to predict on unseen data
   * `plotting` - If the models have been tuned and trained (and therefore saved), the plots and results can be generated in isolation
   * `bash` - Use to open up a bash shell into the tool
-* `-c` this is the filename of the config json within the `OmiXai/configs` folder that is going to be given to OmiXai
+* `-c` this is the filename of the config json within the `AutoXAI4Omics/configs` folder that is going to be given to AutoXAI4Omics
 * `-r` this sets the contain to run as root. Only possibly required if you are running in `bash` mode
 * `-d` this detatches the cli running the container in the background
-* `-g` this specifies if you want OmiXai to use the gpus that are available on the machine (UNDER TESTING)
+* `-g` this specifies if you want AutoXAI4Omics to use the gpus that are available on the machine (UNDER TESTING)
 
-Data to be used by OmiXai needs to be stored in the `OmiXai/data` folder.
+Data to be used by AutoXAI4Omics needs to be stored in the `AutoXAI4Omics/data` folder.
 
 ### Examples
 
-* Run OmiXai in training mode with a config called `my_fun_config.json` within the `configs` folder:
-  * `./omixai.sh -m train -c my_fun_config.json`
+* Run AutoXAI4Omics in training mode with a config called `my_fun_config.json` within the `configs` folder:
+  * `./autoxai4omics.sh -m train -c my_fun_config.json`
 
 * We have provided and example config and dataset that you can run to get going. The components are:
   * config: `configs/examples/50k_barley_SHAP.json`
   * data: `data/geno_row_type_BRIDGE_50k_w.hetero.csv`
   * metadata: `data/row_type_BRIDGE_pheno_50k_metadata_w.hetero.csv`
-  * cli command to run: `./omixai.sh -m train -c examples/50k_barley_SHAP.json`
+  * cli command to run: `./autoxai4omics.sh -m train -c examples/50k_barley_SHAP.json`
 
-* If you wish to run a bash shell within the OmiXai image then you can do it using the following. In addition if you wish to be logged in as root add the `-r` flag:
-  * `./omixai.sh -m bash -r`
+* If you wish to run a bash shell within the AutoXAI4Omics image then you can do it using the following. In addition if you wish to be logged in as root add the `-r` flag:
+  * `./autoxai4omics.sh -m bash -r`
 
-* **UNDER TESTING** If you wish to utilise any gpus that are available on your machine during your OmiXai run then you can add the `-g` flag:
-  * `./omixai.sh -m train -c my_fun_config.json -g`
+* **UNDER DEVELOPMENT** If you wish to utilise any gpus that are available on your machine during your AutoXAI4Omics run then you can add the `-g` flag:
+  * `./autoxai4omics.sh -m train -c my_fun_config.json -g`
diff --git a/autoxai4omics.sh b/autoxai4omics.sh
diff --git a/cicd_scripts/build_docker_images.sh b/cicd_scripts/build_docker_images.sh
@@ -16,7 +16,7 @@
 set -x # switch on
 # set +x # switch off
 
-echo "This is the script to build & push OmiXai Docker Images"
+echo "This is the script to build & push AutoXAI4Omics Docker Images"
 
 echo "IBM Cloud Region: $IBM_CLOUD_REGION"
 echo "Container Registry Region: $REGISTRY_REGION"
diff --git a/common.sh b/common.sh
@@ -14,6 +14,6 @@
 # limitations under the License.
 
 
-IMAGE_NAME=omixai
+IMAGE_NAME=autoxai4omics
 IMAGE_TAG=1.0.0
 IMAGE_FULL=${IMAGE_NAME}:${IMAGE_TAG}
diff --git a/pytest.ini b/pytest.ini
@@ -15,7 +15,7 @@
 [pytest]
 markers = 
     synthetic : test uses synthetic data 
-    container : test is run using against the built omixai container
+    container : test is run using against the built autoxai4omics container
     modes : test is running a main mode
     training : test is related to training 
     holdout : test is related to testing the test holdout
diff --git a/src/structure.md b/src/structure.md
@@ -18,5 +18,5 @@
 
 Here are some note on the structure that we have take for the source code:
 
-- The files at this level are for the modes that OmiXai can run. Any other code should be allocated down to lower subfolders.
+- The files at this level are for the modes that AutoXAI4Omics can run. Any other code should be allocated down to lower subfolders.
 - Exception is the `logging.yml` which is for configuring the logger of the tool
diff --git a/src/utils/utils.py b/src/utils/utils.py
@@ -170,7 +170,7 @@ def setup_logger(experiment_folder):
 
     lg_file["handlers"]["file"]["filename"] = str(
         experiment_folder
-        / f"OmiXaiLog_{str(int(datetime.timestamp(datetime.utcnow())))}.log"
+        / f"AutoXAI4OmicsLog_{str(int(datetime.timestamp(datetime.utcnow())))}.log"
     )
     logging.config.dictConfig(lg_file)
     omicLogger = logging.getLogger("OmicLogger")
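
Review note: the renamed log file still derives its suffix from `datetime.timestamp(datetime.utcnow())`. `utcnow()` returns a naive datetime, and `timestamp()` interprets naive values as local time, so on a host whose timezone is not UTC the number is shifted by the UTC offset. `utcnow()` is also deprecated as of Python 3.12. Inside a container running in UTC the result is the same, so this is optional. A minimal sketch of a timezone-aware alternative follows; the function name `log_file_path` and its `now` parameter are hypothetical and not part of this commit.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def log_file_path(experiment_folder: Path, now: Optional[datetime] = None) -> Path:
    """Build the log file path from a timezone-aware UTC timestamp.

    Hypothetical helper for illustration; `now` exists only to make testing easy.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # timestamp() on an aware datetime gives true seconds since the Unix epoch
    return experiment_folder / f"AutoXAI4OmicsLog_{int(now.timestamp())}.log"


# Example with a fixed instant so the result is reproducible
fixed = datetime(2024, 1, 17, 15, 56, 57, tzinfo=timezone.utc)
print(log_file_path(Path("experiments/run1"), fixed))
```

The file name prefix matches the one introduced in this hunk; only the way the timestamp is computed differs.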
diff --git a/tests/test_modes.py b/tests/test_modes.py
diff --git a/whitesource.config b/whitesource.config
