Commit 74d22dd

Merge branch 'gen-2230-refactor_update_table' of github.com:Sage-Bionetworks/Genie into gen-2230-refactor_update_table
merge upstream changes
2 parents 8eeca3c + 7b9fee9 commit 74d22dd

6 files changed: +355 -19 lines changed

.devcontainer/devcontainer.json

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 {
-  "image": "sagebionetworks/genie:latest",
+  "image": "ghcr.io/sage-bionetworks/genie:develop",
   "mounts": [
     "source=${localEnv:HOME}/.synapseConfig,target=/root/.synapseConfig,type=bind,consistency=cached"
   ]

.github/workflows/README.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+# GitHub Workflows
+
+## ci.yml
+
+This is our main CI/CD workflow.
+
+### Workflow Diagram
+
+The following diagram shows the steps in our `ci.yml` GitHub workflow.
+
+```mermaid
+---
+title: GitHub Actions Build Workflow (with Conditions)
+config:
+  theme: "default"
+---
+
+flowchart TD
+    A([Trigger Event])
+    subgraph Triggers
+        direction TB
+        PR["Pull Request with label:<br/>'run_integration_tests'<br/>and open state"]
+        PUSH["Push to main<br/>or develop"]
+        REL["Release event"]
+    end
+
+    A --> PR
+    A --> PUSH
+    A --> REL
+
+    subgraph Jobs Order
+        direction TB
+        DET["determine-changes"]
+        TEST["test"]
+        LINT["lint"]
+        BUILD["build-container"]
+        INTEG["integration-tests"]
+        DEPLOY["deploy"]
+    end
+
+    %% determine-changes
+    PR -. "Runs if PR has label\nand is open" .-> DET
+    PUSH -. "Runs if push to main/develop" .-> DET
+    REL -. "Runs on release" .-> DET
+
+    %% test
+    DET -- "Always needed (runs in parallel)" --> TEST
+
+    %% lint
+    DET -- "Always needed (runs in parallel)" --> LINT
+
+    %% build-container
+    TEST -.-> BUILD
+    LINT -.-> BUILD
+    BUILD -. "Always, except PR events with incomplete jobs" .-> BUILD
+
+    %% integration-tests
+    BUILD -.-> INTEG
+    LINT -.-> INTEG
+    TEST -.-> INTEG
+    DET -. "Needs changes outside tests/" .-> INTEG
+
+    %% deploy job only for release
+    BUILD -.-> DEPLOY
+    LINT -.-> DEPLOY
+    TEST -.-> DEPLOY
+    REL -. "Only for release event" .-> DEPLOY
+
+    %% Job notes
+    INTEG:::cond_note
+    DEPLOY:::cond_note
+
+    classDef cond_note fill:#fffae1,stroke:#aaa,stroke-width:1px,color:#b06b00;
+    class INTEG,DEPLOY cond_note;
+
+    %% Legend
+    subgraph Legend [Legend]
+        direction LR
+        note1(( )):::cond_note
+        note2["Job only runs for special condition"]
+    end
+```
+
+### High Level Overview
+
+This workflow installs Python dependencies and runs the CI/CD jobs with a single version of Python.
+
+For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+
+The `lint`, `test`, `build-container`, `determine-changes` and `integration-tests` jobs are only triggered under the following conditions:
+
+- WHEN the PR has the GitHub label `run_integration_tests` AND commits are pushed while the pull request is open
+- WHEN it's a push into the main or develop branches
+- WHEN it's a release event
+
+#### integration-tests
+
+- In addition, the `integration-tests` job only runs when the `determine-changes` job finds non-unit-test changes (changes outside `tests/`)
+
+#### deploy
+
+- The `deploy` job only runs when there is a release event
+
+
+## automate_truststore.yml
+
+This workflow updates the Genome Nexus truststore file on a schedule and then runs the mutation processing AND consortium release steps of the pipeline to make sure the pipeline works with the new truststore.
+
+The truststore is used in our pipeline to run the Genome Nexus annotator on MAF data.
+
+For more background on the truststore and troubleshooting, please see:
+[Updating Genome Nexus Annotator and Dependencies](https://sagebionetworks.jira.com/wiki/spaces/APGD/pages/3016687662/Updating+Genome+Nexus+Annotator+and+Dependencies#Updating-the-trust-ssl-file)
+
+
+## build_genome_nexus_annotator.yml
+
+This workflow updates and builds the Genome Nexus `annotator.jar` file from a
+user-supplied parameter: `commit_hash`. This `commit_hash` value identifies the specific commit to check out from the [genome-nexus-annotation-pipeline](https://github.com/genome-nexus/genome-nexus-annotation-pipeline) repository.
+
+This workflow runs **on demand**, [triggered via a pull request that either modifies the commit hash or updates something else in the workflow](https://github.com/Sage-Bionetworks/Genie/actions/workflows/build_genome_nexus_annotator.yml).
+
+After a successful build and upload to the [Genome Nexus Testing folder on Synapse](https://www.synapse.org/Synapse:syn70781006), it runs the mutation processing step of the pipeline to annotate the test data. The remaining manual step is to check the annotated test data and confirm the results are as expected.
+
+For more background on the `annotator.jar`, validating the annotated results after the build, and troubleshooting, please see:
+[Updating Genome Nexus Annotator and Dependencies](https://sagebionetworks.jira.com/wiki/spaces/APGD/pages/3016687662/Updating+Genome+Nexus+Annotator+and+Dependencies#Updating-the-annotator.jar)
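
Reviewer note: the trigger rules listed under `ci.yml` in the README above can be read as a small predicate. The Python sketch below only illustrates that logic as the README describes it; it is not how the workflow is implemented. The `Event` class and its fields are hypothetical names introduced here, and the real conditions live in `ci.yml` expressions that this commit does not show.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified view of a GitHub event (not the real payload)."""

    kind: str                            # "pull_request", "push", or "release"
    branch: str = ""                     # branch pushed to, for push events
    labels: tuple = ()                   # labels attached to the pull request
    pr_open: bool = False                # whether the pull request is open
    changed_outside_tests: bool = False  # assumed output of determine-changes


def core_jobs_run(event: Event) -> bool:
    """Whether lint, test, build-container and determine-changes run."""
    if event.kind == "pull_request":
        return event.pr_open and "run_integration_tests" in event.labels
    if event.kind == "push":
        return event.branch in ("main", "develop")
    return event.kind == "release"


def integration_tests_run(event: Event) -> bool:
    """integration-tests additionally needs changes outside tests/."""
    return core_jobs_run(event) and event.changed_outside_tests


def deploy_runs(event: Event) -> bool:
    """deploy is described as running only for release events."""
    return core_jobs_run(event) and event.kind == "release"


labeled_pr = Event("pull_request", labels=("run_integration_tests",), pr_open=True)
print(core_jobs_run(labeled_pr), integration_tests_run(labeled_pr), deploy_runs(labeled_pr))
```

Modelling `integration-tests` and `deploy` as extra conditions on top of the shared trigger check mirrors how the README words them ("in addition", "only runs when").
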

.github/workflows/build_genome_nexus_annotator.yml

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
+name: Build and Upload Genome Nexus Annotator
+
+on:
+  push:
+    paths:
+      - '.github/workflows/build_genome_nexus_annotator.yml'
+
+env:
+  # versions for dependencies used in annotator.jar build
+  MAVEN_VERSION: '3.9.9'
+  COMMIT_HASH: "2e67ebd08cf7c26bf1f55f2baf4b73ac36531119"
+  JAVA_VERSION: "21"
+  JAVA_DISTRIBUTION: "corretto"
+  # synapse id of the Genome Nexus Testing folder
+  OUTPUT_FOLDER_SYNAPSE_ID: syn70781006
+  # env vars related to testing the annotator.jar
+  PROD_DOCKER: ghcr.io/sage-bionetworks/genie:main
+  SYNAPSE_AUTH_TOKEN: ${{ secrets.SYNAPSE_AUTH_TOKEN }}
+  TEST_PROJECT_SYNID: syn7208886
+jobs:
+  build-annotator:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout workflow repo
+        uses: actions/checkout@v4
+
+      - name: Set up Java
+        uses: actions/setup-java@v4
+        with:
+          java-version: ${{ env.JAVA_VERSION }}
+          distribution: ${{ env.JAVA_DISTRIBUTION }}
+
+      - name: Install Maven
+        run: |
+          wget https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.zip
+          unzip apache-maven-${MAVEN_VERSION}-bin.zip
+          echo "$PWD/apache-maven-${MAVEN_VERSION}/bin" >> $GITHUB_PATH
+
+      - name: Install Python & Synapse client
+        run: |
+          python3 -m pip install --upgrade pip
+          pip install synapseclient chardet
+
+      - name: Clone Genome Nexus Annotation Pipeline and check out commit hash
+        run: |
+          git clone https://github.com/genome-nexus/genome-nexus-annotation-pipeline.git
+          cd genome-nexus-annotation-pipeline
+          git checkout ${{ env.COMMIT_HASH }}
+
+      - name: Configure application.properties and log4j.properties to be GENIE-specific
+        run: |
+          cd genome-nexus-annotation-pipeline
+          CONFIG_PATH="annotationPipeline/src/main/resources"
+          mkdir -p "$CONFIG_PATH"
+
+          # Copy example configs first
+          cp "${CONFIG_PATH}/application.properties.EXAMPLE" "${CONFIG_PATH}/application.properties"
+          cp "${CONFIG_PATH}/log4j.properties.EXAMPLE" "${CONFIG_PATH}/log4j.properties"
+
+          # Modify application.properties lines in place
+          sed -i \
+            -e 's|^spring.batch.job.enabled=.*|spring.batch.job.enabled=false|' \
+            -e 's|^spring.jmx.enabled=.*|spring.jmx.enabled=false|' \
+            -e 's|^chunk=.*|chunk=100|' \
+            -e 's|^genomenexus.enrichment_fields=.*|genomenexus.enrichment_fields=annotation_summary,sift,polyphen,my_variant_info|' \
+            -e 's|^genomenexus.isoform_query_parameter=.*|genomenexus.isoform_query_parameter=isoformOverrideSource|' \
+            -e 's|^genomenexus.base=.*|genomenexus.base=https://genie.genomenexus.org/|' \
+            "${CONFIG_PATH}/application.properties"
+
+          # Modify log4j.properties line
+          sed -i \
+            -e 's|^log4j.appender.a.File.*|log4j.appender.a.File=/tmp/genomenexus-logfile.log|' \
+            "${CONFIG_PATH}/log4j.properties"
+
+          echo "[INFO] Application and log configs updated from EXAMPLE files."
+
+      - name: Build annotator.jar (skips tests)
+        run: |
+          cd genome-nexus-annotation-pipeline
+          mvn clean install -DskipTests
+          find . -name annotator.jar
+
+      - name: Upload annotator.jar, application.properties and log4j.properties to Synapse
+        run: |
+          cd genome-nexus-annotation-pipeline
+          # Locate the built JAR dynamically
+          FILE_PATH=$(ls -t annotationPipeline/target/annotationPipeline-*.jar | head -n 1)
+
+          if [ ! -f "$FILE_PATH" ]; then
+            echo "Could not find annotationPipeline JAR under annotationPipeline/target/"
+            echo "Available .jar files:"
+            find annotationPipeline/target -type f -name "*.jar" || true
+            exit 1
+          fi
+
+          echo "Found JAR: $FILE_PATH"
+
+          # Rename to annotator.jar
+          TARGET_PATH="annotationPipeline/target/annotator.jar"
+          cp "$FILE_PATH" "$TARGET_PATH"
+
+          echo "Renamed to: $TARGET_PATH"
+          ls -lh "$TARGET_PATH"
+
+          echo "Uploading to Synapse entity ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}..."
+
+          synapse login
+
+          # Upload annotator.jar file!
+          synapse store "$TARGET_PATH" --parentid ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}
+
+          # Upload application and log4j files
+          CONFIG_PATH="annotationPipeline/src/main/resources"
+          synapse store "${CONFIG_PATH}/application.properties" --parentid ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}
+          synapse store "${CONFIG_PATH}/log4j.properties" --parentid ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}
+
+  check-annotator-build:
+    needs: build-annotator
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Pull Public Docker Image from GHCR
+        run: |
+          docker pull ${{ env.PROD_DOCKER }}
+
+      - name: Start Docker Container
+        run: |
+          docker run -d --name genie-container \
+            -e SYNAPSE_AUTH_TOKEN="${{ env.SYNAPSE_AUTH_TOKEN }}" \
+            ${{ env.PROD_DOCKER }} \
+            sh -c "while true; do sleep 1; done"
+
+      - name: Run processing on mutation data in test pipeline
+        run: |
+          docker exec genie-container \
+            python3 /root/Genie/bin/input_to_database.py mutation \
+              --project_id ${{ env.TEST_PROJECT_SYNID }} \
+              --genie_annotation_pkg /root/annotation-tools \
+              --createNewMafDatabase
+
+      - name: Stop and Remove Docker Container
+        run: docker stop genie-container && docker rm genie-container
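
Reviewer note: the "Configure application.properties" step above rewrites existing `key=value` lines in place with `sed`. For anyone who wants to reason about or unit-test that behaviour outside the workflow, here is a minimal Python sketch of the same idea. It is an assumption-laden illustration, not part of this commit: `override_properties` is a hypothetical helper name, and unlike the `sed` expressions it escapes the dots in each key so they match literally.

```python
import re


def override_properties(text: str, overrides: dict) -> str:
    """Replace the value of each existing `key=...` line; other lines are untouched.

    Like the sed expressions in the workflow, a key that is not already
    present in the text is NOT appended.
    """
    for key, value in overrides.items():
        pattern = re.compile(rf"^{re.escape(key)}=.*$", flags=re.MULTILINE)
        # A function replacement avoids backslash handling in the value.
        text = pattern.sub(lambda _match, k=key, v=value: f"{k}={v}", text)
    return text


example = "spring.batch.job.enabled=true\nchunk=10\nother=keep\n"
updated = override_properties(
    example,
    {"spring.batch.job.enabled": "false", "chunk": "100"},
)
print(updated)
```

Because absent keys are silently skipped, both this sketch and the `sed` step depend on the `.EXAMPLE` files already containing every property being overridden.
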

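Reviewer note: the upload step locates the built JAR with `ls -t ... | head -n 1`, i.e. the most recently modified file matching `annotationPipeline-*.jar`. A rough Python equivalent is sketched below, assuming only the standard library; `newest_match` is a hypothetical helper name and is not used anywhere in the repository.

```python
import glob
import os
from typing import Optional


def newest_match(pattern: str) -> Optional[str]:
    """Return the most recently modified path matching `pattern`, or None."""
    matches = glob.glob(pattern)
    if not matches:
        return None
    # max() by modification time picks the same file that `ls -t | head -n 1` lists first.
    return max(matches, key=os.path.getmtime)


jar = newest_match("annotationPipeline/target/annotationPipeline-*.jar")
print(jar if jar else "Could not find annotationPipeline JAR")
```

Returning `None` for "no match" makes the missing-JAR case explicit, which the shell version handles with the separate `[ ! -f "$FILE_PATH" ]` check.
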