Commit 221eae5

[GEN-2332] Add workflow to build genome nexus annotator (#618)
* initial code to build annotator
* fix dep
* make all build dep as env vars, update docs
* fix clone path
* remove missing param, add latest commit hash
* find jar file name dynamically
* rename to annotator.jar, adjust output id
* fix spacing, change env var hierarchy
* update spacing
* only run when you modify this file
* add default value
* add back commit hash
* remove workflow_dispatch
* only modify lines in config files, save to Genome Nexus Testing folder
* update docs
1 parent 5bd6e00 commit 221eae5

2 files changed: +160 -2 lines changed


.github/workflows/README.md

Lines changed: 15 additions & 2 deletions
@@ -104,9 +104,22 @@ Only triggers the `lint`, `tests`, `build-container`, `determine-changes` and `i

 ## automate_truststore.yml

-This workflow will update the Genome nexus truststor file on a schedule and then run the mutation processing AND consortium release steps of the pipeline to make sure the pipeline is workfing with the new truststore.
+This workflow will update the Genome nexus truststore file on a schedule and then run the mutation processing AND consortium release steps of the pipeline to make sure the pipeline is working with the new truststore.

 The truststore used in our pipeline to run the genome nexus annotator on MAF data.

 For more background on the truststore and troubleshooting, please see:
-[Updating Genome Nexus Annotator and Dependencies](https://sagebionetworks.jira.com/wiki/spaces/APGD/pages/3016687662/Updating+Genome+Nexus+Annotator+and+Dependencies#Updating-the-trust-ssl-file)
+[Updating Genome Nexus Annotator and Dependencies](https://sagebionetworks.jira.com/wiki/spaces/APGD/pages/3016687662/Updating+Genome+Nexus+Annotator+and+Dependencies#Updating-the-trust-ssl-file)
+
+
+## build_genome_nexus_annotator.yml
+
+This workflow updates and builds the Genome Nexus `annotator.jar` file from the commit pinned in the
+`COMMIT_HASH` variable of the workflow file. That hash identifies a specific commit in the [genome-nexus-annotation-pipeline](https://github.com/genome-nexus/genome-nexus-annotation-pipeline) repository.
+
+This workflow runs **on demand**: [it is triggered when a pushed change, typically in a pull request, modifies the workflow file, for example to update the commit hash](https://github.com/Sage-Bionetworks/Genie/actions/workflows/build_genome_nexus_annotator.yml).
+
+After a successful build and upload to the [Genome Nexus Testing folder on Synapse](https://www.synapse.org/Synapse:syn70781006), the workflow runs the mutation processing step of the pipeline to annotate the test data. The remaining manual step is to check the annotated test data and confirm that the results are as expected.
+
+For more background on the `annotator.jar`, validating the annotated results after the build, and troubleshooting, please see:
+[Updating Genome Nexus Annotator and Dependencies](https://sagebionetworks.jira.com/wiki/spaces/APGD/pages/3016687662/Updating+Genome+Nexus+Annotator+and+Dependencies#Updating-the-annotator.jar)
.github/workflows/build_genome_nexus_annotator.yml

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
+name: Build and Upload Genome Nexus Annotator
+
+on:
+  push:
+    paths:
+      - '.github/workflows/build_genome_nexus_annotator.yml'
+
+env:
+  # versions for dependencies used in annotator.jar build
+  MAVEN_VERSION: '3.9.9'
+  COMMIT_HASH: "2e67ebd08cf7c26bf1f55f2baf4b73ac36531119"
+  JAVA_VERSION: "21"
+  JAVA_DISTRIBUTION: "corretto"
+  # synapse id of the Genome Nexus Testing folder
+  OUTPUT_FOLDER_SYNAPSE_ID: syn70781006
+  # env vars related to testing the annotator.jar
+  PROD_DOCKER: ghcr.io/sage-bionetworks/genie:main
+  SYNAPSE_AUTH_TOKEN: ${{ secrets.SYNAPSE_AUTH_TOKEN }}
+  TEST_PROJECT_SYNID: syn7208886
+jobs:
+  build-annotator:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout workflow repo
+        uses: actions/checkout@v4
+
+      - name: Set up Java
+        uses: actions/setup-java@v4
+        with:
+          java-version: ${{ env.JAVA_VERSION }}
+          distribution: ${{ env.JAVA_DISTRIBUTION }}
+
+      - name: Install Maven
+        run: |
+          wget https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.zip
+          unzip apache-maven-${MAVEN_VERSION}-bin.zip
+          echo "$PWD/apache-maven-${MAVEN_VERSION}/bin" >> $GITHUB_PATH
+
+      - name: Install Python & Synapse client
+        run: |
+          python3 -m pip install --upgrade pip
+          pip install synapseclient chardet
+
+      - name: Clone Genome Nexus Annotation Pipeline and check out commit hash
+        run: |
+          git clone https://github.com/genome-nexus/genome-nexus-annotation-pipeline.git
+          cd genome-nexus-annotation-pipeline
+          git checkout ${{ env.COMMIT_HASH }}
+
+      - name: Configure application.properties and log4j.properties to be GENIE-specific
+        run: |
+          cd genome-nexus-annotation-pipeline
+          CONFIG_PATH="annotationPipeline/src/main/resources"
+          mkdir -p "$CONFIG_PATH"
+
+          # Copy example configs first
+          cp "${CONFIG_PATH}/application.properties.EXAMPLE" "${CONFIG_PATH}/application.properties"
+          cp "${CONFIG_PATH}/log4j.properties.EXAMPLE" "${CONFIG_PATH}/log4j.properties"
+
+          # Modify application.properties lines in place
+          sed -i \
+            -e 's|^spring.batch.job.enabled=.*|spring.batch.job.enabled=false|' \
+            -e 's|^spring.jmx.enabled=.*|spring.jmx.enabled=false|' \
+            -e 's|^chunk=.*|chunk=100|' \
+            -e 's|^genomenexus.enrichment_fields=.*|genomenexus.enrichment_fields=annotation_summary,sift,polyphen,my_variant_info|' \
+            -e 's|^genomenexus.isoform_query_parameter=.*|genomenexus.isoform_query_parameter=isoformOverrideSource|' \
+            -e 's|^genomenexus.base=.*|genomenexus.base=https://genie.genomenexus.org/|' \
+            "${CONFIG_PATH}/application.properties"
+
+          # Modify log4j.properties line
+          sed -i \
+            -e 's|^log4j.appender.a.File.*|log4j.appender.a.File=/tmp/genomenexus-logfile.log|' \
+            "${CONFIG_PATH}/log4j.properties"
+
+          echo "[INFO] Application and log configs updated from EXAMPLE files."
+
+      - name: Build annotator.jar (skips tests)
+        run: |
+          cd genome-nexus-annotation-pipeline
+          mvn clean install -DskipTests
+          find . -name annotator.jar
+
+      - name: Upload annotator.jar, application.properties and log4j.properties to Synapse
+        run: |
+          cd genome-nexus-annotation-pipeline
+          # Locate the built JAR dynamically
+          FILE_PATH=$(ls -t annotationPipeline/target/annotationPipeline-*.jar | head -n 1)
+
+          if [ ! -f "$FILE_PATH" ]; then
+            echo "Could not find annotationPipeline JAR under annotationPipeline/target/"
+            echo "Available .jar files:"
+            find annotationPipeline/target -type f -name "*.jar" || true
+            exit 1
+          fi
+
+          echo "Found JAR: $FILE_PATH"
+
+          # Rename to annotator.jar
+          TARGET_PATH="annotationPipeline/target/annotator.jar"
+          cp "$FILE_PATH" "$TARGET_PATH"
+
+          echo "Renamed to: $TARGET_PATH"
+          ls -lh "$TARGET_PATH"
+
+          echo "Uploading to Synapse entity ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}..."
+
+          synapse login
+
+          # Upload annotator.jar file!
+          synapse store "$TARGET_PATH" --parentid ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}
+
+          # Upload application and log4j files
+          CONFIG_PATH="annotationPipeline/src/main/resources"
+          synapse store "${CONFIG_PATH}/application.properties" --parentid ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}
+          synapse store "${CONFIG_PATH}/log4j.properties" --parentid ${{ env.OUTPUT_FOLDER_SYNAPSE_ID }}
+
+  check-annotator-build:
+    needs: build-annotator
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Pull Public Docker Image from GHCR
+        run: |
+          docker pull ${{ env.PROD_DOCKER }}
+
+      - name: Start Docker Container
+        run: |
+          docker run -d --name genie-container \
+            -e SYNAPSE_AUTH_TOKEN="${{ env.SYNAPSE_AUTH_TOKEN }}" \
+            ${{ env.PROD_DOCKER }} \
+            sh -c "while true; do sleep 1; done"
+
+      - name: Run processing on mutation data in test pipeline
+        run: |
+          docker exec genie-container \
+            python3 /root/Genie/bin/input_to_database.py mutation \
+            --project_id ${{ env.TEST_PROJECT_SYNID }} \
+            --genie_annotation_pkg /root/annotation-tools \
+            --createNewMafDatabase
+
+      - name: Stop and Remove Docker Container
+        run: docker stop genie-container && docker rm genie-container
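
A note on the "Configure application.properties and log4j.properties" step: `sed` exits with status 0 even when a substitution pattern matches no line, so a key that is renamed or removed in the upstream EXAMPLE file would be left unchanged without any error. The sketch below shows one possible stricter variant. It assumes GNU `sed` (as on `ubuntu-latest`) and values that contain no `|`, `&`, or backslash; `set_property` is a hypothetical helper name and is not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): set key=value in a properties
# file, and fail loudly when the key is not present in the file.
set_property() {
  local file="$1" key="$2" value="$3"
  local key_re

  # Escape regex metacharacters in the key, e.g. the dots in spring.jmx.enabled.
  key_re=$(printf '%s' "$key" | sed 's/[][\.*^$]/\\&/g')

  # sed would succeed silently on a missing key, so check for it first.
  if ! grep -q "^${key_re}=" "$file"; then
    echo "[ERROR] ${key} not found in ${file}" >&2
    return 1
  fi

  # Same substitution shape as the workflow: replace the whole line in place.
  # Assumes the value contains no '|', '&', or backslash.
  sed -i "s|^${key_re}=.*|${key}=${value}|" "$file"
}

# Small demonstration on a throwaway file.
demo_props=$(mktemp)
printf 'chunk=10\nspring.jmx.enabled=true\n' > "$demo_props"
set_property "$demo_props" chunk 100
set_property "$demo_props" no.such.key 1 || echo "missing key was reported"
rm -f "$demo_props"
```

In the workflow, each `-e` expression could then become a call such as `set_property "${CONFIG_PATH}/application.properties" chunk 100`, at the cost of one `sed` invocation per key.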

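
The upload step selects the newest JAR with `ls -t annotationPipeline/target/annotationPipeline-*.jar | head -n 1`, which works for the file names Maven produces here. A glob loop is a common alternative that avoids parsing `ls` output and returns a non-zero status when nothing matches. This is only a sketch: `newest_jar` is a hypothetical helper name, not something this commit defines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): print the most recently modified
# file matching <dir>/<prefix>-*.jar, or return 1 if there is none.
newest_jar() {
  local dir="$1" prefix="$2"
  local newest="" f

  for f in "$dir"/"$prefix"-*.jar; do
    # When the glob matches nothing it stays literal, so skip non-files.
    [ -f "$f" ] || continue
    # -nt is true when $f has a newer modification time than $newest.
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done

  if [ -z "$newest" ]; then
    return 1
  fi
  printf '%s\n' "$newest"
}
```

A call such as `FILE_PATH=$(newest_jar annotationPipeline/target annotationPipeline) || exit 1` would then take the place of the `ls -t` pipeline; the existing `[ ! -f "$FILE_PATH" ]` check could stay as a second guard.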