-
Notifications
You must be signed in to change notification settings - Fork 16
feat: Modify the Block Node build for dynamic plugins #2011
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Mustafa Uzun <[email protected]>
Signed-off-by: Mustafa Uzun <[email protected]>
Signed-off-by: Mustafa Uzun <[email protected]>
Signed-off-by: Mustafa Uzun <[email protected]>
ata-nas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great addition. I leave some comments, mostly questions to verify.
| environment: | ||
| PLUGINS: "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: by default, for local development, will we include all plugins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have 3 possibilities for local development:
- To use it with core plugins only and if we want to add additional ones
- To use it with all plugins configured through gradle by using the image we use for e2e tests which contains all plugins
- To use it with all plugins and just list all of the plugins in docker-compose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main question is what will happen, by default, when I execute ./gradlew startDockerContainer? Will I get a block node with all plugins? It should, in my mind, be so, in order to have the absolute easiest time while developing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The base image, the one that is published, should have no plugins at all; zero (0); not even "core" plugins.
It makes sense, however, to have the startDockerContainer task depend on a testing-only task (e.g. buildTestImage) that uses compose to create a temporary image that includes everything.
This implies that the list of plugins in this YAML file should not be empty (assuming this list is what gets loaded into the lib location by the compose task).
| done | ||
| fi | ||
|
|
||
| exec "${BN_WORKDIR}/app-${VERSION}/bin/app" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exec "${BN_WORKDIR}/app-${VERSION}/bin/app" should probably be outside of the script, as this starts the app itself. In general, I think also the name of the script is not very fitting. I wonder, should we have the download-plugins script, which only downloads the plugins and have the starting of the app separate? Just thinking out loud.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can move the execution part from the script to Dockerfile as it was initially. And the Dockerfile entrypoint could be
ENTRYPOINT ["/bin/bash", "-c", "${BN_WORKDIR}/download-and-load-plugins.sh && ${BN_WORKDIR}/app-${VERSION}/bin/app || sleep 30d"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you, I could go either way. Just thinking that it would probably be a better option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docker image should not, in my opinion, include the plugin loading.
Plugins should be read on startup by the application (not a script), based on configuration, from a well defined bind mount/volume mount location.
The user of the image (i.e. operator) is then responsible for downloading and installing plugins to the defined location.
| if [ -n "${PLUGINS:-}" ]; then | ||
| echo "$PLUGINS" | tr ',' '\n' | while read -r plugin; do | ||
| curl -L -o "${BN_WORKDIR}/app-${VERSION}/lib/${plugin}.jar" \ | ||
| "https://repo1.maven.org/maven2/org/hiero/block-node/${plugin}/0.25.0-rc1/${plugin}-0.25.0-rc1.jar" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"https://repo1.maven.org/maven2/org/hiero/block-node/${plugin}/0.25.0-rc1/${plugin}-0.25.0-rc1.jar"
These versions should be dynamic. Also, for readability, we could interpolate the string into a variable and then use that variable in the curl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will change 0.25.0-rc1 to ${VERSION} when versions align as mentioned in the comment. The reason for hardcoding now is because we don't have plugins published for 0.26 which is the current VERSION in the repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted, we could link this to a GH issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will only ever align on release branches, for main we have a catch-22 problem, we need to think this over, one proposal is to use the jars from the build instead. another is to see where are the SNAPSHOT if they exist at all in Maven and use the different address or method of getting them.
| runtimeOnly("org.hiero.block.node.messaging") | ||
| runtimeOnly("org.hiero.block.node.health") | ||
| runtimeOnly("org.hiero.block.node.stream.publisher") | ||
| runtimeOnly("org.hiero.block.node.stream.subscriber") | ||
| runtimeOnly("org.hiero.block.node.verification") | ||
| runtimeOnly("org.hiero.block.node.blocks.files.historic") | ||
| runtimeOnly("org.hiero.block.node.blocks.files.recent") | ||
| runtimeOnly("org.hiero.block.node.access.service") | ||
| runtimeOnly("org.hiero.block.node.server.status") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the remaining plugins in the mainModuleInfo considered "core"? I would think that server status and access service are probably optional? Really the only "core" plugins are messaging and health, as the app itself is an HealthFacility and messaging must be loaded in order to startup the app. Personally, I think if a plugin's absence does not in any way hinder the ability to startup the app, it should probably be considered "optional". That, however, is indeed only technically speaking. I do see an argument saying that health and access could be considered "core" design wise. Not sure about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with both of the ideas. We can discuss it with the team in this thread and decide which plugins should be considered core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My opinion:
No plugins should be considered "core". All plugins (including messaging, health, status, etc...) should be loaded at runtime from a folder that is filled in by the deployment process (not the build process).
We made these plugins so we could replace them if desired, but continuing to include them in the base (i.e. "barebones") image prevents that replacement.
The base image really shouldn't have anything other than the app module and non-pluggable modules (e.g. helidon, config, metrics).
| @@ -0,0 +1,11 @@ | |||
| #!/bin/bash | |||
|
|
|||
| # Change 0.25.0-rc1 to ${VERSION} when versions align | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment suggests that a change needs to be made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will change 0.25.0-rc1 to ${VERSION} when versions align as mentioned in the comment. The reason for hardcoding now is because we don't have plugins published for 0.26 which is the current VERSION in the repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than hardcode, perhaps accept the version as a script parameter and pass that in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we do a release, we do the publish of the OCI images and plugins in parallel, after this change, we need to publish OCI image only if the plugins were sucessfully published to maven, so subsequent tasks don't fail, that will require changes in https://github.com/hiero-ledger/hiero-block-node/blob/main/.github/workflows/release-push-image.yaml to make the release consistent and that chart test won't fail:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For local build using snapshot versions, we should get them from build rather than maven, that or maybe maven someplace is storing the -SNAPSHOT versions, and overwriting it with every publish (which happens on each commit to main)
see the latest as example: https://github.com/hiero-ledger/hiero-block-node/actions/runs/20973106476
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: What is the user story behind "minimal"? I do not see the backfill plugin, which is the only difference with lfh. But do we really ned the historic plugin for a "minimal" install? Answering the user story question will bring more clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minimal I understand as the minimum resource-demanding set of plugins and functional block node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, Minimal is about resources, not plugins.
All current plugins are needed for minimal as it's a configuration for development and testing.
Codecov Report✅ All modified and coverable lines are covered by tests. @@ Coverage Diff @@
## main #2011 +/- ##
============================================
- Coverage 78.99% 78.86% -0.14%
+ Complexity 1241 1234 -7
============================================
Files 130 130
Lines 5952 5952
Branches 646 646
============================================
- Hits 4702 4694 -8
- Misses 955 960 +5
- Partials 295 298 +3 see 8 files with indirect coverage changes 🚀 New features to boost your workflow:
|
| # Install Java | ||
| COPY --from=java-builder ${JAVA_HOME}/ ${JAVA_HOME}/ | ||
| # COPY bash script to container | ||
| COPY download-and-load-plugins.sh ${BN_WORKDIR}/download-and-load-plugins.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this belongs in the application container.
Downloading and loading plugins should happen exclusively outside of the main application container.
|
|
||
| # RUN the bin script for starting the server | ||
| ENTRYPOINT ["/bin/bash", "-c", "${BN_WORKDIR}/app-${VERSION}/bin/app || sleep 30d"] | ||
| ENTRYPOINT ["/bin/bash", "-c", "${BN_WORKDIR}/download-and-load-plugins.sh || sleep 30d"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems off. The Application OCI image should launch the app, not a download process.
The intent of the base image setup is to have an image with no plugins at all. Then the deployment process (helm chart or otherwise) loads the plugins to an externally mounded volume (e.g. bind-mount or similar) that corresponds to the application lib folder inside the image.
| @@ -0,0 +1,3 @@ | |||
| blockNode: | |||
| config: | |||
| PLUGINS: "stream-publisher,stream-subscriber,verification,blocks-file-historic,blocks-file-recent,backfill,s3-archive" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing: messaging, status, health, block-access-service.
| @@ -0,0 +1,3 @@ | |||
| blockNode: | |||
| config: | |||
| PLUGINS: "stream-publisher,stream-subscriber,verification,blocks-file-historic,blocks-file-recent,backfill,s3-archive" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing: messaging, status, health, block-access-service.
Remove: blocks-file-historic, stream-subscriber
Open question: Do we require blocks-file-recent, or can we remove that as well?
| # key: value | ||
| JAVA_TOOL_OPTIONS: "-Djava.util.logging.config.file=/opt/hiero/block-node/logs/config/logging.properties" | ||
| JAVA_OPTS: '-Xms16G -Xmx16G -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath="/tmp/dump.hprof"' | ||
| PLUGINS: "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing: messaging, status, health, block-access-service.
| @@ -0,0 +1,3 @@ | |||
| blockNode: | |||
| config: | |||
| PLUGINS: "stream-publisher,stream-subscriber,verification,blocks-file-historic,blocks-file-recent" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing: messaging, status, health, block-access-service.
| @@ -0,0 +1,3 @@ | |||
| blockNode: | |||
| config: | |||
| PLUGINS: "stream-publisher,stream-subscriber,verification,blocks-file-historic,blocks-file-recent,backfill" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing: messaging, status, health, block-access-service.
| mainModuleInfo { | ||
| runtimeOnly("org.hiero.block.node.archive.s3cloud") | ||
| runtimeOnly("org.hiero.block.node.stream.publisher") | ||
| runtimeOnly("org.hiero.block.node.stream.subscriber") | ||
| runtimeOnly("org.hiero.block.node.verification") | ||
| runtimeOnly("org.hiero.block.node.blocks.files.historic") | ||
| runtimeOnly("org.hiero.block.node.blocks.files.recent") | ||
| runtimeOnly("org.hiero.block.node.backfill") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this truly required?
The module runtime requirements should be dynamically discovered by Java based on the jars present in the lib folder (otherwise how would we add new plugins without changing the build process?)...
AlfredoG87
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just a few nits, a couple suggestions, and a blocking issue: the version thing.
👍
| @@ -0,0 +1,11 @@ | |||
| #!/bin/bash | |||
|
|
|||
| # Change 0.25.0-rc1 to ${VERSION} when versions align | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For local build using snapshot versions, we should get them from build rather than maven, that or maybe maven someplace is storing the -SNAPSHOT versions, and overwriting it with every publish (which happens on each commit to main)
see the latest as example: https://github.com/hiero-ledger/hiero-block-node/actions/runs/20973106476
| @@ -0,0 +1,11 @@ | |||
| #!/bin/bash | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing set -e for fail-fast behavior. If curl fails to download a plugin, the script continues and starts the app with missing dependencies.
#!/bin/bash
| #!/bin/bash | |
| #!/bin/bash | |
| set -e |
| if [ -n "${PLUGINS:-}" ]; then | ||
| echo "$PLUGINS" | tr ',' '\n' | while read -r plugin; do | ||
| curl -L -o "${BN_WORKDIR}/app-${VERSION}/lib/${plugin}.jar" \ | ||
| "https://repo1.maven.org/maven2/org/hiero/block-node/${plugin}/0.25.0-rc1/${plugin}-0.25.0-rc1.jar" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will only ever align on release branches, for main we have a catch-22 problem, we need to think this over, one proposal is to use the jars from the build instead. another is to see where are the SNAPSHOT if they exist at all in Maven and use the different address or method of getting them.
| echo "$PLUGINS" | tr ',' '\n' | while read -r plugin; do | ||
| curl -L -o "${BN_WORKDIR}/app-${VERSION}/lib/${plugin}.jar" \ | ||
| "https://repo1.maven.org/maven2/org/hiero/block-node/${plugin}/0.25.0-rc1/${plugin}-0.25.0-rc1.jar" | ||
| done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No verification of downloaded artifacts. Issue #1442 mentions "SHA-512 hash verification" as a potential requirement. At minimum, consider checking curl exit status and file existence
| # Change 0.25.0-rc1 to ${VERSION} when versions align | ||
| if [ -n "${PLUGINS:-}" ]; then | ||
| echo "$PLUGINS" | tr ',' '\n' | while read -r plugin; do | ||
| curl -L -o "${BN_WORKDIR}/app-${VERSION}/lib/${plugin}.jar" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using -fL instead of just -L would make curl fail on HTTP errors (4xx/5xx) rather than saving error pages as jar files.
|
|
||
| # Install Java | ||
| COPY --from=java-builder ${JAVA_HOME}/ ${JAVA_HOME}/ | ||
| # COPY bash script to container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The comment "COPY bash script to container" is redundant—the COPY command is self-explanatory
| @@ -0,0 +1,3 @@ | |||
| blockNode: | |||
| config: | |||
| PLUGINS: "stream-publisher,stream-subscriber,verification,blocks-file-historic,blocks-file-recent" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] Indentation uses 8 spaces for the PLUGINS key, but looking at other values-overrides files like mid-size.yaml, they use 4 spaces. Should be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same applies to lfh.yaml, rfh.yaml, and all-plugins.yaml
Reviewer Notes
Related Issue(s)
Fixes #1657 #1442