2 changes: 1 addition & 1 deletion Dockerfile
@@ -14,7 +14,7 @@
# limitations under the License.

FROM apache/hadoop-runner
Contributor Author

Hello @jojochuang and @ayushtkn. Do you know what we need to do to get an updated apache/hadoop-runner base image with an upgrade to Java 17, so that we can bundle Hadoop 3.5.0?

Member

The hadoop-runner image uses Ubuntu 22.04 as its base, which is inconsistent with the Hadoop release dev containers: Hadoop 3.5.x uses Ubuntu 24.04 and Hadoop 3.4.x uses Ubuntu 20.04. Should we align them?

Contributor Author

@cnauroth, Apr 6, 2026

I'm thinking what we need to do is:

  1. Release a new apache/hadoop-runner:jdk17-u2404 that aligns the OS with the release container like Cheng suggested.
  2. Additionally, the base container needs another change to set ownership of /opt to the hadoop user so that it has permission to install the Hadoop bits. This line in apache/hadoop-runner:latest is missing from apache/hadoop-runner:jdk17-u2204. Actually these 2 images are quite different. I wonder if they should be more aligned.
  3. Create a new docker-hadoop-3.5 branch that uses the new apache/hadoop-runner:jdk17-u2404 for the base image. This way, we'll be able to independently release images for new 3.4 and 3.5 releases.

I'll proceed with this approach in a little while unless others have feedback.
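
For illustration, the Dockerfile on a docker-hadoop-3.5 branch might end up looking roughly like the sketch below. It assumes the proposed apache/hadoop-runner:jdk17-u2404 tag has been published (it has not been yet), and the `chown` line in the comment is only a guess at how the /opt ownership fix could be expressed in the base image. The remaining lines are taken from the Dockerfile in this PR.

```dockerfile
# Sketch of a Dockerfile for the proposed docker-hadoop-3.5 branch.
# ASSUMPTION: jdk17-u2404 is the proposed, not-yet-released base image tag.
FROM apache/hadoop-runner:jdk17-u2404

# ASSUMPTION: the base image would already contain something like
#   RUN chown hadoop /opt
# (run as root) so that the hadoop user can install into /opt.

ARG HADOOP_URL=https://dlcdn.apache.org/hadoop/common/hadoop-3.5.0/hadoop-3.5.0.tar.gz
WORKDIR /opt
# Download and unpack Hadoop, then drop the bundled docs to keep the image small.
RUN sudo rm -rf /opt/hadoop && curl -LSs -o hadoop.tar.gz $HADOOP_URL && tar zxf hadoop.tar.gz && rm hadoop.tar.gz && mv hadoop* hadoop && rm -rf /opt/hadoop/share/doc
WORKDIR /opt/hadoop
```

Keeping the base image tag and HADOOP_URL on a separate branch is what would let 3.4 and 3.5 images be released independently.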

Member

Should it be apache/hadoop-runner:jdk17-u2404 (not u2204)?

Contributor Author

Yes, thanks for the correction. The intent is to align on Ubuntu 24.04. I updated the comment.

-ARG HADOOP_URL=https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
+ARG HADOOP_URL=https://dlcdn.apache.org/hadoop/common/hadoop-3.5.0/hadoop-3.5.0.tar.gz
WORKDIR /opt
RUN sudo rm -rf /opt/hadoop && curl -LSs -o hadoop.tar.gz $HADOOP_URL && tar zxf hadoop.tar.gz && rm hadoop.tar.gz && mv hadoop* hadoop && rm -rf /opt/hadoop/share/doc
WORKDIR /opt/hadoop