Skip to content

[ZEPPELIN-6204] do not allow init params in JDBC URLs for H2#4949

Merged
tbonelee merged 14 commits into
apache:masterfrom
pjfanning:ZEPPELIN-6204-jdbc
Nov 11, 2025
Merged

[ZEPPELIN-6204] do not allow init params in JDBC URLs for H2#4949
tbonelee merged 14 commits into
apache:masterfrom
pjfanning:ZEPPELIN-6204-jdbc

Conversation

@pjfanning

Copy link
Copy Markdown
Member

What is this PR for?

ZEPPELIN-6204

Slight tidy of the existing disallow list for strings in JDBC urls so that they are checked against just the query params and not the hostname in the URL.

What type of PR is it?

Bug Fix

Todos

  • - Task

What is the Jira issue?

How should this be tested?

  • Strongly recommended: add automated unit tests for any new or changed behavior
  • Outline any manual steps to test the PR here.

Screenshots (if appropriate)

Questions:

  • Does the license files need to update? no
  • Is there breaking changes for older versions? no
  • Does this needs documentation? no

@Reamer

Reamer commented Jun 26, 2025

Copy link
Copy Markdown
Contributor

I don't know whether we are preventing features that others want to use with these changes.

Doesn't it make more sense to secure the H2 server according to the requirements? After all, Zeppelin is just one JDBC client of many.

Reamer
Reamer previously approved these changes Jul 1, 2025
@pjfanning pjfanning marked this pull request as draft July 2, 2025 11:34
@pjfanning pjfanning marked this pull request as ready for review July 8, 2025 10:06
@pjfanning

Copy link
Copy Markdown
Member Author

@Reamer @jongyoul is it ok to consider this for merge now? There are probably a few more query params that could be added to the disallow list but this covers some of the more commonly mentioned ones.

@tbonelee

Copy link
Copy Markdown
Contributor

Just a small thought. I was wondering if decoding the full URL before parsing could lead to unexpected behavior in some edge cases. For example, if a parameter value contains encoded characters like %26 (&) or %3D (=), decoding them too early might interfere with how the parameters are parsed.

I’m not sure if values like passwords are actually allowed or commonly used in this context, so this might not be relevant. But I thought it was worth mentioning just in case. 🙏

Comment thread jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java
Comment thread jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java Outdated
Comment on lines +159 to +174
private static final String ALLOW_LOAD_LOCAL = "allowLoadLocal";

private static final String ALLOW_LOAD_LOCAL_IN_FILE_NAME = "allowLoadLocalInfile";

private static final String AUTO_DESERIALIZE = "autoDeserialize";
private static final String ALLOW_LOAD_LOCAL_IN_FILE_IN_PATH = "allowLoadLocalInfileInPath";

private static final String ALLOW_LOCAL_IN_FILE_NAME = "allowLocalInfile";

private static final String ALLOW_URL_IN_LOCAL_IN_FILE_NAME = "allowUrlInLocalInfile";

private static final String AUTO_DESERIALIZE = "autoDeserialize";

private static final String SOCKET_FACTORY = "socketFactory";

private static final String INIT = "INIT";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be nice to group these into a Set.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to stick with the existing code style.

}

// Split the URL into the base part and the parameters part
String[] parts = url.split("[?&;]");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be a bit faster if the regex was cached to avoid rebuilding every time

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the cost of this regex will be tiny compared to the I/O cost of using JDBC commands so I'm not very interested in optimising every line of code here

Comment thread jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java
@pjfanning

Copy link
Copy Markdown
Member Author

Just a small thought. I was wondering if decoding the full URL before parsing could lead to unexpected behavior in some edge cases. For example, if a parameter value contains encoded characters like %26 (&) or %3D (=), decoding them too early might interfere with how the parameters are parsed.

I’m not sure if values like passwords are actually allowed or commonly used in this context, so this might not be relevant. But I thought it was worth mentioning just in case. 🙏

The code in this PR ignores the param values. It only worries about the param names.
The original URL as entered by the user is passed to the JDBC layer - my code splits up the URL but only for my checks for suspicious param names.

@tbonelee

Copy link
Copy Markdown
Contributor

@pjfanning Thanks for the clarification. I took a closer look and you're right. Decoding encoded special characters in values before validation would at most introduce an extra key, but it does not interfere with the parsing of subsequent parameters. I had initially worried about false negatives (i.e. cases where disallowed parameters might go undetected), but that doesn't seem to be the case. Thanks again, and sorry for the confusion.

@pjfanning pjfanning force-pushed the ZEPPELIN-6204-jdbc branch from fd38a07 to fbcec66 Compare August 3, 2025 14:15
@pjfanning

Copy link
Copy Markdown
Member Author

@jongyoul @Reamer would you be able to review this change?

@pjfanning

Copy link
Copy Markdown
Member Author

@jongyoul @Reamer would you be able to review this change? I think this fixes a couple of issues in the existing implementation.

@Reamer Reamer left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pjfanning

Copy link
Copy Markdown
Member Author

@jongyoul do you think this could be merged?

@jongyoul jongyoul left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pjfanning

Copy link
Copy Markdown
Member Author

@jongyoul @Reamer I am not a committer so can't merge this. Would either of you be in a position to do so?

@tbonelee tbonelee merged commit 4882e07 into apache:master Nov 11, 2025
16 of 18 checks passed
@tbonelee

Copy link
Copy Markdown
Contributor

I’ve merged this into master.
Since I considered it a breaking change, I didn’t include it in branch-0.12.
Do you think it needs to be backported?

@pjfanning

Copy link
Copy Markdown
Member Author

Thanks @tbonelee - I would regard it more as a bug fix or fixes. It shouldn't change the behaviour for any use cases that we want to support.

@pjfanning pjfanning deleted the ZEPPELIN-6204-jdbc branch November 11, 2025 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants