@tanishq-chugh
Contributor

… UDFs

What changes were proposed in this pull request?

Extend the AES UDFs to support the Counter (CTR) and Galois/Counter Mode (GCM) cipher transformations, which provide stronger security because they use an initialization vector (IV).
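
To illustrate what the IV adds, here is a minimal sketch using the standard JCE API. It is not the code from this patch: the class name `GcmIvDemo`, the 12-byte IV, the 128-bit tag, and the choice to prepend the IV to the ciphertext are all assumptions made for the example.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

// Hypothetical helper, not part of the patch.
class GcmIvDemo {
    static final int IV_LEN = 12;    // assumed IV length in bytes
    static final int TAG_BITS = 128; // assumed authentication tag length

    // Encrypts with a fresh random IV and returns IV || ciphertext || tag.
    static byte[] encrypt(byte[] key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plaintext);
        byte[] out = new byte[IV_LEN + ct.length];
        System.arraycopy(iv, 0, out, 0, IV_LEN);
        System.arraycopy(ct, 0, out, IV_LEN, ct.length);
        return out;
    }

    // Reads the IV from the first IV_LEN bytes, then decrypts and verifies the tag.
    static byte[] decrypt(byte[] key, byte[] input) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, input, 0, IV_LEN));
        return cipher.doFinal(input, IV_LEN, input.length - IV_LEN);
    }
}
```

Because the IV is random per call, encrypting the same plaintext twice with the same key yields different outputs, while both still decrypt to the original bytes.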

Why are the changes needed?

Currently, the AES UDFs support only one cipher transformation, AES/ECB/PKCS5Padding, which is inherently weak because it produces the same ciphertext for identical blocks of plaintext.
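
The weakness can be reproduced with a few lines of plain JCE code. This is an illustrative sketch only; the class name `EcbPatternDemo` and the hard-coded demo key are assumptions, not anything taken from Hive.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the patch.
class EcbPatternDemo {
    // Encrypts with AES/ECB/PKCS5Padding, the transformation the UDFs use today.
    static byte[] encryptEcb(byte[] key, byte[] plaintext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        return cipher.doFinal(plaintext);
    }

    // True when the first two 16-byte ciphertext blocks are identical.
    static boolean firstTwoBlocksEqual(byte[] ciphertext) {
        return Arrays.equals(Arrays.copyOfRange(ciphertext, 0, 16),
                             Arrays.copyOfRange(ciphertext, 16, 32));
    }

    public static void main(String[] args) throws Exception {
        // Demo key only; never hard-code real keys.
        byte[] key = "1234567890123456".getBytes(StandardCharsets.US_ASCII);
        // Two identical 16-byte plaintext blocks.
        byte[] plaintext = "ABCDEFGHIJKLMNOPABCDEFGHIJKLMNOP".getBytes(StandardCharsets.US_ASCII);
        System.out.println(firstTwoBlocksEqual(encryptEcb(key, plaintext))); // prints "true"
    }
}
```

The repeated plaintext block shows up as a repeated ciphertext block, which leaks the structure of the data even though the content stays encrypted.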

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually Tested

Member

@ayushtkn ayushtkn left a comment

A fundamental question: if I create a table using this UDF with, say, X as the session value of HIVE_UDF_AES_CIPHER_TRANSFORMATION, and then read it in another session where the value is Y, does the data get corrupted?

  • Once the data is written, how do I figure out what the config value was at write time?
  • If I wrote one column with X as the config value and another existing column was written with Y, how do I read both columns in one query?

@tanishq-chugh
Contributor Author

tanishq-chugh commented Jan 6, 2026

Thanks for checking this, @ayushtkn. The concern is absolutely valid. As of now, there is no way to determine the value once the write is done, and using different values at write time and read time will lead to data corruption.

I am considering two options moving ahead.

  1. Modify the UDF definition to accept up to 3 arguments instead of relying on a session-level config.
    The 3rd argument would be optional and carry the transformation value, defaulting to the current transformation. This is needed for backward compatibility.
    Determining the transformation value at read time would still be the user's responsibility (query history could help here).

  2. In addition to modifying the UDF definition, detecting the mode at decryption time could be automated if the only options are GCM and the default legacy mode (AES/ECB/PKCS5Padding): decryption would fall back to the legacy mode when it runs into AEADBadTagException.
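
Both options can be sketched in plain Java, outside the UDF framework. Everything named here is hypothetical: the class `AesFallbackSketch`, the method names, and the IV-prefixed GCM layout are assumptions for illustration, not the patch's actual design.

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

// Hypothetical sketch, not part of the patch.
class AesFallbackSketch {
    static final String LEGACY = "AES/ECB/PKCS5Padding";
    static final String GCM = "AES/GCM/NoPadding";
    static final int IV_LEN = 12;    // assumed IV length in bytes
    static final int TAG_BITS = 128; // assumed tag length

    // Option 1: the two-argument form keeps the legacy default.
    static byte[] encrypt(byte[] plaintext, byte[] key) throws Exception {
        return encrypt(plaintext, key, LEGACY);
    }

    // Option 1: the optional third argument selects the transformation.
    static byte[] encrypt(byte[] plaintext, byte[] key, String transformation) throws Exception {
        SecretKeySpec k = new SecretKeySpec(key, "AES");
        if (LEGACY.equals(transformation)) {
            Cipher cipher = Cipher.getInstance(LEGACY);
            cipher.init(Cipher.ENCRYPT_MODE, k);
            return cipher.doFinal(plaintext);
        }
        if (!GCM.equals(transformation)) {
            throw new IllegalArgumentException("Unsupported: " + transformation);
        }
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance(GCM);
        cipher.init(Cipher.ENCRYPT_MODE, k, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plaintext);
        byte[] out = new byte[IV_LEN + ct.length]; // assumed layout: IV || ciphertext || tag
        System.arraycopy(iv, 0, out, 0, IV_LEN);
        System.arraycopy(ct, 0, out, IV_LEN, ct.length);
        return out;
    }

    static byte[] decryptGcm(byte[] input, byte[] key) throws Exception {
        Cipher cipher = Cipher.getInstance(GCM);
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, input, 0, IV_LEN));
        return cipher.doFinal(input, IV_LEN, input.length - IV_LEN);
    }

    static byte[] decryptLegacy(byte[] input, byte[] key) throws Exception {
        Cipher cipher = Cipher.getInstance(LEGACY);
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"));
        return cipher.doFinal(input);
    }

    // Option 2: try GCM first and fall back to the legacy mode on a tag failure.
    static byte[] decryptAuto(byte[] input, byte[] key) throws Exception {
        if (input.length >= IV_LEN + TAG_BITS / 8) { // too short to be GCM output otherwise
            try {
                return decryptGcm(input, key);
            } catch (AEADBadTagException e) {
                // Not valid GCM output for this key; assume legacy data.
            }
        }
        return decryptLegacy(input, key);
    }
}
```

One caveat with this kind of fallback: a wrong key also fails GCM tag verification, so the legacy attempt then runs on data that was never ECB-encrypted and will most likely fail with a padding error rather than a clear "wrong key" message.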

Let me know your thoughts on this.
