@pratapaditya04
Contributor

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

In Avro 1.10 and above, GenericRecord.get(String fieldName) strictly validates the field name against the record schema and throws AvroRuntimeException if the field does not exist.
The existing Gobblin code in AvroUtils.getFieldHelper and getFieldValue accesses fields directly without verifying their presence, which causes runtime failures such as:

org.apache.avro.AvroRuntimeException: Not a valid schema field: EOF

This change adds a schema field existence check before calling record.get(fieldName), ensuring compatibility across Avro 1.9 and 1.10+. When a field is missing, the method now skips it or returns Optional.absent() instead of throwing.

Key Changes:

  • Added a record.getSchema().getField(fieldName) null check before field access.
  • Invalid or missing fields are now handled gracefully with a debug log and a safe return.
  • Keeps backward compatibility with Avro 1.9 and prevents runtime exceptions in Avro 1.10+.
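
The existence check described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: StrictRecord is a hypothetical stand-in for an Avro GenericRecord (its hasField method plays the role of record.getSchema().getField(fieldName) != null, and its get method throws on unknown names the way Avro 1.10+ is described to). It also uses java.util.Optional rather than the Optional.absent() type mentioned in the description, so that the sketch runs without any dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class SafeFieldSketch {

  /** Hypothetical stand-in for a record with strict, Avro 1.10+ style lookup. */
  static final class StrictRecord {
    private final Map<String, Object> values = new LinkedHashMap<>();

    StrictRecord put(String name, Object value) {
      values.put(name, value);
      return this;
    }

    /** Plays the role of record.getSchema().getField(name) != null. */
    boolean hasField(String name) {
      return values.containsKey(name);
    }

    /** Throws on unknown field names, mimicking the strict behavior. */
    Object get(String name) {
      if (!values.containsKey(name)) {
        throw new IllegalArgumentException("Not a valid schema field: " + name);
      }
      return values.get(name);
    }
  }

  /**
   * Checks that the field exists before reading it, so a missing field
   * yields an empty Optional instead of an exception. A declared field
   * whose value is null also yields an empty Optional.
   */
  static Optional<Object> getSafeField(StrictRecord record, String fieldName) {
    if (record == null || fieldName == null || !record.hasField(fieldName)) {
      return Optional.empty();
    }
    return Optional.ofNullable(record.get(fieldName));
  }

  public static void main(String[] args) {
    StrictRecord record = new StrictRecord().put("name", "gobblin");
    System.out.println(getSafeField(record, "name").isPresent()); // prints true
    System.out.println(getSafeField(record, "EOF").isPresent());  // prints false
  }
}
```

Checking the schema first, rather than catching the exception, keeps the behavior identical whether or not the underlying get call throws for unknown names.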

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Added unit tests

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

@Blazer-007 requested a review from Copilot on October 31, 2025, 15:00

Copilot AI left a comment

Pull Request Overview

This PR improves the robustness of the AvroUtils.getFieldHelper method by adding safe field access for GenericRecord objects and includes comprehensive test coverage for the getMultiFieldValue method.

  • Added a new getSafeField helper method that safely retrieves fields from GenericRecords without throwing exceptions
  • Refactored getFieldHelper to use explicit instanceof GenericRecord checks and the new safe field access method
  • Added tests covering existing field access, missing field handling, and nested field access scenarios
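
The three scenarios listed above (existing field, missing field, nested field) can be illustrated with a small dotted-path lookup. This is only a sketch under stated assumptions: nested java.util.Map instances stand in for nested GenericRecords, the instanceof Map test stands in for the instanceof GenericRecord check, and NestedFieldSketch and getNested are hypothetical names, not Gobblin APIs.

```java
import java.util.Map;
import java.util.Optional;

final class NestedFieldSketch {

  /**
   * Walks a dot-separated path such as "address.city" one segment at a time.
   * Returns an empty Optional as soon as a segment is missing or the current
   * value is not a record-like object, instead of throwing.
   */
  static Optional<Object> getNested(Map<String, Object> record, String path) {
    Object current = record;
    for (String part : path.split("\\.")) {
      // Stand-in for the explicit "instanceof GenericRecord" check.
      if (!(current instanceof Map)) {
        return Optional.empty();
      }
      Map<?, ?> rec = (Map<?, ?>) current;
      // Stand-in for the schema field existence check.
      if (!rec.containsKey(part)) {
        return Optional.empty();
      }
      current = rec.get(part);
    }
    return Optional.ofNullable(current);
  }
}
```

With a record holding an "address" sub-record that contains "city", looking up "address.city" returns the city value, while "address.zip" returns an empty Optional.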

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

  • gobblin-utility/src/main/java/org/apache/gobblin/util/AvroUtils.java: Introduced getSafeField method for null-safe field access and refactored getFieldHelper to handle missing fields gracefully
  • gobblin-utility/src/test/java/org/apache/gobblin/util/AvroUtilsTest.java: Added three new test cases for getMultiFieldValue method and fixed trailing whitespace

@Blazer-007 merged commit c12e321 into apache:master on Nov 3, 2025
6 checks passed
2 participants