feat(types): Support Unknown Type for v3 table spec #605
base: main
Conversation
        
          
schema.go (Outdated)

    func (s *Schema) init() {
        // Validate unknown type requirements
        if err := s.validateUnknownTypes(); err != nil {
            panic(fmt.Sprintf("Invalid schema: %v", err))
        }
Let's make this return an error instead of panicking.
Thanks for the review. I think returning an error from init() will have a cascading effect on many places in the codebase. Let me look for another place to validate the unknown types; otherwise, I'm happy to remove the unknown-type validation altogether :D. WDYT?
We have some schema validations in checkSchemaCompatibility, maybe you can fit it in there.
Eventually, we should bite the bullet and make the schema constructor fallible. While it's going to cascade and cause a lot of breakage, it'll give us a much safer type where we know that if it exists, it's in a valid configuration.
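For illustration, a minimal sketch of what a fallible constructor could look like. `Field`, `Schema`, and `NewSchemaChecked` here are simplified, hypothetical stand-ins and not the library's actual types or API; the only rule checked is the one mentioned in this thread (an unknown-typed field cannot be required).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Field and Schema are simplified stand-ins for this sketch,
// not the library's real types.
type Field struct {
	ID       int
	Name     string
	Type     string // "unknown" marks the v3 unknown type in this sketch
	Required bool
}

type Schema struct {
	fields []Field
}

// NewSchemaChecked is a hypothetical fallible constructor: validation runs
// before the schema is returned, so a non-nil *Schema is always valid.
func NewSchemaChecked(fields ...Field) (*Schema, error) {
	s := &Schema{fields: fields}
	if err := s.validateUnknownTypes(); err != nil {
		return nil, fmt.Errorf("invalid schema: %w", err)
	}
	return s, nil
}

// validateUnknownTypes reports every unknown-typed field that is marked
// required, instead of stopping (or panicking) at the first one.
func (s *Schema) validateUnknownTypes() error {
	var msgs []string
	for _, f := range s.fields {
		if f.Type == "unknown" && f.Required {
			msgs = append(msgs, fmt.Sprintf("field %q (id %d): unknown type must be optional", f.Name, f.ID))
		}
	}
	if len(msgs) == 0 {
		return nil
	}
	return errors.New(strings.Join(msgs, "; "))
}

func main() {
	// A required unknown-typed field is rejected with an error, not a panic.
	_, err := NewSchemaChecked(
		Field{ID: 1, Name: "id", Type: "long", Required: true},
		Field{ID: 2, Name: "top", Type: "unknown", Required: true},
	)
	fmt.Println(err)
}
```

The cost is the cascade mentioned above: every call site has to handle the returned error. The benefit is that the caller decides what to do with an invalid schema.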
@zeroshade Not sure if my understanding of the spec is correct, but do we support UnknownType as a nested type, or only as a top-level column type? 🤔 Could you advise me on this?
    
@dttung2905, I'll dig a bit into the Java code to see how they're handling that. Without having done that, it makes perfect sense to me to allow Unknown types within nested types. A nested type may mix a bunch of known types with a single unsupported one, and allowing unknown there makes it possible to read it.
    
@dttung2905, it's supported in nested types:
   @Test
  public void testUnknownSupport() {
    // this needs a different schema because it cannot be used in required fields
    Schema schemaWithUnknown =
        new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.optional(2, "top", Types.UnknownType.get()),
            Types.NestedField.optional(
                3, "arr", Types.ListType.ofOptional(4, Types.UnknownType.get())),
            Types.NestedField.required(
                5,
                "struct",
                Types.StructType.of(
                    Types.NestedField.optional(6, "inner_op", Types.UnknownType.get()),
                    Types.NestedField.optional(
                        7,
                        "inner_map",
                        Types.MapType.ofOptional(
                            8, 9, Types.StringType.get(), Types.UnknownType.get())),
                    Types.NestedField.optional(
                        10,
                        "struct_arr",
                        Types.StructType.of(
                            Types.NestedField.optional(11, "deep", Types.UnknownType.get()))))));
    assertThatThrownBy(() -> Schema.checkCompatibility(schemaWithUnknown, 2))
        .isInstanceOf(IllegalStateException.class)
        .hasMessage(
            "Invalid schema for v%s:\n"
                + "- Invalid type for top: %s is not supported until v%s\n"
                + "- Invalid type for arr.element: %s is not supported until v%s\n"
                + "- Invalid type for struct.inner_op: %s is not supported until v%s\n"
                + "- Invalid type for struct.inner_map.value: %s is not supported until v%s\n"
                + "- Invalid type for struct.struct_arr.deep: %s is not supported until v%s",
            2,
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN));
    assertThatCode(() -> Schema.checkCompatibility(schemaWithUnknown, 3))
        .doesNotThrowAnyException();
  }
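A rough Go sketch of the same check, to show how nested unknown types could be located by walking the type tree. All type names here (`Type`, `Unknown`, `List`, `Map`, `Struct`, `Field`) and the functions `unknownPaths` and `checkCompatibility` are hypothetical stand-ins for this example, not the Go library's API; the path naming (`arr.element`, `inner_map.value`) simply mirrors the messages in the Java test above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Type is a simplified stand-in for a schema type tree.
type Type interface{}

type (
	Primitive struct{ Name string }
	Unknown   struct{}
	List      struct{ Element Type }
	Map       struct{ Key, Value Type }
	Struct    struct{ Fields []Field }
	Field     struct {
		Name string
		Type Type
	}
)

// minUnknownVersion is the first format version that allows unknown (v3).
const minUnknownVersion = 3

func join(prefix, name string) string {
	if prefix == "" {
		return name
	}
	return prefix + "." + name
}

// collect walks the tree depth-first and records the dotted path of every
// unknown type, whether top-level or nested in a list, map, or struct.
func collect(path string, t Type, out *[]string) {
	switch v := t.(type) {
	case Unknown:
		*out = append(*out, path)
	case List:
		collect(join(path, "element"), v.Element, out)
	case Map:
		collect(join(path, "key"), v.Key, out)
		collect(join(path, "value"), v.Value, out)
	case Struct:
		for _, f := range v.Fields {
			collect(join(path, f.Name), f.Type, out)
		}
	}
}

// unknownPaths returns the paths of all unknown types under root.
func unknownPaths(root Struct) []string {
	var paths []string
	collect("", root, &paths)
	return paths
}

// checkCompatibility returns an error listing every unknown type when the
// format version is older than v3, and nil otherwise.
func checkCompatibility(root Struct, formatVersion int) error {
	paths := unknownPaths(root)
	if formatVersion >= minUnknownVersion || len(paths) == 0 {
		return nil
	}
	var b strings.Builder
	fmt.Fprintf(&b, "invalid schema for v%d:", formatVersion)
	for _, p := range paths {
		fmt.Fprintf(&b, "\n- invalid type for %s: unknown is not supported until v%d", p, minUnknownVersion)
	}
	return errors.New(b.String())
}

func main() {
	root := Struct{Fields: []Field{
		{Name: "id", Type: Primitive{Name: "long"}},
		{Name: "arr", Type: List{Element: Unknown{}}},
	}}
	fmt.Println(checkCompatibility(root, 2))
	fmt.Println(checkCompatibility(root, 3))
}
```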
    
Thanks @twuebi for digging and confirming that.
    
Signed-off-by: dttung2905 <[email protected]>
4f00452 to 4914ad9
  