
[#5202] feat(client-python): Support Column and its default value part2 - type serializer #6903


Open
wants to merge 18 commits into
base: main

Conversation

tsungchih
Contributor

What changes were proposed in this pull request?

This is the second of four planned parts that port the following classes from Java to support Column and its default value:

  • JsonUtils
  • TypeSerializer

The TypeSerializer will be used in the upcoming ColumnDTO implementation to serialize the data_type field.
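As a rough illustration of how a type serializer could plug into a DTO, here is a minimal, stdlib-only sketch. `IntegerType`, `serialize_type`, the `ColumnDTO` fields, and the JSON key names are all hypothetical stand-ins and are not taken from this PR. With dataclasses_json, a function like this would typically be attached to the field through `config(encoder=...)`; the sketch avoids that dependency so it runs on its own.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class IntegerType:
    """Hypothetical stand-in for a Gravitino type; it is not JSON-aware itself."""

    def simple_string(self) -> str:
        return "integer"


def serialize_type(data_type: Any) -> str:
    """Hypothetical type serializer: emit the type's simple string."""
    return data_type.simple_string()


@dataclass
class ColumnDTO:
    """Hypothetical DTO shape; field and key names are illustrative only."""

    name: str
    data_type: Any
    nullable: bool = True

    def to_dict(self) -> Dict[str, Any]:
        # The DTO delegates the data_type field to the external serializer.
        return {
            "name": self.name,
            "type": serialize_type(self.data_type),
            "nullable": self.nullable,
        }


column = ColumnDTO(name="id", data_type=IntegerType(), nullable=False)
payload = json.dumps(column.to_dict(), sort_keys=True)
```

The point of the design is that the type class stays untouched: only the DTO knows that a serializer exists.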

Why are the changes needed?

We need to support Column and its default value in the Python client.

#5202

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests

for implementing various customized DataClassJson serializer/deserializer

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
aimed at facilitating customized DataClassJson serializer/deserializer

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add write data type in SerdesUtils

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add TypeSerializer

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add unit tests for TypesSerializer

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add write struct type in SerdesUtils

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add unit tests for serializing struct type

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
def test_list_type_of_primitive_and_none_types_serdes(self):
    for simple_string, type_ in self._primitive_and_none_types.items():
        list_type = Types.ListType.of(element_type=type_, element_nullable=False)
        serialized_result = TypesSerdes.serialize(list_type)
        self.assertEqual(serialized_result.get(SerdesUtils.TYPE), SerdesUtils.LIST)
        self.assertEqual(
            serialized_result.get(SerdesUtils.LIST_ELEMENT_TYPE), simple_string
        )
        self.assertEqual(
            serialized_result.get(SerdesUtils.LIST_ELEMENT_NULLABLE), False
        )

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add unit tests for serializing list type

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add write map type in SerdesUtils

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add unit tests for serializing map type

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add write union type in SerdesUtils

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add unit tests for serializing union type

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add write external type in SerdesUtils

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add unit tests for serializing external type

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add write unparsed type in SerdesUtils

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
add unit tests for serializing unparsed type

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
revise TypeVar names to conform with naming patterns

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
@tsungchih tsungchih marked this pull request as ready for review April 13, 2025 05:44
@jerryshao jerryshao requested a review from xunliu April 18, 2025 09:13
@xunliu xunliu requested a review from unknowntpo April 22, 2025 03:55
@unknowntpo
Collaborator

unknowntpo commented Apr 23, 2025

@tsungchih Could you explain why you need to implement a json_serdes util? See what MetalakeDTO does: it simply inherits from DataClassJsonMixin in the dataclasses_json library.

from dataclasses import dataclass

from dataclasses_json import DataClassJsonMixin, config

...
@dataclass
class MetalakeDTO(Metalake, DataClassJsonMixin):
    ...

Reference:

@tsungchih
Contributor Author

tsungchih commented Apr 23, 2025

@unknowntpo Thanks for your comments.

That is because none of the current Gravitino Types defined in gravitino.api.types.types.Types, such as NullType, BooleanType, etc., support dataclass-json. That is, they do not inherit from DataClassJsonMixin in the dataclasses_json library.

IMHO, there are two ways to serialize/deserialize a Type:

  1. make all of the Types support dataclass-json (this would be a significant change to the interface, and I'm not sure whether it is acceptable);
  2. leave the Types unchanged and define a serializer/deserializer for them (this is what the current PR does).
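The second option can be sketched without any third-party dependency. The classes below are hypothetical stand-ins for the real Gravitino types, and the JSON key names (`type`, `elementType`, `containsNull`) are assumptions made for illustration rather than values confirmed by this thread. The type classes carry no JSON logic; a separate function decides how each one is written, recursing into nested element types.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass(frozen=True)
class StringType:
    """Hypothetical stand-in for a primitive type."""

    def simple_string(self) -> str:
        return "string"


@dataclass(frozen=True)
class ListType:
    """Hypothetical stand-in for a list type with a nested element type."""

    element_type: Any
    element_nullable: bool


def serialize(type_: Any) -> Union[str, Dict[str, Any]]:
    """Serialize a type without requiring the type classes to know about JSON."""
    if isinstance(type_, ListType):
        # Complex types become objects; the element type is serialized recursively.
        return {
            "type": "list",
            "elementType": serialize(type_.element_type),
            "containsNull": type_.element_nullable,
        }
    # Primitive types are written as their simple string.
    return type_.simple_string()


flat = serialize(ListType(StringType(), element_nullable=False))
nested = serialize(ListType(ListType(StringType(), True), element_nullable=False))
```

The trade-off is one extra module to maintain in exchange for leaving the public Types interface untouched.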

To elaborate, take ShortType as an example. The current class definition is:

class Types:
    class ShortType(IntegralType):
        _instance: Types.ShortType = None
        _unsigned_instance: Types.ShortType = None
        ...

We could make it support dataclass_json as follows:

class Types:
    @dataclass
    class ShortType(IntegralType, DataClassJsonMixin):
        _instance: Types.ShortType = None
        _unsigned_instance: Types.ShortType = None
        ...

Note, however, that ShortType has two forms, signed and unsigned, so we need to choose a different serialize/deserialize method for a ShortType instance depending on its form. From the documentation, I haven't figured out how to handle this case by simply inheriting from DataClassJsonMixin in the dataclasses_json library.
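A minimal sketch of how a standalone serializer could handle the two forms follows. The `ShortType` class here is a simplified, hypothetical stand-in for `Types.ShortType`, and the wire strings `"short"` and `"short unsigned"` are assumptions, not values taken from this PR.

```python
from __future__ import annotations


class ShortType:
    """Hypothetical stand-in: one singleton per form (signed, unsigned)."""

    _instance: ShortType | None = None
    _unsigned_instance: ShortType | None = None

    def __init__(self, signed: bool) -> None:
        self._signed = signed

    @classmethod
    def get(cls) -> ShortType:
        if cls._instance is None:
            cls._instance = cls(signed=True)
        return cls._instance

    @classmethod
    def unsigned(cls) -> ShortType:
        if cls._unsigned_instance is None:
            cls._unsigned_instance = cls(signed=False)
        return cls._unsigned_instance

    def signed(self) -> bool:
        return self._signed


# Assumed wire names for the two forms.
SIGNED_NAME = "short"
UNSIGNED_NAME = "short unsigned"


def serialize_short(type_: ShortType) -> str:
    """Pick the wire name from the instance's form, not from its class."""
    return SIGNED_NAME if type_.signed() else UNSIGNED_NAME


def deserialize_short(text: str) -> ShortType:
    """Map a wire name back to the matching singleton."""
    if text == SIGNED_NAME:
        return ShortType.get()
    if text == UNSIGNED_NAME:
        return ShortType.unsigned()
    raise ValueError(f"Unknown short type: {text!r}")
```

Because the decision depends on instance state, deserialization returns the existing singletons instead of constructing new objects field by field, which is the part that does not map naturally onto a mixin-generated `from_dict`.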

Regards

2 participants