Fix convert_int_to_float in compact_incomplete V1 library API method #2235

vasil-pashov · 2025-03-12T14:27:53Z

Reference Issues/PRs

What does this implement or fix?

This fixes the behavior of convert_int_to_float exposed via the V1 Library API's compact_incomplete method.

convert_int_to_float changes the type of all integer columns (both signed and unsigned) to float64. In contrast to dynamic schema conversion, it statically casts the values of all segments to float64 before writing. This functionality broke when we introduced stricter type checks for staged segments.
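A minimal sketch of the cast described above, using NumPy arrays as stand-ins for segment columns. The helper name `convert_ints_to_float64` and the dict-of-arrays layout are illustrative assumptions, not ArcticDB code; the real conversion happens in C++ before the segments are written.

```python
import numpy as np


def convert_ints_to_float64(columns):
    """Hypothetical helper: cast every integer column to float64.

    Non-integer columns (e.g. float32) are returned unchanged in this sketch.
    """
    return {
        name: values.astype(np.float64)
        if np.issubdtype(values.dtype, np.integer)
        else values
        for name, values in columns.items()
    }


# One signed column, one unsigned column, one column that is already floating point.
segment = {
    "a": np.array([1, 2, 3], dtype=np.int32),
    "b": np.array([4, 5, 6], dtype=np.uint16),
    "c": np.array([0.5, 1.5, 2.5], dtype=np.float32),
}
converted = convert_ints_to_float64(segment)
```

Signed and unsigned integers take the same path here, which is why mixing them in one column should not matter once the option is set.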

An additional bugfix was added for dynamic schema. Dynamic schema takes a different code path: it calls merge_descriptors when it collects all incomplete segments, so the option was not properly applied, even in arcticc. Dynamic schema used to throw an exception when unsigned and signed ints were mixed in the same column, but this should not matter because both are going to be cast to double.

There are two main points:

columns_match now takes convert_int_to_float as a parameter
merge_descriptors now takes convert_int_to_float as a parameter
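The effect of threading the flag through merge_descriptors can be sketched in plain Python. This is a simplified model under stated assumptions, not the C++ implementation: types are represented as strings, `merge_field_type` is a hypothetical name, and `common_type` is a deliberately strict stand-in for has_valid_common_type that only accepts identical types (the real function also allows valid promotions).

```python
INTEGER_TYPES = {
    "int8", "int16", "int32", "int64",
    "uint8", "uint16", "uint32", "uint64",
}


def common_type(existing, new):
    """Simplified stand-in for has_valid_common_type: identical types only."""
    return existing if existing == new else None


def merge_field_type(existing, new, convert_int_to_float):
    """Hypothetical per-field merge rule for two staged segments."""
    # With the flag set, any pair of integer types resolves to float64,
    # regardless of signedness or width.
    if convert_int_to_float and existing in INTEGER_TYPES and new in INTEGER_TYPES:
        return "float64"
    merged = common_type(existing, new)
    if merged is None:
        raise ValueError(f"No valid common type for {existing} and {new}")
    return merged


mixed_with_flag = merge_field_type("int8", "uint64", convert_int_to_float=True)

try:
    merge_field_type("int8", "uint64", convert_int_to_float=False)
    mixed_without_flag_raises = False
except ValueError:
    mixed_without_flag_raises = True
```

In this model the integer check runs before the common-type check, so a signed/unsigned mix no longer reaches the branch that raises when the flag is set.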

Any other comments?

Checklist

Checklist for code changes...

Have you updated the relevant docstrings, documentation and copyright notice?
Is this contribution tested against all ArcticDB's features?
Do all exceptions introduced raise appropriate error messages?
Are API changes highlighted in the PR description?
Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

IvoDD

Your tests pretty much cover all cases with convert_int_to_float=True.

What do you think about also adding tests that we get schema exceptions if convert_int_to_float=False (e.g. to ensure we don't accidentally pass convert_int_to_float=True somewhere deep in C++ where we shouldn't)? It is possible we already have a bunch of these tests somewhere else; I haven't looked through them.

alexowens90 · 2025-03-17T10:39:28Z

cpp/arcticdb/entity/merge_descriptors.cpp

-                        auto new_descriptor = has_valid_common_type(existing_type_desc, type_desc);
-                        if(new_descriptor) {
-                            merged_fields_map[field.name()] = *new_descriptor;
+                        if (convert_int_to_float && is_integer_type(existing_type_desc.data_type()) && is_integer_type(type_desc.data_type())) {


What if the existing type is float64? Or float32?

alexowens90 · 2025-03-17T10:41:05Z

python/tests/unit/arcticdb/version_store/test_parallel.py

+        assert_frame_equal(expected, lib.read(sym).data, check_dtype=True)
+
+    @pytest.mark.parametrize("dtype1", [np.int32, np.uint16, np.int8, np.int64, np.uint64, np.float64])
+    @pytest.mark.parametrize("dtype2", [np.int32, np.uint16, np.int8, np.int64, np.uint64, np.float64])


Include float32 in the tested dtypes?

alexowens90 · 2025-03-17T10:41:28Z

python/tests/unit/arcticdb/version_store/test_parallel.py

+        expected = pd.DataFrame({"a": np.arange(1, 7, dtype=np.double)}, index=pd.date_range(pd.Timestamp(0), periods=6, freq="ns"))
+        assert_frame_equal(expected, lib.read(sym).data, check_dtype=True)
+
+    @pytest.mark.parametrize("dtype2", [np.int32, np.uint16, np.int8, np.int64, np.uint64, np.float64])


Include float32?

vasil-pashov requested review from alexowens90, willdealtry and poodlewars as code owners March 12, 2025 14:27

vasil-pashov added the patch Small change, should increase patch version label Mar 12, 2025

vasil-pashov added 2 commits March 14, 2025 14:20

Fix convert int to float

f81db5e

Fix regression

23af9d8

vasil-pashov force-pushed the vasil.pashov/fix-convert-int-to-float branch from 4fa6bb9 to 23af9d8 Compare March 14, 2025 13:20

IvoDD approved these changes Mar 17, 2025

alexowens90 reviewed Mar 17, 2025

Address review comments

3d55226

vasil-pashov force-pushed the vasil.pashov/fix-convert-int-to-float branch from 37adac2 to 3d55226 Compare March 18, 2025 08:29

alexowens90 approved these changes Mar 18, 2025

vasil-pashov merged commit 24dfb54 into master Mar 18, 2025
154 checks passed

vasil-pashov deleted the vasil.pashov/fix-convert-int-to-float branch March 18, 2025 12:07
