Releases: man-group/ArcticDB

v6.2.0+man1

02 Sep 08:01

v6.2.0+man1 Pre-release

🚀 Features

  • Staging and finalizing data now support transaction IDs. (#2461, #2614)
  • Example:
# stage() now returns a StageResult, a structure that can later be passed to
# finalize_staged_data() to specify explicitly which staged data to finalize.
stage_result1 = lib.stage("sym", df1)
stage_result2 = lib.stage("sym", df2)

# If stage_results is omitted, everything staged for the symbol is compacted.
lib.finalize_staged_data("sym", stage_results=[stage_result2])

# Returns df2. df1 is left in the staged index for later finalization.
lib.read("sym").data
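The selective finalization described above can be pictured with a small in-memory toy model. This is only a sketch of the semantics: ToyLibrary, ToyStageResult, and the integer transaction_id are hypothetical names invented for illustration, and nothing here reproduces ArcticDB's storage format or its real StageResult type.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class ToyStageResult:
    """Hypothetical stand-in for the token returned by a stage call."""
    symbol: str
    transaction_id: int


@dataclass
class ToyLibrary:
    """In-memory toy model; not ArcticDB's implementation or API."""
    staged: dict = field(default_factory=dict)     # token -> staged rows
    finalized: dict = field(default_factory=dict)  # symbol -> finalized rows
    _ids: count = field(default_factory=count)     # source of transaction IDs

    def stage(self, symbol, rows):
        token = ToyStageResult(symbol, next(self._ids))
        self.staged[token] = list(rows)
        return token

    def finalize_staged_data(self, symbol, stage_results=None):
        # With no explicit tokens, finalize everything staged for the symbol.
        if stage_results is None:
            stage_results = [t for t in self.staged if t.symbol == symbol]
        rows = []
        for token in stage_results:
            rows.extend(self.staged.pop(token))
        self.finalized[symbol] = rows


lib = ToyLibrary()
first = lib.stage("sym", [1, 2])
second = lib.stage("sym", [3, 4])
lib.finalize_staged_data("sym", stage_results=[second])

finalized_rows = lib.finalized["sym"]     # only the rows staged second
still_staged = list(lib.staged.values())  # the first batch remains staged
```

Passing the token back, rather than a timestamp or position, is what lets a caller finalize its own writes without touching data staged concurrently by someone else.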

🐛 Fixes

  • Make copy_index_key_recursively parallelizable (#2622)
  • Add docstring and warning for disparity between will_item_be_pickled and is_symbol_pickle (#2548)
  • Testing improvements
  • Remove dependency on numpy.typing (#2626) (#2628)

The wheels are on PyPI. Below are for debugging:

v6.1.2+man0

26 Aug 10:10

v6.1.2+man0 Pre-release

🚀 Features

  • Added new test utilities (#2593)

🐛 Fixes

  • Remove signaled threads on TaskScheduler destruction on Windows with timeout (#2582)
  • Fix linux debug symbols not showing (#2584)
  • GCP installation tests fail on first versions when GCP was introduced (#2592)
  • Resilient start of simulated storages (#2583)
  • Ability to turn on/off storages during local testing (#2594)
  • Added new test utilities (#2593)

The wheels are on PyPI. Below are for debugging:

v6.1.1

26 Aug 10:13

🐛 Fixes

  • Mem leaks - increase mem limits so that we have time to create ASV tests (#2558)
  • [Bugfix 9754509632] Fix use-after-stack-free (#2569)
  • Add missing type hints to Library and fix append return type (#2574)
  • Azure tests added (#2498)
  • Disable operation on objects of mismatched types leading to corruption [Bugfix 9754433454] (#2572)
  • Fix ASV test failures and Installation Test Failures on Master (#2578)
  • Fix test_type_promotion_int64_and_float64_up_to_float64 (#2557)
  • Fix workflow problems for Persistence test executions and ASV AWS S3 tests execution (#2585)
  • Remove signaled threads on TaskScheduler destruction on Windows with … (#2588)
  • Continue on error if macos wheel removal fails (#2599)

The wheels are on PyPI. Below are for debugging:

v6.1.0

20 Aug 15:31

This is the first externally published release in the v6 major series. As such, it includes some breaking changes to the type system.

⚠️ Breaking Changes

  • Groupby using dynamic schema now produces a stable dtype. Previously, the dtype depended on the segments being processed; now the dtype will always be the same dtype able to represent the column across all segments on disk.
  • The return type of min/max aggregations in groupby for libraries using dynamic schema is now the type able to represent the column across all segments. Earlier versions returned float64 regardless of the type of the column across different segments.
  • Using sum aggregation in groupby on a bool column now returns the count of rows containing True as a uint64.
  • Using mean aggregation in groupby on timestamp columns now returns a timestamp (instead of float64). It is computed by taking the integer part of (1/n) * Σ ts[i], for i = 1 to n
  • The return type of a projection operation involving a floating point number will always be of type float64 regardless of the types involved in the computation.
  • Performing a filter or projection on an empty DataFrame might throw an exception depending on the dtype of the columns, while older versions always returned an empty DataFrame. The dtype of the columns of an empty DataFrame depends on the version of Pandas that is used to write them. Some versions use float64 by default; others use object. Filtering like query_builder = query_builder[query_builder["col"] < 5] will throw an exception if the DataFrame is empty and the type of an empty column is object.
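The timestamp mean rule in the list above can be sketched in plain Python. This is an illustration under stated assumptions, not ArcticDB's implementation: timestamps are taken to be integer nanoseconds since the epoch, "integer part" is read as truncation toward zero, and mean_timestamp_ns is a hypothetical helper name.

```python
def mean_timestamp_ns(timestamps_ns):
    """Integer part of the arithmetic mean of integer nanosecond timestamps."""
    n = len(timestamps_ns)
    if n == 0:
        raise ValueError("mean of an empty group is undefined")
    total = sum(timestamps_ns)
    # Python's // floors, so truncate toward zero explicitly to get the
    # integer part for negative (pre-epoch) totals as well.
    quotient = abs(total) // n
    return quotient if total >= 0 else -quotient


# The sum is 3_000_000_004; dividing by 3 gives 1_000_000_001.33...,
# whose integer part is kept.
group = [1_000_000_000, 1_000_000_001, 1_000_000_003]
mean_ns = mean_timestamp_ns(group)
```

Working in integers end to end avoids the precision loss that a float64 mean would introduce for nanosecond-resolution timestamps.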

🚀 Features

  • Resampling for libraries that use dynamic schema
  • Add batch_delete APIs and functionality (#2463)
  • Support open ended row_range on QueryBuilder and read methods (#2550)

🐛 Fixes

  • Fix handling of segments contained only of None values in sort merge (#2536)
  • Fix OverflowError from to_json() call (#2556)
  • Storage lock increase wait time and add artificial slow writes (#2497)
  • Remove signaled threads on TaskScheduler destruction on Windows (#2544)
  • Add utilities to let the storage failure simulator simulate high latency conditions (#2407)
  • Extend testing for the index names returned by get_info (#2448)
  • Fix flaky segfault in concat testing (#2458)
  • conda-build: Use clang and clang++ 18 for osx (#2476)
  • fix: Remove use after free (#2459)
  • Fix segfault during encoding sparse data (#2475)
  • maint: Replace Folly's ranges with the standard library's (#2479)
  • conda-build: Use macos-14 (#2482)
  • Bugfix 8083916814: Respect pickle_on_failure kwarg (#2474)
  • Remove numpy pin (#2487)
  • Update_batch additional tests (#2437)
  • Make append_batch and update_batch noop with empty dataframes when there's an existing version (#2507)
  • Release the GIL when logging from the Python API (#2486)

The wheels are on PyPI. Below are for debugging:

v6.1.1+man0

19 Aug 07:30

v6.1.1+man0 Pre-release

🐛 Fixes

  • [Bugfix 9754509632] Fix use-after-stack-free (#2569)
  • Add missing type hints to Library and fix append return type (#2574)
  • Disable operation on objects of mismatched types leading to corruption [Bugfix 9754433454] (#2572)
  • Remove signaled threads on TaskScheduler destruction on Windows with … (#2588)

The wheels are on PyPI. Below are for debugging:

v6.1.0+man1

11 Aug 14:29

v6.1.0+man1 Pre-release

🚀 Features

  • Fix get_backing_store could check non-primary storage (#2534)
  • Support open ended row_range on QueryBuilder and read methods (#2550)

🐛 Fixes

  • Fix installation tests (#2549)
  • Fix handling of segments contained only of None values in sort merge (#2536)
  • Fix OverflowError from to_json() call (#2556)
  • Storage lock increase wait time and add artificial slow writes (#2497)
  • Add utilities to let the storage failure simulator simulate high latency conditions (#2407)

The wheels are on PyPI. Below are for debugging:

v6.0.0+man2

05 Aug 08:19

v6.0.0+man2 Pre-release

⚠️ Breaking Changes

Detailed description and code examples of the breaking changes can be found in (#2440)

Short summary of the breaking changes:

  • Groupby using dynamic schema now produces a stable dtype. Previously, the dtype depended on the segments being processed; now the dtype will always be the same dtype able to represent the column across all segments on disk.
  • The return type of min/max aggregations in groupby for libraries using dynamic schema is now the type able to represent the column across all segments. Earlier versions returned float64 regardless of the type of the column across different segments.
  • Using sum aggregation in groupby on a bool column now returns the count of rows containing True as a uint64.
  • Using mean aggregation in groupby on timestamp columns now returns a timestamp (instead of float64). It is computed by taking the integer part of $$\left( \frac{1}{n} \sum_{i=1}^{n} ts[i] \right)$$
  • The return type of a projection operation involving a floating point number will always be of type float64 regardless of the types involved in the computation.
  • Performing a filter or projection on an empty DataFrame might throw an exception depending on the dtype of the columns, while older versions always returned an empty DataFrame. The dtype of the columns of an empty DataFrame depends on the version of Pandas that is used to write them. Some versions use float64 by default; others use object. Filtering like query_builder = query_builder[query_builder["col"] < 5] will throw an exception if the DataFrame is empty and the type of an empty column is object.
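The sum-on-bool rule in the list above amounts to counting True values per group. The following plain-Python sketch shows only that semantics; groupby_sum_bool is a hypothetical helper, the rows are made-up sample data, and Python's unbounded non-negative ints stand in for the uint64 result type.

```python
from collections import defaultdict


def groupby_sum_bool(rows):
    """Count True values per group key from (key, flag) pairs."""
    counts = defaultdict(int)
    for key, flag in rows:
        # Each True contributes 1 and each False contributes 0, so the
        # "sum" is a non-negative count of True rows.
        counts[key] += int(bool(flag))
    return dict(counts)


rows = [("a", True), ("a", False), ("b", True), ("a", True), ("b", False)]
true_counts = groupby_sum_bool(rows)
```

A group whose flags are all False still appears in the result with a count of 0, because the key is created the first time it is seen.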

🚀 Features

  • Add batch_delete APIs and functionality (#2463)
  • Upgrade folly (#2502)

🐛 Fixes

  • Extend testing for the index names returned by get_info (#2448)
  • Fix flaky segfault in concat testing (#2458)
  • conda-build: Use clang and clang++ 18 for osx (#2476)
  • fix: Remove use after free (#2459)
  • Fix segfault during encoding sparse data (#2475)
  • maint: Replace Folly's ranges with the standard library's (#2479)
  • Upgrade to sparrow==1.0.0 (#2484)
  • conda-build: Use macos-14 (#2482)
  • Bugfix 8083916814: Respect pickle_on_failure kwarg (#2474)
  • Remove numpy pin (#2487)
  • fix for MAC OS - tests should not halt anymore (#2506)
  • Update_batch additional tests (#2437)
  • Make append_batch and update_batch noop with empty dataframes when there's an existing version (#2507)
  • Library tool/read segment to dataframe (#2477)
  • Release the GIL when logging from the Python API (#2486)
  • Do not crash when recursively normalizing dictionaries containing non-str keys (#2525)
  • conda-build: Remove workarounds in specification (#2512)
  • maint: Add support for libprotobuf 6 (#2455)
  • Faster and error free ASV benchmarks (#2538)
  • Fix installation tests (#2511)
  • One line change to make getting library tool for Native Mongoose libraries easier (#2541)

The wheels are on PyPI. Below are for debugging:

v6.0.0+man1

30 Jul 14:16

v6.0.0+man1 Pre-release

⚠️ Breaking Changes

  • V6.0.0 - implementation of resampling with dynamic schema and API breaking changes (#2440)

🚀 Features

  • Add batch_delete APIs and functionality (#2463)
  • Upgrade folly (#2502)

🐛 Fixes

  • Extend testing for the index names returned by get_info (#2448)
  • Fix flaky segfault in concat testing (#2458)
  • conda-build: Use clang and clang++ 18 for osx (#2476)
  • fix: Remove use after free (#2459)
  • Fix segfault during encoding sparse data (#2475)
  • maint: Replace Folly's ranges with the standard library's (#2479)
  • Upgrade to sparrow==1.0.0 (#2484)
  • conda-build: Use macos-14 (#2482)
  • Bugfix 8083916814: Respect pickle_on_failure kwarg (#2474)
  • Remove numpy pin (#2487)
  • fix for MAC OS - tests should not halt anymore (#2506)
  • Update_batch additional tests (#2437)
  • Make append_batch and update_batch noop with empty dataframes when there's an existing version (#2507)
  • Library tool/read segment to dataframe (#2477)
  • Release the GIL when logging from the Python API (#2486)
  • Rebase v6.0.0 with latest master (#2532)

The wheels are on PyPI. Below are for debugging:

v5.10.0

29 Jul 11:03

Performance

  • Reduce memory overhead when reading dataframes from ArcticDB by @alexowens90 in #2435

Fixes


The wheels are on PyPI. Below are for debugging:

Full Changelog: v5.9.3...v5.10.0

v5.10.0+man4

30 Jul 08:12

v5.10.0+man4 Pre-release

🐛 Fixes

  • Do not crash when recursively normalizing dictionaries containing non-str keys (#2526)

The wheels are on PyPI. Below are for debugging: