Releases: man-group/ArcticDB
v5.11.0+man0
🚀 Features
- Add batch_delete APIs and functionality (#2463)
🐛 Fixes
- fix: Remove use after free in allocator (#2459)
- Fix segfault during encoding sparse data (#2475)
- maint: Replace Folly's ranges with the standard library's (#2479)
- Upgrade to sparrow==1.0.0 (#2484)
- Respect pickle_on_failure kwarg (#2474)
- Remove numpy pin (#2487)
The wheels are on PyPI. Below are for debugging:
v5.9.3
v5.9.2
🐛 Fixes
- Load only required data keys on edge case values of date_range (#2387)
- Fix the __str__ for a query builder with row_range (#2418)
- Validate snapshot names (#2415). This applies our normal symbol name validation to snapshots. Snapshot names must be at most 254 chars long, in ASCII range [32, 126] and must not include * < or >
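The snapshot naming rules above can be checked client-side before calling the library. The following is a minimal sketch that encodes only the three rules quoted in this note; `validate_snapshot_name` is a hypothetical helper, not part of the ArcticDB API, and the library's own validation may apply further rules or raise a different exception type.

```python
def validate_snapshot_name(name: str) -> None:
    """Raise ValueError if `name` breaks one of the rules quoted in the release note.

    Hypothetical helper for illustration only; not ArcticDB's implementation.
    """
    # Rule 1: at most 254 characters long.
    if len(name) > 254:
        raise ValueError(f"snapshot name is {len(name)} chars long, the limit is 254")

    for ch in name:
        # Rule 2: every character must be in the ASCII range [32, 126].
        if not 32 <= ord(ch) <= 126:
            raise ValueError(f"character {ch!r} is outside ASCII range [32, 126]")
        # Rule 3: the characters * < > are not allowed.
        if ch in "*<>":
            raise ValueError(f"character {ch!r} is not allowed in snapshot names")


if __name__ == "__main__":
    for candidate in ["daily_2024-01-01", "bad*name", "tab\tname"]:
        try:
            validate_snapshot_name(candidate)
            print(f"{candidate!r}: ok")
        except ValueError as exc:
            print(f"{candidate!r}: rejected ({exc})")
```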
🏎️ Performance
- Improve performance of reads after a write_and_prune_previous call (#2460)
v5.10.0+man0
🚀 Features
- Query Builder regex filter support and upgrade PCRE to PCRE2 V2 (#2466)
- New regex_match API in QueryBuilder to support filtering string columns with regex!
Example:
>>> df = pd.DataFrame(
...     index=pd.date_range(pd.Timestamp(0), periods=3),
...     data={"a": ["abc", "abcd", "aabc"], "b": [1, 2, 3], "c": ["12a", "q34c", "567f"]}
... )
>>> lib.write(SYM, df)
VersionedItem(symbol='TEST', library='test', data=n/a, version=0, metadata=None, host='S3(endpoint=s3.eu-west-1.amazonaws.com, bucket=arcticdb-ci-test-bucket-02)', timestamp=1753178167861786849)
>>> pattern = "^abc"
>>> q = QueryBuilder()
>>> q = q[q["a"].regex_match(pattern)]
>>> lib.read(SYM, query_builder=q).data
               a  b     c
1970-01-01   abc  1   12a
1970-01-02  abcd  2  q34c
- 🚨This is a breaking change, as PCRE2 has a stricter check on the pattern🚨
- The only API affected is the regex filtering in list_symbols()
- The major version will not be incremented, as the affected patterns are rarely used and unlikely to impact list_symbols() functionality
- For details of the differences, please refer to https://stackoverflow.com/questions/70273084/regex-differences-between-pcre-and-pcre2/73767663#73767663
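To sanity-check which rows a pattern should select without a library connection, the same filter can be reproduced on plain Python data. This is only a sketch: it uses Python's standard `re` module rather than PCRE2, so it agrees with regex_match for simple patterns such as the anchored `^abc` in the example above, but may differ for syntax on which the two engines disagree. The `rows` list is a hand-copied stand-in for the example dataframe.

```python
import re

# Stand-in for the example dataframe: (index, a, b, c) per row.
rows = [
    ("1970-01-01", "abc", 1, "12a"),
    ("1970-01-02", "abcd", 2, "q34c"),
    ("1970-01-03", "aabc", 3, "567f"),
]

# Same pattern as in the release-note example.
pattern = re.compile(r"^abc")

# Keep the rows whose column "a" (position 1) matches the pattern.
matched = [row for row in rows if pattern.search(row[1])]

for row in matched:
    print(row)
```

Only the first two rows survive, since "aabc" does not start with "abc"; this matches the output shown in the release-note example.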
🐛 Fixes
- Reduce memory overhead when reading dataframes from ArcticDB (#2435)
- Minor fixes for recursive normalizers (#2451)
- Clearer exception when writing too deeply nested recursively normalized structures. The exception has type arcticdb.exceptions.DataTooNestedException.
- Validate against writing recursively normalized dicts that include __ in their key names as they do not roundtrip correctly at the moment. This exception has type arcticdb.exceptions.UnsupportedKeyInDictionary.
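Both conditions can be detected before a write by walking the structure. The sketch below is a hypothetical pre-check, not ArcticDB code: `check_nested_data` and its `max_depth` parameter are made up for illustration, and the actual nesting limit enforced by the library is not stated in this note, so the caller has to supply a value.

```python
from typing import Any, List


def check_nested_data(data: Any, max_depth: int) -> List[str]:
    """Return a list of human-readable problems found in a nested structure.

    Hypothetical pre-check. Depth counts containers only, with the
    top-level container at depth 0. `max_depth` is an assumed limit,
    not the one ArcticDB uses.
    """
    problems: List[str] = []

    def walk(node: Any, depth: int, path: str) -> None:
        # Scalars (and anything that is not a dict/list/tuple) are leaves.
        if not isinstance(node, (dict, list, tuple)):
            return
        if depth > max_depth:
            problems.append(f"{path or '<root>'}: nested deeper than {max_depth} levels")
            return
        if isinstance(node, dict):
            for key, value in node.items():
                child = f"{path}.{key}" if path else str(key)
                # Keys containing a double underscore are the ones flagged in the note.
                if isinstance(key, str) and "__" in key:
                    problems.append(f"{child}: key contains '__'")
                walk(value, depth + 1, child)
        else:
            for i, item in enumerate(node):
                walk(item, depth + 1, f"{path}[{i}]")

    walk(data, 0, "")
    return problems


if __name__ == "__main__":
    sample = {"prices": {"close__adj": [1.0, 2.0]}, "meta": {"source": "feed"}}
    for problem in check_nested_data(sample, max_depth=5):
        print(problem)
```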
v5.9.2+man0
v5.9.1+man0
🐛 Fixes
- Fix the to_string for a query builder with row_range (#2418)
- Validate snapshot names (#2415)
- Fix V2 API created from V1 API when use_norm_failure_handler_known_types=True (#2421)
- Batch restore version with timestamp as_of (#2417)
- Add comment to test_use_norm_failure_handler_known_types (#2424)
- Schedule GCP Runs (#2416)
- Fix problem with pandas 1.0.5 and empty dataframe comparison (#2428)
- Fix test_resample flakiness (#2427)
- Build MacOS Apple Silicon PyPI wheels (#2335)
- Pin numpy, as it can't compile on the older manylinux we use (#2438)
- Revert PR 2378 (#2433)
v5.9.0
🚀 Features
- Enhancement 9351235967: Support projecting a column of constant values (#2405)
🐛 Fixes
- Remove arcticdb package dependencies on tests and do checks during build (#2403)
- Fix Windows 3.13 CI (#2409)
- Bugfix/9256801170/memory over allocation when reading shorts dfs (#2404)
- fixed asv benchmark.json file (#2410)
- Reimplement version map reload delete fix (#2395)
- Apply perf improvements from pydata talk (#2408)
- Validate staged data symbol names (#2414)
- Test for test_admin_tools.py::test_get_sizes_for_symbol flakiness (#2398)
- [Cherry-pick] Fix V2 API created from V1 API when use_norm_failure_handler_known_ty… (#2423)
v5.9.0+man2
🐛 Fixes
- Fix reading some existing recursively normalized data (#2433)
v5.9.0+man1
🚀 Features
- Support projecting a column of constant values (#2405)
🐛 Fixes
- Fix Windows 3.13 CI (#2409)
- Memory over allocation when reading short dfs (#2404)
- fixed asv benchmark.json file (#2410)
- Various small improvements to recursive normalizers (#2378)
- Reimplement version map reload delete fix (#2395)
- Apply performance improvements from pydata talk (#2408)
- Validate staged data symbol names (#2414)
- Fix V2 API created from V1 API when use_norm_failure_handler_known_types (#2423)
v5.8.2+man2
📣 Announcement
In this release we have dropped support for Python 3.7 (#2400). Please get in touch with us at [email protected] or in our Slack if you have any questions about this.
🐛 Fixes
- 9220057136: Allow block manager consolidation in Pandas < 2 (#2392)
- By default, Pandas will "consolidate" consecutive columns of the same dtype (e.g. int64) from the multiple buffers returned by ArcticDB into a single larger buffer. This can have performance benefits for short, wide dataframes. However, ArcticDB is optimised for longer, narrower dataframes, and so the performance penalty (introducing an additional memcpy) has historically been bypassed in ArcticDB to improve reading speed. Unfortunately, we discovered recently that certain Pandas dataframe processing operations, such as replace, have bugs when consolidation is disallowed for Pandas versions < 2.0.0. As such, the consolidation bypassing behaviour is now disabled in ArcticDB when running with Pandas versions < 2.0.0. The same bugs were not observed in Pandas versions >= 2.0.0, and so the behaviour in this case remains unchanged.
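The resulting rule is a simple gate on the Pandas major version. The sketch below only restates that rule in code; `consolidation_bypass_enabled` is a hypothetical name, and how ArcticDB actually detects the Pandas version internally is not described in this note.

```python
def consolidation_bypass_enabled(pandas_version: str) -> bool:
    """Return True if the consolidation bypass applies for this Pandas version.

    Hypothetical helper: per the note, the bypass is kept for Pandas >= 2.0.0
    and disabled for Pandas < 2.0.0.
    """
    # Only the major component matters for the 2.0.0 threshold.
    major = int(pandas_version.split(".")[0])
    return major >= 2


if __name__ == "__main__":
    for version in ["1.5.3", "2.0.0", "2.2.1"]:
        state = "bypassed" if consolidation_bypass_enabled(version) else "allowed"
        print(f"pandas {version}: consolidation {state}")
```

In practice the argument would be `pandas.__version__`.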
💻 Internal Changes
- Upgrade GCC to 11.2 and add sparrow dependency (#2355)