Releases: man-group/ArcticDB

v1.6.0

25 Jul 12:01

⚠️ API Changes

  • Modify read_batch to return a DataError object rather than raising an exception (#629)

For Library.read_batch, a DataError object is now returned in place of the result for any symbol/version pair that could not be retrieved, at the corresponding position in the returned list. The DataError contains the symbol and version requested, and the exception string that was thrown. For two well-defined categories of error, the error category and specific error code are also included:

ErrorCategory.MISSING_DATA - ErrorCode.E_NO_SUCH_VERSION: The version requested by the user does not exist
ErrorCategory.STORAGE - ErrorCode.E_KEY_NOT_FOUND: At least one of the keys required to read the specified version does not exist in the storage.

Otherwise these fields of the DataError object are left as None.

See also the API docs.
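
A minimal sketch of how a caller might separate successful reads from failures now that errors are returned in-band. `StubDataError` is a hypothetical stand-in for ArcticDB's `DataError`, and its attribute names (`symbol`, `version`, `exception_string`, `error_category`, `error_code`) are assumptions based on the description above rather than the confirmed API; the simulated `results` list stands in for the return value of `read_batch`.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class StubDataError:
    """Hypothetical stand-in for ArcticDB's DataError (attribute names assumed)."""
    symbol: str
    version: Optional[int]
    exception_string: str
    error_category: Optional[str] = None  # None unless the error is in a known category
    error_code: Optional[str] = None      # None unless the error is in a known category


def split_batch_results(results: List[Any]) -> Tuple[List[Any], List[StubDataError]]:
    """Partition a read_batch-style result list into successes and errors."""
    successes: List[Any] = []
    errors: List[StubDataError] = []
    for item in results:
        if isinstance(item, StubDataError):
            errors.append(item)
        else:
            successes.append(item)
    return successes, errors


# Simulated result list: the read at position 1 failed, the others succeeded.
results = [
    {"symbol": "AAPL", "data": [1, 2, 3]},
    StubDataError("MSFT", 7, "version 7 not found", "MISSING_DATA", "E_NO_SUCH_VERSION"),
    {"symbol": "GOOG", "data": [4, 5]},
]

ok, failed = split_batch_results(results)
for err in failed:
    print(f"{err.symbol} (version {err.version}): {err.error_code} - {err.exception_string}")
```

Because a failure no longer raises, code that previously wrapped read_batch in a try/except needs an explicit check like this on each returned item.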

  • Accessing a Library that does not exist now throws an arcticdb.exceptions.LibraryNotFound rather than an internal exception.

🚀 Features

  • Support negative integers in as_of (#589)

This PR enables negative indexing to select "the last-but-nth version". This is equivalent to negative indexing in Python. See API docs.

For example:

read(...as_of=0...) will select the first version
read(...as_of=-1...) will select the most recent version
read(...as_of=-2...) will select the version prior to the most recent version
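
The analogy with Python indexing can be sketched as follows. `resolve_as_of` is a hypothetical helper written for illustration, not part of ArcticDB. It assumes that a non-negative `as_of` names a version number directly and that a negative `as_of` counts back from the most recent existing version.

```python
from typing import List


def resolve_as_of(existing_versions: List[int], as_of: int) -> int:
    """Resolve an integer as_of to a concrete version number (toy model)."""
    if as_of >= 0:
        # Non-negative: treated as an explicit version number.
        if as_of not in existing_versions:
            raise LookupError(f"version {as_of} does not exist")
        return as_of
    if -as_of > len(existing_versions):
        raise LookupError(f"as_of={as_of} reaches before the first version")
    # Negative: count back from the newest version, as with list indexing.
    return sorted(existing_versions)[as_of]


versions = [0, 1, 2, 3]
first = resolve_as_of(versions, 0)      # the first version
latest = resolve_as_of(versions, -1)    # the most recent version
previous = resolve_as_of(versions, -2)  # the version before the most recent
```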

🐛 Fixes

Various fixes to make our use of LMDB more correct. These only affect LMDB-backed Arctic instances (not S3). These should resolve the segfaults users have been experiencing with LMDB.

  • Only open library once in the interpreter lifetime (#585)
  • Fix for LMDB DBI Handling (#597)
  • Delete relevant part of LMDB tree upon library delete (#601)

Also,

  • Fix for write batch with dedup (#595)
  • Fix for read batch when as_of is a Timestamp (#617)
  • Fix numeric isin filtering for some cases with a mix of signed and unsigned integers (#604)

Uncategorized

  • Update demo notebook (#570)
  • fix: Remove uneeded FMT_COMPILE (#578)
  • build: Use Pandas 2.0 forward compatible API (#582)
  • Make batch tests much faster (#586)
  • Remove tests skips in test_storage_lock.cpp for Windows (#550)
  • Update python line in issue-template so it works on windows (#608)
  • Update faq.md (#616)
  • Windows is Beta (#610)
  • Rename lib to be consistent - lib vs library (#622)
  • Update docs to point how to use AWS_PROFILE (#619)
  • maint: Remove VariantStorageFactory and its implementations (#625)
  • docs: Add development guidelines for testing combinations (#644)
  • conda-build: Pin pybind11 to < 2.11 (#647)
  • Add checklist to pull request template (#643)

The wheels are on Pypi. Below are for debugging:

v1.5.0

11 Jul 16:18

🚀 Features

  • ☁️ ArcticDB now supports Azure Blob Storage! (#427 #464) ☁️.
    • Note: This functionality is not yet available in the Conda release of ArcticDB - it is only available in the PyPI release.
  • Performance improvements for:
    • write_batch (#467)
  • Added optional library encoding option (#401)
  • Add environment variable controlling whether AWS S3 should verify SSL certificates (S3Storage.VerifySSL). This defaults to 1 - set to 0 to disable. (#553)
  • Specify compilation optimizations for Windows build (#543)
  • Improve documentation for Arctic 1.0 -> ArcticDB migration (#546)
  • Add test for filtering down the string pool with Nones and NaNs (#533)
  • ArcticDB now supports writing log messages to a file (#573)

🐛 Fixes

  • Rename write_batch_pickle to write_pickle_batch. Note that this is a breaking API change. (#516)
  • Fix compilation errors issued by clang-16. (#542)
  • Fix for read batch when filtering by column. Previously this would raise an error - it now works. (#567)
  • Suppress Pandas warning about nanoseconds being discarded (#571)

The wheels are on Pypi. Below are for debugging:

v1.4.1

27 Jun 16:55

🚀 Features

  • Significant performance improvements for get_description_batch, exploiting per-symbol parallelism. Benchmarking suggests an order of magnitude improvement for 1000 symbols.

🐛 Fixes

  • get_description_batch datetimes now include UTC tzinfo (fixing the batch equivalent of issue #197).

The wheels are on Pypi. Below are for debugging:

v1.4.0

23 Jun 12:25

📣 Notices 📣

1.4.0 has changed the default value of prune_previous_versions.

Prior to 1.4.0, if you did not pass a specific value into prune_previous_versions, prior versions would have been removed after successful completion of the write, update or append operation. By changing the default value, previous versions will now be kept by default.

To maintain the behaviour of previous releases, pass prune_previous_versions=True or manually call prune_previous_versions.
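
The difference between the two defaults can be illustrated with a toy in-memory model. `ToyLibrary` below is a hypothetical class written for this note and does not reflect ArcticDB's implementation; only the parameter name `prune_previous_versions` comes from the release notes.

```python
from collections import defaultdict
from typing import Any, Dict, List


class ToyLibrary:
    """Toy in-memory model of versioned writes; not ArcticDB's implementation."""

    def __init__(self) -> None:
        self._versions: Dict[str, Dict[int, Any]] = defaultdict(dict)
        self._next_version: Dict[str, int] = defaultdict(int)

    def write(self, symbol: str, data: Any, prune_previous_versions: bool = False) -> int:
        version = self._next_version[symbol]
        self._next_version[symbol] += 1
        if prune_previous_versions:
            # Behaviour that was the default before 1.4.0: drop older versions.
            self._versions[symbol].clear()
        self._versions[symbol][version] = data
        return version

    def list_versions(self, symbol: str) -> List[int]:
        return sorted(self._versions[symbol])


lib = ToyLibrary()
lib.write("prices", [1.0])
lib.write("prices", [1.0, 2.0])
kept = lib.list_versions("prices")    # new default: earlier versions are retained

lib.write("prices", [1.0, 2.0, 3.0], prune_previous_versions=True)
pruned = lib.list_versions("prices")  # explicit pruning: only the newest remains
```

Under the new default, storage use grows with every write until versions are pruned explicitly, which is worth keeping in mind for symbols that are written frequently.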

🚀 Features

  • Support an option to allow Arctic to override storage endpoint and credentials. Useful if replicating a bucket containing existing ArcticDB libraries to another region. (#502)

Other Changes:

  • If list_symbols is expected to be slow (no recent cache), a warning will be printed. (#489)
  • Major refactor to the analytical engine of ArcticDB (#471)
  • prune_previous_versions has been set to False by default (#485)
  • Fix logging levels not being configurable from env var (#490)
  • Add UTC timezone info to dates returned by SymbolDescription::get_description (#480)
  • Use RFC 3986 URL encoding for S3 interactions (#503)
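
For readers unfamiliar with RFC 3986 percent-encoding, here is a small illustration using Python's standard library. This only demonstrates the encoding scheme itself; the key name is hypothetical, and nothing here is taken from how ArcticDB builds its S3 requests.

```python
from urllib.parse import quote, unquote

# Hypothetical object key containing characters outside the RFC 3986 "unreserved" set.
key = "my lib/sym+bol~v1"

# safe="" asks quote() to percent-encode "/" as well; "~" is unreserved and left alone.
encoded = quote(key, safe="")
decoded = unquote(encoded)

print(encoded)
```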

Uncategorized

  • Demo notebook (#477)
  • conda-build: Use fmt < 10 (#479)
  • Add guide on how to release ArcticDB to PyPi and conda-forge (#478)
  • Try removing enum-compat (#491)
  • Address post merging comments from prune_previous_versions set defaul… (#487)
  • Add requirements file (#501)
  • Remove reference to Arcticc and fix file config (#508)

The wheels are on Pypi. Below are for debugging:

Full Changelog: v1.3.0...v1.4.0

v1.3.0

12 Jun 08:08

🍎 New Platforms

  • This release coincides with the release of 1.3.0 on MacOS Apple Silicon on conda-forge!

🚀 Features

  • Multiple improvements to the reliability and stability of the symbol list caching logic. (#393)
  • Add get_uri function to the Arctic instance (#433)

🐛 Fixes

  • Significant performance improvements for read_batch, read_metadata_batch and get_description_batch (#415 #435)
    • Note that the performance improvements for read_batch will largely only be visible if no QueryBuilder query is passed in.
  • fast_tombstone_all option has been removed - in code this is assumed to always be true. (#366)
  • Simplify log config and fix #426 #453 (#474)

Uncategorized

  • rename isin test to avoid being skipped (#418)
  • github: Adapt bug report template (#429)
  • github: switching action for micromamba (#424)
  • #146 Add Black formatting to build (#431)
  • Higher utility GitHub Actions changes (#438)
  • fix black formatting issues previous PR (#444)
  • dev: Add simple pre-commit setup (#445)
  • build: Do not build with Remotery by default (#466)
  • follow-up: dev: Add simple pre-commit setup (#448)
  • cmake: Improve the resolution of LZ4 (#451)
  • fix: Required adaptations for MSVC 14.29.30133 support (#459)

The wheels are on Pypi. Below are for debugging:

v1.2.1

31 May 14:49

This is a bugfix release:

  • #419 #409 Build fixes to enable the Conda build (and therefore publishing to conda-forge)
  • #388 ArcticDB will now allow updates on data if the existing data was written prior to v1.1 regardless of whether that existing data is sorted or not

Full Changelog: v1.2.0...v1.2.1

ArcticDB distributes:

  • wheels for Windows and Linux on PyPI, the latest being installable with:
    pip install --upgrade arcticdb
  • conda packages for MacOS and Linux on conda-forge, the latest being installable with:
    mamba update -c conda-forge arcticdb

Below are artifacts for debugging:

v1.2.0

22 May 16:39

🚀 Features

defragment_symbol_data method added (#180)

  • This will defragment fragmented symbols. Fragmented symbols are typically caused by frequent small appends (e.g. 1 row every hour) and result in sub-optimal read performance. Reading defragmented symbols can result in significantly improved performance compared to a more fragmented equivalent!

✅ Write API for column-level statistics added. (#253)

  • The structures created using this API will later be used to improve the performance of queries in later versions. As of version 1.2.0 however, the structures are not consumed at query time.

✅ Configuration variables set via environment variables are now case insensitive (#364)

🐛 Bug fixes

add_to_snapshot and remove_from_snapshot no longer leave unreachable data on disk if a symbol-version pair being removed from a snapshot is the last reference to this version (#253)

✅ Reads with QueryBuilder parameters that contain groupby/aggregation clauses now only fetch data segments containing columns relevant to the query (#253)

Full Changelog: v1.1.0...v1.2.0

The wheels are on Pypi. Below are for debugging:

v1.1.0

10 May 12:12

New Platforms

✅ Version 1.1 includes the first official release of the Windows version of ArcticDB!
Version 1.1 comes with the following caveats:

  • Please have a recent Visual C++ Redistributable for Visual Studio 2015-2022 installed.
  • Windows does not support fixed-width NumPy strings and so will not be able to write fixed-width strings contained within a NumPy array.
  • Writing pickled data can fail on machines with limited memory, see #340 for details and a workaround.
  • The LMDB store on Windows pre-allocates the max storage size on disk, so we limited the size to 128MB. We will provide an option to increase the size in the future (#229).

✅ Version 1.1 now supports Python 3.11!

New Features

✅ batch_get_descriptor method added (#219)
✅ batch_get_metadata method added (#220)
Neither batch method achieves optimal parallelism yet - we are aiming to address this in 1.2.

Changes

✅ 1.1 pins to Pandas < 2.0 whilst we address a few remaining compatibility issues 🐼 (#237)
✅ 1.1 supports Protobuf V3 as well as V4
✅ Out-of-order data updates now raise exceptions should they prevent future indexed reads (#203)
✅ PCRE is now statically linked (#321)
The confusing term stream id has been renamed to symbol in error messages, which should make them easier to understand (#311)
✅ Pickled data supports up to 4GB (#260)
✅ Debugging toolbox added (#209)
✅ Introduced a base exception type arcticdb.exceptions.ArcticException. Exceptions are exposed in arcticdb.exceptions module with the following hierarchy:

RuntimeError
└-- ArcticException
    |-- ArcticNativeNotYetImplemented
    |-- DuplicateKeyException
    |-- MissingDataException
    |-- NoDataFoundException
    |-- NoSuchVersionException
    |-- NormalizationException
    |-- PermissionException
    |-- SchemaException
    |-- SortingException
    |   └-- UnsortedDataException
    |-- StorageException
    |-- StreamDescriptorMismatch
    └-- InternalException
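
A short sketch of what this hierarchy means for exception handling. The classes below are local stand-ins that mirror a few names from the tree above so the example runs without ArcticDB installed; in real code they would presumably be imported from arcticdb.exceptions. `classify` is a hypothetical helper for illustration only.

```python
class ArcticException(RuntimeError):
    """Stand-in mirroring arcticdb.exceptions.ArcticException."""


class SortingException(ArcticException):
    """Stand-in mirroring arcticdb.exceptions.SortingException."""


class UnsortedDataException(SortingException):
    """Stand-in mirroring arcticdb.exceptions.UnsortedDataException."""


class NoSuchVersionException(ArcticException):
    """Stand-in mirroring arcticdb.exceptions.NoSuchVersionException."""


def classify(exc: Exception) -> str:
    """Report which handler catches exc, listing handlers most specific first."""
    try:
        raise exc
    except SortingException:
        return "sorting"
    except ArcticException:
        return "arctic"
    except RuntimeError:
        return "runtime"


print(classify(UnsortedDataException("index not sorted")))
print(classify(NoSuchVersionException("no version 7")))
print(classify(RuntimeError("unrelated")))
```

Since the base type derives from RuntimeError, existing handlers that catch RuntimeError continue to catch these exceptions, while new code can catch ArcticException to handle only errors in this hierarchy.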

Next version:

NEW FEATURE: defragment_symbol_data method added (#180)

The wheels are on Pypi. Below are for debugging: