
Installation tests v3 #2330


Draft: wants to merge 28 commits into master

Conversation

@grusev (Collaborator) commented Apr 15, 2025

Reference Issues/PRs

What does this implement or fix?

These are installation tests intended to be executed against different versions of ArcticDB, installed from conda and PyPI, without building our project.

GitHub workflow (not yet finished)

Allows matrix execution of selected combinations of OSes and Python versions. For each combination it installs arcticdb from conda (macOS) or pypi (other OSes), along with our test dependencies. We also build protobufs and execute our tests.
Not finished: ability to select the arcticdb version; ability to run on demand with user-selected options.

Links to successful runs with LMDB tests only:
latest 5.3.3: https://github.com/man-group/ArcticDB/actions/runs/14494364845
5.1.2: https://github.com/man-group/ArcticDB/actions/runs/14509109094/job/40703855715
4.4.7: https://github.com/man-group/ArcticDB/actions/runs/14509267618

Added:

  • @pytest.mark.installation - marks only the tests we want to execute
  • ARCTICDB_STORAGE_LMDB - env var to allow only LMDB to be selected (when all local storages are disabled with ARCTICDB_LOCAL_STORAGE_TESTS_ENABLED=0)
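A minimal sketch of how this gating could work. The variable names come from the list above; the default behaviour and helper names are assumptions, not the PR's actual code:

```python
import os

def local_storage_tests_enabled() -> bool:
    # Local storage tests run unless explicitly disabled with
    # ARCTICDB_LOCAL_STORAGE_TESTS_ENABLED=0 (assumed default: enabled).
    return os.getenv("ARCTICDB_LOCAL_STORAGE_TESTS_ENABLED", "1") != "0"

def lmdb_selected() -> bool:
    # LMDB can still be selected on its own via ARCTICDB_STORAGE_LMDB,
    # even when the other local storages are disabled.
    return os.getenv("ARCTICDB_STORAGE_LMDB", "0") == "1" or local_storage_tests_enabled()
```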

Modified:

  • fixture 'lmdb_storage' is enhanced so that it can spawn other types of storages on demand, needed for real-storage installation tests where no arctic_* fixtures are available
  • FixtureMarks class - currently holding only one decorator, to help easily expand installation tests that use lmdb-* fixtures to run against real storages too

Open Questions:

  1. The tests in this PR can be executed on the master branch too, and a workflow should be added for that purpose. However, such a workflow should perhaps run only the LMDB tests: AWS S3 and GCP cost quite a lot under frequent execution, and they might also fail due to reaching quota. Real-storage tests can and should instead be executed at merge time in a separate workflow, so that we keep track of failing tests and their reasons.
  2. What do we do when old versions have API changes that prevent the current tests from running? (From here on we can safely assume that we can also check out the same version of the sources - at least from the point this PR is merged.)
  3. The tests we run against arcticdb require the whole current test framework, fixtures, and their dependencies - this is too much and could be reduced to a bare minimum; see this PR: Installation tests #2316

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@grusev added the `patch` label (Small change, should increase patch version) Apr 16, 2025
@grusev changed the title from "v3 Of installation Tests" to "Installation tests v3" Apr 16, 2025
```yaml
      cache-environment: true
      post-cleanup: 'all'

    - name: Add arcticdb from conda-forge
```
Collaborator:

We'll want to run these using the Linux Conda build too, not just Mac

@grusev (Collaborator, Author) commented Apr 22, 2025:

Yes, I have thought about it too and would propose adding it as one of the combinations (currently we have 3 Linuxes + pypi; we can make one of the Linuxes use conda instead of pypi).

```python
# before
def lmdb_storage(tmp_path) -> Generator[LmdbStorageFixture, None, None]:
    with LmdbStorageFixture(tmp_path) as f:
        yield f

# after
def lmdb_storage(request, tmp_path) -> Generator[LmdbStorageFixture, None, None]:
```
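The dispatch idea behind the enhanced fixture can be sketched without pytest; the storage class and factory table below are hypothetical stand-ins for the real fixtures, not the PR's code:

```python
from contextlib import contextmanager

class _Storage:
    # Hypothetical stand-in for LmdbStorageFixture and its siblings.
    def __init__(self, kind):
        self.kind = kind
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        return False

_FACTORIES = {
    "lmdb": lambda tmp_path: _Storage("lmdb"),
    "s3": lambda tmp_path: _Storage("s3"),
}

@contextmanager
def storage_for(kind, tmp_path):
    # Spawn the requested storage type on demand, defaulting to LMDB,
    # mirroring how the enhanced fixture would read the request parameter.
    with _FACTORIES.get(kind, _FACTORIES["lmdb"])(tmp_path) as f:
        yield f
```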
Collaborator:

I think it will be clearer to have a dedicated test suite for these installability tests because they will have to keep working with possibly very old ArcticDB versions, so will in a sense be "frozen" - whereas the core tests will change more over time.

I also don't think we should add this complexity to the core lmdb_storage fixture - a dedicated single purpose fixture would be clearer to me.

I understand why you've done it like this, it is quite appealing to just reuse existing tests, and the way you've done it is really neat, I just think it is best to be painfully obvious here (especially for the comprehensibility for new joiners etc).

Collaborator (Author):

I have tested so far with version 4.4.7 (the lowest) and there were no problems. But I do understand what you mean. The approach you suggest was the original one - #2316

The idea there was to reuse as little as possible, so only shared_tests.py was shared between the current tests and the new ones. Once a test started to diverge, it would no longer be shared but have two versions - one for the original suite and one for the installation tests (which was fine, since the test files/suites were different).

@poodlewars (Collaborator):

Regarding the open questions in the description,

  1. I expect we can get most of the value by running against simulators (moto, azurite) rather than real storage backends. Seems to be the "goldilocks zone"
  2. I think this is an argument for a dedicated suite, that can cope with any API changes by testing the installed arcticdb version, rather than putting this in to the existing test suites
  3. Sounds good

Finally, it will be important to have some brief docs explaining this setup (and why we have it) for the benefit of future developers.

@poodlewars (Collaborator):

I think an important part of this will be having a way to constrain the dependencies used by earlier ArcticDB versions - we don't want tests to fail on ArcticDB vPREHISTORIC when numpy 3 is released, for example
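One way to pin the dependency set per release line is sketched below; the mapping shape and the version bounds are illustrative assumptions, not project policy:

```python
# Hypothetical mapping from ArcticDB major version to pip-style
# constraints, so old releases keep installing against the dependency
# generations they were built for (bounds are made up for illustration).
CONSTRAINTS = {
    "1": ["numpy<1.24", "pandas<2"],
    "4": ["numpy<2"],
    "5": ["numpy<3"],
}

def constraints_for(arcticdb_version: str) -> list:
    # Look up constraints by the major version component.
    major = arcticdb_version.split(".", 1)[0]
    return CONSTRAINTS.get(major, [])
```

The resulting list could then be written to a constraints file and passed to `pip install -c`.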

@poodlewars (Collaborator):

Do we think it's valuable to do this on PyPI (where we bundle all our C++ deps), or would doing this only on Conda suffice?

@maxim-morozov (Collaborator):

  1. I don't know if it makes sense to reuse some of the storage tests for installation tests. I think it makes it a bit hard to make sure that we are running only the version of tests from the branch when the release was cut. I think the goal of the installation tests is to make sure that ArcticDB basic functionality is still working when it comes down to the underlying dependencies. It probably can be sufficient to create a basic set of tests like writing and reading from different storages that can be run against any ArcticDB.
  2. This is exactly my point as in 1. With this proposal, we are basically using master tests to run against previous versions of ArcticDB. ArcticDB does not ensure forward compatibility, so it can easily be broken. I think the options are:
    a. Do pure installation tests with very minimal functionality that we expect is not going to be changing, like reading and writing. Potentially have an option to specify the set of versions these tests are supposed to work with.
    b. If we want to reuse our integration tests, we should check out tests from the same branch and version ArcticDB release was cut from.
  3. Yes, this is why we need a separate test suite for installation tests. Also, installation tests should be quick and run multiple times a day as the underlying dependencies can change at any time.
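The "basic set of tests like writing and reading" suggested in point 1 might look like the sketch below; `FakeLib` is a dict-backed stand-in so the example is self-contained, while a real run would pass an Arctic library object instead:

```python
class FakeLib:
    # Stand-in with an Arctic-style write/read surface; read() returns
    # an object carrying the data on a .data attribute.
    def __init__(self):
        self._data = {}

    def write(self, symbol, data):
        self._data[symbol] = data

    def read(self, symbol):
        class Result:
            pass
        result = Result()
        result.data = self._data[symbol]
        return result

def roundtrip_ok(lib, symbol, data):
    # The minimal installation check: write, read back, compare.
    lib.write(symbol, data)
    return lib.read(symbol).data == data
```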

```yaml
name: Run Installation Tests v3

on:
  push:
```
Collaborator:

It looks like we are planning to run those tests on branch push. I think it's much better to run them periodically. We rarely need to update older branches, but we still need to keep the older versions workable.

```yaml
steps:
  - name: Checkout code
    uses: actions/checkout@v3
```
Collaborator:

Why do we need to check out? Is it for checking out tests? If we rely on latest tests with older versions of arcticdb, it might not have the up to date functionality.

@grusev grusev mentioned this pull request Apr 23, 2025
grusev added a commit that referenced this pull request May 13, 2025
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->

#### What does this implement or fix?

Successful execution 5.2.6:
https://github.com/man-group/ArcticDB/actions/runs/14641126753/job/41083591802
5.1.2: https://github.com/man-group/ArcticDB/actions/runs/14637571996
4.5.1:
https://github.com/man-group/ArcticDB/actions/runs/14639124835/job/41077126258
1.6.2:
https://github.com/man-group/ArcticDB/actions/runs/14701046721/job/41250511273

The PR contains a workflow definition to execute tests against an installed
arcticdb. It is a combination of the approaches in:

#2330
#2316

Installation tests are now in a separate folder
(python/installation_tests), not part of tests. They have their own
fixtures, making them independent from the rest of the code base.

The tests are direct copies of the originals, with one modified to use the
v2 API. If there are API changes, each test in the installation set can now
be adapted independently. As the tests run very fast there is no need to
use simulators; instead they run directly against real S3 storage.

The tests are executed by a workflow.

Currently each test is executed against LMDB and real S3. The moto-simulated
version is not available at the moment due to tight coupling with protobufs,
which differ for each version, as well as tight coupling with the rest of the
existing test code.

The workflow has two triggers:

 - manual trigger - allows tests to be executed on demand
 - on schedule - the scheduled execution runs overnight. Each arcticdb
version's tests are executed with a 1-hour offset from the others, because
executing everything at once is likely to generate errors with the real
storages.
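The staggered overnight schedule described above could be generated with a small helper like the one below; the cron format and start hour are illustrative assumptions:

```python
def staggered_crons(versions, start_hour=1):
    # One overnight cron slot per tested arcticdb version, each an hour
    # apart, so runs do not hit the real storages simultaneously.
    return {v: "0 %d * * *" % ((start_hour + i) % 24) for i, v in enumerate(versions)}
```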


#### Any other comments?

#### Checklist

<details>
  <summary>
   Checklist for code changes...
  </summary>
 
- [ ] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
 - [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>

<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->

---------

Co-authored-by: Georgi Rusev <Georgi Rusev>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
vasil-pashov added a commit that referenced this pull request May 27, 2025
commit facc33bead487490322ba9cc973ed86dc9b5c4c6
Merge: bc68ed467 85d51e3b7
Author: Vasil Danielov Pashov <[email protected]>
Date:   Tue May 27 20:15:59 2025 +0300

    Merge branch 'master' into vasil.pashov/coverity-test-existing-code-with-errors

commit bc68ed467842b510bbd7001175cc8eecefc29e1c
Merge: e68ec0146 91a076cc2
Author: Vasil Pashov <[email protected]>
Date:   Tue May 27 20:12:57 2025 +0300

    Merge branch 'master' into vasil.pashov/coverity-test-existing-file

commit 85d51e3b748982dc9121026a4dfcbd9f5a1dc2fb
Author: Alex Owens <[email protected]>
Date:   Tue May 27 10:54:08 2025 +0100

    Bugfix 9209057536: Allow concatenation of uint64 columns with int* columns (#2365)

    #### Reference Issues/PRs
    Fixes
    [9209057536](https://man312219.monday.com/boards/7852509418/pulses/9209057536)

    #### What does this implement or fix?
    Allows concatenating columns of type uint64 with columns of type int*

commit 91a076cc267caf549ff38cb532dd76c5e4e168ba
Author: Alex Owens <[email protected]>
Date:   Fri May 23 17:46:47 2025 +0100

    Enhancement 7992967434: filters and projections ternary operator (#2103)

    #### Reference Issues/PRs
    Implements
    [7992967434](https://man312219.monday.com/boards/7852509418/pulses/7992967434)

    #### What does this implement or fix?
    Implements a ternary operator equivalent to `numpy.where`, primarily for
    projecting new columns based on some condition, although it can also be
    used for filtering. Semantically the same as `left if condition else
    right`, although this Pythonic syntax cannot be made to work due to
    limitations of the language.

    #### Any other comments?
    See `test_ternary.py` for a plethora of examples and the expected
    behaviour in each case.
    Example benchmark output with annotations below.
    The first parameter to all benchmarks is the number of rows (100k for
    all of them right now), so the single-threaded per-row time can be
    calculated by dividing by 100,000.
    e.g. projecting a new column of 100k rows by choosing from 2 dense
    columns (likely a common use case) takes 424us, or just over 4ns per
    row.
    Other parameters are explained for each individual benchmark.
    ```
    Run on (20 X 2918.4 MHz CPU s)
    CPU Caches:
      L1 Data 48 KiB (x10)
      L1 Instruction 32 KiB (x10)
      L2 Unified 1280 KiB (x10)
      L3 Unified 24576 KiB (x1)
    Load Average: 4.23, 6.56, 6.73
    --------------------------------------------------------------------------------------------------
    Benchmark                                                        Time             CPU   Iterations
    --------------------------------------------------------------------------------------------------
    BM_ternary_bitset_bitset/100000                               13.1 us         13.1 us        58099
    # Second arg is whether the boolean argument is true or false, third is whether the arguments are swapped
    BM_ternary_bitset_bool/100000/1/1                             2.00 us         2.00 us       363634
    BM_ternary_bitset_bool/100000/1/0                             7.43 us         7.43 us       101700
    BM_ternary_bitset_bool/100000/0/1                             7.28 us         7.28 us        88907
    BM_ternary_bitset_bool/100000/0/0                             2.45 us         2.45 us       307832
    BM_ternary_numeric_dense_col_dense_col/100000                  424 us          424 us         1276
    BM_ternary_numeric_sparse_col_sparse_col/100000               3548 us         3548 us          185
    # Second arg is whether the arguments are swapped
    BM_ternary_numeric_dense_col_sparse_col/100000/1              2555 us         2555 us          258
    BM_ternary_numeric_dense_col_sparse_col/100000/0              2800 us         2800 us          262
    # Second arg is the number of unique strings in each string column, third is whether the columns have the same string pool or not
    BM_ternary_string_dense_col_dense_col/100000/100000/1          438 us          438 us         1534
    BM_ternary_string_dense_col_dense_col/100000/100000/0        16257 us        16258 us           43
    BM_ternary_string_dense_col_dense_col/100000/2/1               441 us          441 us         1603
    BM_ternary_string_dense_col_dense_col/100000/2/0              4219 us         4219 us          186
    BM_ternary_string_sparse_col_sparse_col/100000/100000/1       3854 us         3854 us          191
    BM_ternary_string_sparse_col_sparse_col/100000/100000/0      10753 us        10754 us           67
    BM_ternary_string_sparse_col_sparse_col/100000/2/1            3655 us         3655 us          183
    BM_ternary_string_sparse_col_sparse_col/100000/2/0            4592 us         4592 us          123
    BM_ternary_string_dense_col_sparse_col/100000/100000/1        2957 us         2957 us          236
    BM_ternary_string_dense_col_sparse_col/100000/100000/0       13980 us        13980 us           50
    BM_ternary_string_dense_col_sparse_col/100000/2/1             2967 us         2966 us          237
    BM_ternary_string_dense_col_sparse_col/100000/2/0             5179 us         5179 us          160
    # Second arg  is whether the arguments are swapped
    BM_ternary_numeric_dense_col_val/100000/1                      360 us          359 us         1871
    BM_ternary_numeric_dense_col_val/100000/0                      388 us          388 us         1692
    BM_ternary_numeric_sparse_col_val/100000/1                    2244 us         2244 us          292
    BM_ternary_numeric_sparse_col_val/100000/0                    2385 us         2385 us          283
    # Second arg  is whether the arguments are swapped, third is the number of unique strings in the column
    BM_ternary_string_dense_col_val/100000/1/100000               8259 us         8258 us           82
    BM_ternary_string_dense_col_val/100000/0/100000               7683 us         7683 us           93
    BM_ternary_string_dense_col_val/100000/1/2                    2578 us         2578 us          261
    BM_ternary_string_dense_col_val/100000/0/2                    2385 us         2385 us          297
    BM_ternary_string_sparse_col_val/100000/1/100000              6302 us         6302 us          129
    BM_ternary_string_sparse_col_val/100000/0/100000              5792 us         5792 us          115
    BM_ternary_string_sparse_col_val/100000/1/2                   2903 us         2903 us          249
    BM_ternary_string_sparse_col_val/100000/0/2                   3095 us         3095 us          232
    # Second arg  is whether the arguments are swapped
    BM_ternary_numeric_dense_col_empty/100000/1                   1269 us         1269 us          584
    BM_ternary_numeric_dense_col_empty/100000/0                   1354 us         1354 us          512
    BM_ternary_numeric_sparse_col_empty/100000/1                  1363 us         1363 us          572
    BM_ternary_numeric_sparse_col_empty/100000/0                  1374 us         1374 us          484
    # Second arg  is whether the arguments are swapped, third is the number of unique strings in the column
    BM_ternary_string_dense_col_empty/100000/1/100000             1217 us         1217 us          587
    BM_ternary_string_dense_col_empty/100000/0/100000             1343 us         1343 us          577
    BM_ternary_string_dense_col_empty/100000/1/2                  1287 us         1287 us          574
    BM_ternary_string_dense_col_empty/100000/0/2                  1363 us         1363 us          518
    BM_ternary_string_sparse_col_empty/100000/1/100000            1413 us         1413 us          524
    BM_ternary_string_sparse_col_empty/100000/0/100000            1343 us         1343 us          517
    BM_ternary_string_sparse_col_empty/100000/1/2                 1293 us         1293 us          540
    BM_ternary_string_sparse_col_empty/100000/0/2                 1235 us         1235 us          480
    BM_ternary_numeric_val_val/100000                              368 us          368 us         2039
    BM_ternary_string_val_val/100000                               376 us          376 us         1862
    # Second arg  is whether the arguments are swapped
    BM_ternary_numeric_val_empty/100000/1                         40.7 us         40.7 us        16491
    BM_ternary_numeric_val_empty/100000/0                         36.7 us         36.7 us        17836
    BM_ternary_string_val_empty/100000/1                          40.8 us         40.8 us        17892
    BM_ternary_string_val_empty/100000/0                          58.2 us         58.2 us        13825
    # Second arg is whether the left argument is true or false, third is whether the right argument is true or false
    BM_ternary_bool_bool/100000/1/1                               1.43 us         1.43 us       518204
    BM_ternary_bool_bool/100000/1/0                               1.99 us         1.99 us       378598
    BM_ternary_bool_bool/100000/0/1                               4.52 us         4.52 us       157505
    BM_ternary_bool_bool/100000/0/0                              0.020 us        0.020 us     37060921
    ```

commit 3c059f4d4030dc73594f277d8754918c698a2969
Author: Phoebus Mak <[email protected]>
Date:   Thu May 22 09:50:29 2025 +0100

    Fix gcp lib unreachable after making it read only (#2349)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->
    https://man312219.monday.com/boards/7852509418/pulses/8985074856

    #### What does this implement or fix?
    `create_store_from_lib_config` took protobuf settings only.
    GCP settings are stored natively, unlike the settings of other storages.
    So when a new store was created with the above function, the GCP settings
    were not passed to it, and the SDK fell back to default but incorrect
    settings, causing errors.

    S3 and GCPXML native settings are given default values to avoid
    uninitialized values being used in the test

    #### Any other comments?
    Test in the CI:
    https://github.com/man-group/ArcticDB/actions/runs/15164054821/job/42638155043
    ```
    test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-True]
    [gw0] [ 95%] PASSED tests/integration/arcticdb/version_store/test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-True]
    test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-False]
    [gw0] [ 95%] PASSED tests/integration/arcticdb/version_store/test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-False]
    ```
    (Other unrelated tests failed in the flaky real storage CI)
    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

commit 9d98a4436e376fa1623af92f23153cde5b68a68b
Author: Alex Owens <[email protected]>
Date:   Wed May 21 18:03:28 2025 +0100

    Fix multiindex series (#2363)

    #### What does this implement or fix?
    Fixes roundtripping of multiindexed Series with timestamps as the first
    level and strings as the second level.
    Broken by #2142

    ---------

    Co-authored-by: Alex Owens <[email protected]>

commit c3c7c2ac5d7d98d16305e6914713f03454d30a57
Author: Alex Owens <[email protected]>
Date:   Wed May 21 16:42:34 2025 +0100

    Docs 8975554293: Add concat demo notebook (#2361)

    #### Reference Issues/PRs
    Completes
    [8975554293](https://man312219.monday.com/boards/7852509418/pulses/8975554293)

    #### What does this implement or fix?
    Adds a notebook demonstrating the new `concat` functionality added in
    https://github.com/man-group/ArcticDB/pull/2142

    ---------

    Co-authored-by: Alex Owens <[email protected]>

commit 17ea0e49deba0a3a1b8e6267e9516b14ea34b3ef
Author: grusev <[email protected]>
Date:   Wed May 21 18:31:23 2025 +0300

    Update installation_tests.yml with 5.3 and 5.4 final versions (#2362)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->

    #### What does this implement or fix?

    #### Any other comments?

    Moved 5.2.6 to a different timeslot, to rule out the failures being
    caused by the timeslot. A manual execution shows the problem with
    5.2.6 is most probably persistent:
    https://github.com/man-group/ArcticDB/actions/runs/15139549472/job/42559651096

    Added:
     5.3.4 https://github.com/man-group/ArcticDB/actions/runs/15133764164/
     5.4.1 https://github.com/man-group/ArcticDB/actions/runs/15133923361

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

commit e68ec014683d00f095e4efbe5d72b81b7509299d
Author: Vasil Pashov <[email protected]>
Date:   Wed May 21 11:38:45 2025 +0300

    Temporary disable tests

commit e3afff2115d4f0038d13a5327a8c7b7779552a99
Merge: bdbc17028 424cd56e2
Author: Vasil Pashov <[email protected]>
Date:   Wed May 21 11:17:22 2025 +0300

    Merge branch 'master' into vasil.pashov/coverity-test-existing-file

commit 424cd56e295afafd64444420b92fcf89a82dd1ea
Author: grusev <[email protected]>
Date:   Tue May 20 11:09:42 2025 +0300

    Schedule S3 tests and fix STS to run only against AWS S3 (#2356)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->

    #### What does this implement or fix?

    Scheduled, for now, to run twice a week

    Contains also a couple of other fixes to the workflow:
    - seeding tests were previously not executed, due to a change of a
    workflow parameter from boolean to choice for GCP tests. Now seeding
    tests are executed.
    - STS role creation was executed for GCP tests, which was unnecessary.
    Now it is executed only with AWS S3.
    - persistent-test cleaning had a problem with the context, resulting
    in a crash when loading storage_tests.py. This is now fixed to allow
    proper loading of mark.py in different contexts.

    Results:
    https://github.com/man-group/ArcticDB/actions/runs/15061574677/job/42337724260
    (NOTE: the failures in the above run are because
    https://github.com/man-group/ArcticDB/pull/2353 is not part of the
    current PR. Once it gets merged, S3 tests will run without problems)

    #### Any other comments?

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

    ---------

    Co-authored-by: Georgi Rusev <Georgi Rusev>

commit a158b0c2e684c9389691744c001192ce94ddc79d
Author: Alex Owens <[email protected]>
Date:   Mon May 19 13:28:51 2025 +0100

    Bugfix 9123099670: fix resampling of old updated data (#2351)

    #### Reference Issues/PRs
    Fixes
    [9123099670](https://man312219.monday.com/boards/7852509418/views/168855452/pulses/9123099670)

    #### What does this implement or fix?
    Fixes three separate resampling bugs:

    1. Old versions of `update` (changed sometime between `4.1.0` and
    `4.4.0`, I haven't pinned down exactly where) had a behaviour in which
    the `end_index` value in the data key of the segment overlapping with
    the start of the date range provided to the `update` call was set to the
    first value of the date range in the `update` call. For all other
    modification methods, this is set to 1 nanosecond larger than the last
    index value in the contained segment. Resampling assumed this to be the
    case, and had an assertion verifying it. Relaxing this assertion is
    sufficient to fix the issue.
    2. Providing a `date_range` argument with a resample where the provided
    date range did not overlap with the timerange covered by the index of
    the symbol led to trying to reserve a vector with a negative size. This
    now correctly returns an empty result.
    3. Previously, checks that a symbol being resampled had a timestamp
    index occurred after some operations which also require this to be true,
    which could lead to the same vector reserve issue above. It is now
    checked in advance, and a suitable exception raised.

commit 9edc74a89102b4ab66fbd7911a31322425dfcacc
Author: grusev <[email protected]>
Date:   Mon May 19 12:54:07 2025 +0300

    nfs backed tests for v1 API (#2350)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->

    #### What does this implement or fix?

    arctic_* fixtures (the v2 API) are already covered with NFS-backed S3
    tests. What is needed now is to add tests for the v1 API fixtures too.

    New Fixtures:

    nfs_backed_s3_store_factory
    nfs_backed_s3_version_store_v1
    nfs_backed_s3_version_store_v2
    nfs_backed_s3_version_store_dynamic_schema_v1
    nfs_backed_s3_version_store_dynamic_schema_v2
    nfs_backed_s3_version_store

    Added to:

    object_store_factory
      s3_store_factory -> nfs_backed_s3_store_factory
    object_and_mem_and_lmdb_version_store
      s3_version_store_v1 -> nfs_backed_s3_version_store_v1
      s3_version_store_v2 -> nfs_backed_s3_version_store_v2
    object_and_mem_and_lmdb_version_store_dynamic_schema
      s3_version_store_dynamic_schema_v1 -> nfs_backed_s3_version_store_dynamic_schema_v1
      s3_version_store_dynamic_schema_v2 -> nfs_backed_s3_version_store_dynamic_schema_v2

    #### Any other comments?

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

    ---------

    Co-authored-by: Georgi Rusev <Georgi Rusev>

commit 67d2bbe530f96a0aa5412f479e123da480ba2d99
Author: Alex Owens <[email protected]>
Date:   Fri May 16 15:20:37 2025 +0100

    Enhancement 8277989680: symbol concatenation poc (#2142)

    #### Reference Issues/PRs
    8277989680

    #### What does this implement or fix?
    Implements symbol concatenation. Inner and outer joins over columns both
    supported. Expected usage:
    ```
    # Read requests can contain usual as_of, date_range, columns, etc arguments
    lazy_dfs = lib.read_batch([read_request_1, read_request_2, ...])
    # Potentially apply some processing to all or individual constituent lazy dataframes here, that will be applied before the join
    lazy_dfs = lazy_dfs[lazy_dfs["col"].notnull()]
    # Join here
    lazy_df = adb.concat(lazy_dfs)
    # Perform more processing if desired
    lazy_df = lazy_df.resample("15min").agg({"col": "mean"})
    # Collect result
    res = lazy_df.collect()
    # res contains a list of VersionedItems from the constituent symbols that went into the join with data=None, and a data member with the joined Series/DataFrame
    ```
    See `test_symbol_concatenation.py` for thorough examples of how the API
    works.
    For outer joins, if a column is not present in one of the input symbols,
    then the same type-specific behaviour as dynamic schema is used to
    backfill the missing values.
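    As an illustration of that type-specific backfill (a hypothetical
    sketch of the rule in Python, not ArcticDB's actual implementation):

    ```python
    import math

    def backfill_value(dtype: str):
        """Hypothetical fill value per column type when a column is missing
        from one input of an outer join, mirroring dynamic schema backfill."""
        if dtype.startswith(("INT", "UINT")):
            return 0          # integer columns backfill with zeros
        if dtype.startswith("FLOAT"):
            return math.nan   # float columns backfill with NaN
        return None           # string columns backfill with None
    ```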
    Not all symbols can be concatenated together. The following will throw
    exceptions if concatenation is attempted:

    - a Series with a DataFrame
    - Different index types, including multiindexes with different numbers
    of levels
    - Incompatible column types, e.g. if `col` has type `INT64` in one
    symbol and is a string column in another. This only applies if the
    column would be in the result, which is always the case for all
    columns with an outer join, but may not always be for inner joins.

    Where possible, the implementation is permissive about what can be
    joined, producing an output that is as sensible as possible:

    - Joining two or more Series with different names that are otherwise
    compatible will produce a Series with no name
    - Joining two or more timeseries where the indexes have different names
    will produce a timeseries with an unnamed index
    - Joining two or more timeseries where the indexes have different
    timezones will produce a timeseries with a UTC index
    - Joining two or more multiindexed Series/DataFrames where the levels
    have compatible types but different names will produce a multiindexed
    Series/DataFrame with unnamed levels where they differed between some of
    the inputs.
    - Joining two or more Series/DataFrames that all have `RangeIndex`. If
    the index `step` does not match between all of the inputs, then the
    output will have a `RangeIndex` with `start=0` and `step=1`. **This is
    different behaviour to Pandas, which converts to an Int64 index in this
    case. For this reason, a warning is logged when this happens.**
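    The `RangeIndex` rule above can be sketched as follows (hypothetical
    helper names; the real logic lives in the C++ join implementation):

    ```python
    def resolve_range_index(indexes):
        """indexes: list of (start, step) pairs, one per input being joined.
        If every input shares the same step, keep the first input's
        start/step; otherwise fall back to start=0, step=1 (unlike pandas,
        which would switch to an Int64 index)."""
        steps = {step for _, step in indexes}
        if len(steps) == 1:
            return indexes[0]
        return (0, 1)
    ```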

    The only known major limitation is that all of the symbols being joined
    together (after any pre-join processing) must fit into memory. Relaxing
    this constraint would require much more sophisticated query planning
    than we currently support, in which all of the clauses both for
    individual symbols pre-join, the join, and any post-join clauses, are
    all taken into account when scheduling both IO and individual processing
    tasks.

commit c1c7a8cff3193dcf4aefee268cd3feea01c68bd9
Author: grusev <[email protected]>
Date:   Fri May 16 13:55:12 2025 +0300

    Patch for Real S3 library names (#2353)


    #### What does this implement or fix?

    Currently we create library names which are too long for real S3. This
    is a patch for the tests until the underlying bug is addressed.

    Manually triggered run:
    https://github.com/man-group/ArcticDB/actions/runs/15013824867


    ---------

    Co-authored-by: Georgi Rusev <Georgi Rusev>

commit bb65a85ab82dd7fec5297b258956545f8b4adea7
Author: Alex Owens <[email protected]>
Date:   Fri May 16 11:41:18 2025 +0100

    Add resolve_defaults back in as a static method of NativeVersionStore (#2358)

    #### Reference Issues/PRs
    This was removed in #2345, but it is needed by at least some internal
    tests, and its removal technically constitutes an API break (although
    we don't expect anybody to be using it).

commit e78758a7fe5fbb02085dcfae01218903d6dad6d9
Author: grusev <[email protected]>
Date:   Fri May 16 13:25:24 2025 +0300

    Installation Tests Workflow Fixes (#2354)


    #### What does this implement or fix?

    Fixes a failure when the job is triggered on schedule: the string
    contained extra single quotes. The order of two steps is also changed
    for the scheduling-specific use case.

    The workflow dispatch inputs are changed to simplify execution,
    leaving some parts for future enhancement, i.e. the selection of an
    exact os-python-repo combination, which really needs a single flow of
    steps rather than a matrix.

    S3 tests are also enabled to run alongside the LMDB tests by default.


    ---------

    Co-authored-by: Georgi Rusev <Georgi Rusev>

commit 9e544da9d823c3a4e76b256b741925af52a20742
Author: grusev <[email protected]>
Date:   Tue May 13 13:45:53 2025 +0300

    Installation tests v4 (#2339)


    #### What does this implement or fix?

    Successful execution 5.2.6:
    https://github.com/man-group/ArcticDB/actions/runs/14641126753/job/41083591802
    5.1.2: https://github.com/man-group/ArcticDB/actions/runs/14637571996
    4.5.1:
    https://github.com/man-group/ArcticDB/actions/runs/14639124835/job/41077126258
    1.6.2:
    https://github.com/man-group/ArcticDB/actions/runs/14701046721/job/41250511273

    The PR contains a workflow definition to execute tests against an
    installed arcticdb. It is a combination of the approaches from:

    https://github.com/man-group/ArcticDB/pull/2330
    https://github.com/man-group/ArcticDB/pull/2316

    Installation tests now live in a separate folder
    (python/installation_tests) rather than under the main test suite.
    They have their own fixtures, making them independent of the rest of
    the code base.

    The tests are direct copies of the originals, with one modified to
    use the v2 API. If the API changes, each test in the installation set
    can now be adapted independently. As the tests run very fast, there
    is no need to use simulators; real S3 storage is used directly.

    The tests are executed by a workflow.

    Currently each test is executed against LMDB and real S3. A
    moto-simulated version is not available at the moment, due to tight
    coupling with the protobufs, which differ for each version, as well
    as tight coupling with the existing test code.

    The workflow has two triggers:

    - manual trigger - allowing tests to be executed manually on demand
    - on schedule - the scheduled execution runs overnight. Each arcticdb
    version's tests are executed within 1 hour of the others, because
    executing them all at once is likely to generate errors with the real
    storages.


    ---------

    Co-authored-by: Georgi Rusev <Georgi Rusev>
    Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

commit 2612fb45f15350dc483ddde1c8d43c2d6a02731b
Author: grusev <[email protected]>
Date:   Mon May 12 15:39:20 2025 +0300

    Asv v2 s3 tests (Refactored) (#2249)

    #### Reference Issues/PRs

    Contains refactored framework for setting up shared storages + tests for
    AWS S3 storage

    Merged 3 PRs into one:
      - https://github.com/man-group/ArcticDB/pull/2185
      - https://github.com/man-group/ArcticDB/pull/2227
      - https://github.com/man-group/ArcticDB/pull/2204

    Important: the benchmark test checks on this PR cannot run
    successfully, so do not take them as a criterion. All tests need to
    be run manually. Here are runs from 27 March:
    LMDB set:
    https://github.com/man-group/ArcticDB/actions/runs/14100376040/job/39495398374
    Real set:
    https://github.com/man-group/ArcticDB/actions/runs/14100497273/job/39495728734


    Co-authored-by: Georgi Rusev <Georgi Rusev>

commit 3c2fe145cad45797356a4ec5fbd42e4dac57681a
Author: William Dealtry <[email protected]>
Date:   Mon May 12 09:57:15 2025 +0100

    size_t size in MacOS

commit bb54de8879ab57c37093a62c5282e405fc9a834b
Author: William Dealtry <[email protected]>
Date:   Mon May 12 09:03:04 2025 +0100

    resolve defaults is a free function

commit e973f8dbd898aedc747bc232e022c9a1137d882c
Author: willdealtry <[email protected]>
Date:   Wed Apr 16 14:49:46 2025 +0100

    Fix up file operations

commit af1a171eab284902db4333946b732de7d9ec2b18
Author: Phoebus Mak <[email protected]>
Date:   Mon May 12 10:00:32 2025 +0100

    Disable s3 checksumming (#2337)

    #### Reference Issues/PRs
    https://github.com/man-group/ArcticDB/issues/2251
    #### What does this implement or fix?
    Disable s3 checksumming by setting environment variable in the wheel.

    #### Any other comments?
    This will also unblock the upgrade of `aws-sdk-cpp` on vcpkg.
    The upgrade will not be made in this PR

    One of the newly added tests needs to be skipped, as the `conda` CI
    has `aws-sdk-cpp` pinned at a non-s3-checksumming version due to the
    `libarrow` pin.
    `environment-dev.yml` doesn't align with its counterpart in the
    feedstock. Therefore the new version of `aws-sdk-cpp` is only used in
    the feedstock (and thus the release wheel), but not in the local and
    CI builds here. This will be addressed in a separate ticket.

    [Commit](https://github.com/man-group/ArcticDB/pull/2337/commits/245a02cd455e39fb8f976301ccd5409e6ae88b13)
    to remove the `libarrow` pin so that a newer `aws-sdk-cpp`, which
    supports s3 checksumming, is used in conda. It is there to verify the
    change with the newly added test. The
    [test](https://github.com/man-group/ArcticDB/actions/runs/14732394443/job/41349695905)
    is successful.


commit b808afac25bed84595b874f28b6b3ce2407fbd0c
Author: grusev <[email protected]>
Date:   Fri May 9 15:46:17 2025 +0300

    Delete STS roles regularly  (#2344)


    #### What does this implement or fix?

    Due to the limit on the number of STS roles, we need to regularly
    clean up roles that failed to be deleted. The PR contains a scheduled
    job that does this every Saturday. The Python script can also be
    executed at any time and will delete only roles created before today,
    leaving all currently running jobs unaffected.

    As roles cannot be guaranteed to be cleaned up after test execution,
    due to many factors, we should remove them on a regular basis, and
    this is perhaps the quickest and most reliable approach.
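    The age-based selection can be sketched like this (hypothetical
    filtering logic over boto3-shaped role records, not the actual
    cleanup script):

    ```python
    from datetime import datetime, timezone

    def roles_to_delete(roles, today=None):
        """Pick roles created before today, leaving roles that may belong
        to currently running jobs untouched. `roles` follows the shape of
        the entries returned by boto3's iam list_roles."""
        today = today or datetime.now(timezone.utc).date()
        return [r["RoleName"] for r in roles if r["CreateDate"].date() < today]
    ```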


    ---------

    Co-authored-by: Georgi Rusev <Georgi Rusev>

commit 0136f4ca52559e0640dc1b7518d6a8b0773ed3a8
Author: Ognyan Stoimenov <[email protected]>
Date:   Fri May 9 14:36:54 2025 +0300

    Fix permissions for the automatic docs building (#2347)


    #### What does this implement or fix?
    Fixes failures when building the docs automatically on release like:
    https://github.com/man-group/ArcticDB/actions/runs/14832306883

commit 652d968561d473599e90508078005c4fd00a1ba4
Author: Phoebus Mak <[email protected]>
Date:   Sat May 3 02:03:44 2025 +0100

    Query Stat framework v3 (#2304)


    #### What does this implement or fix?
    New query stat implementation whose schema is static.
    The feature of linking arcticdb API calls to storage operations has
    been dropped; now only storage operation stats are logged. Since the
    schema of the stats is therefore hardcoded, and only summation of the
    stats is needed, one static object with a number of atomic ints is
    enough to do the job.
    No fancy map, nor modification of the folly executor.

    #### Any other comments?
    Sample output:
    ```
    { // Stats
            "SYMBOL_LIST":  // std::array<std::array<OpStats, NUMBER_OF_TASK_TYPES>, NUMBER_OF_KEYS>
             {
                "storage_ops": {
                    "S3_ListObjectsV2":
                    { // OpStats
                        "result_count": 1,
                        "total_time_ms": 34
                    }
                }
            }
        }
    ```
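    In Python terms, the static stats object could look something like
    this (a sketch only; the C++ version uses plain atomic ints rather
    than a lock):

    ```python
    from collections import defaultdict
    from threading import Lock

    class QueryStats:
        """One static object summing per-storage-op counts and timings,
        matching the hardcoded schema of the sample output above."""
        def __init__(self):
            self._lock = Lock()
            self._ops = defaultdict(lambda: {"result_count": 0, "total_time_ms": 0})

        def record(self, op, result_count, time_ms):
            with self._lock:  # stands in for the atomic increments
                stats = self._ops[op]
                stats["result_count"] += result_count
                stats["total_time_ms"] += time_ms

        def snapshot(self):
            with self._lock:
                return {op: dict(s) for op, s in self._ops.items()}
    ```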


commit 9b93303adf8d5c436ae267be4d950fc5e55139de
Author: Vasil Danielov Pashov <[email protected]>
Date:   Fri May 2 17:29:18 2025 +0300

    Hold the GIL when incrementing None's refcount to prevent race conditions when there are multiple Python threads (#2334)

    #### Reference Issues/PRs
    None is a global static object in Python which is also refcounted.
    When ArcticDB creates `None` objects it must increase their refcount,
    and it must hold the GIL while doing so. Currently we don't acquire
    the GIL when we do this; we only hold a SpinLock protecting other
    ArcticDB threads from racing on the refcount. With this change we add
    an atomic variable to the type handler data which accumulates the
    refcount increments. Then, at the end of the operation, when we
    reacquire the GIL, we increase the refcount in one go. The same is
    done for the NaN refcount; note that we don't strictly need the GIL
    to increase NaN's refcount, as we create it internally and don't hand
    it to Python until the read operation is done. Currently only read
    operations need to work with the `None` object.

    `apply_global_refcounts` must be called at the very end, before
    passing the dataframe to Python, to prevent something raising an
    exception after the refcount is applied but before Python receives
    the data. Increasing None's refcount but never decreasing it doesn't
    seem to be fatal, but we're trying to be good citizens. The best
    place for that is `adapt_read_df` or `adapt_read_dfs`, as they are
    called at the end of all read functions. The code is changed so that
    the type handler data is always created in the python bindings file,
    as that is easier to track.
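    A Python analogue of the accumulate-then-apply pattern described
    above (illustrative only; the real fix is in the C++ type handlers):

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def fill_column(chunk):
        """Worker: tallies the None placements it owes locally, instead of
        touching the shared refcount from the worker thread."""
        return sum(1 for value in chunk if value is None)

    def read_columns(chunks):
        with ThreadPoolExecutor() as pool:
            pending_increments = sum(pool.map(fill_column, chunks))
        # the apply_global_refcounts equivalent: applied once, at the very
        # end, after everything that could raise has completed
        return pending_increments
    ```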

    ---------

    Co-authored-by: Vasil Pashov <[email protected]>

commit d4b40e287863960d608d52131471a88a435bf844
Author: Phoebus Mak <[email protected]>
Date:   Fri May 2 11:13:30 2025 +0100

    Update docs for sts ca issue (#2265)


    #### What does this implement or fix?
    Clarify when the workaround for the STS CA issue is needed


commit a9d0e41e47c40a34e2e146a4297b5c638375fe85
Author: Phoebus Mak <[email protected]>
Date:   Tue Apr 29 17:44:08 2025 +0100

    Skip azurite api check (#2288)


    #### What does this implement or fix?
    The API check in Azurite has caused pain for local tests, as the
    azurite version needs to keep up with the SDK version. We only use
    very simple APIs, so it is safe to skip the check.


commit 550d3e7c29a5f9d67a0e993bbabc1cbf88295ef1
Author: grusev <[email protected]>
Date:   Thu Apr 24 17:45:21 2025 +0300

    initial version fix for GCP (#2326)


    ---------

    Co-authored-by: Georgi Rusev <Georgi Rusev>

commit 41a2086963e018ffe0ac90e6fea72d3577d463f3
Author: Alex Owens <[email protected]>
Date:   Wed Apr 23 12:31:26 2025 +0100

    Timeseries defrag function (#2319)

    #### What does this implement or fix?
    Adds a (private) function to defragment timeseries data. See big list of
    caveats in code comments for limitations

commit 61b00e99ce7861a0fd767572be0d58600c065b53
Author: Vasil Danielov Pashov <[email protected]>
Date:   Thu Apr 17 16:04:41 2025 +0300

    Fix race conditions on the None object refcount during a multithreaded read (#2320)


    #### What does this implement or fix?
    **Bugfix**
    Columns are handled in multiple threads during read calls. String
    columns can contain `None` values. `None` is a global static
    refcounted object, and the refcount is not atomic. When ArcticDB
    places `None` objects in columns it must increment the refcount.
    Currently None objects are allocated only via type handlers. ArcticDB
    has a global spin-lock that is shared by all type handlers. The bug
    is caused by
    [this
    line](https://github.com/man-group/ArcticDB/blob/300e121e1be47ecfbabba78f077851a9c3b0772c/cpp/arcticdb/python/python_utils.hpp#L117):
    the spin-lock is wrapped in a `std::lock_guard`, but there is also a
    manual call to `unlock`. When `unlock` is called, another thread can
    take the lock and start calling `Py_INCREF(Py_None)`, but when the
    function exits the `std::lock_guard` calls unlock again, allowing yet
    another thread to start calling `Py_INCREF(Py_None)` in parallel.
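    The bug can be reproduced in miniature in Python, where releasing an
    already-released lock raises instead of silently admitting a second
    thread (an analogy only; the original is C++ `std::lock_guard`):

    ```python
    import threading

    _lock = threading.Lock()

    def buggy_critical_section():
        # `with` acquires and releases on exit, like std::lock_guard...
        with _lock:
            _lock.release()  # ...but the stray manual unlock lets another
                             # thread enter while we are still "inside"
        # on exit the guard releases again -> RuntimeError in Python

    try:
        buggy_critical_section()
        double_release_detected = False
    except RuntimeError:
        double_release_detected = True
    ```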

    **Refactoring**
    - Remove GIL-safe py none. It was created because pybind11 wraps
    `Py_None` in an object and calls `Py_INCREF(Py_None)`, and we must
    hold the GIL when incrementing the refcount. The wrapper we had was
    used only to get the pointer to the `Py_None` object, and we don't
    need pybind11 for that; using the C API we can get `Py_None`, which
    is a global object, directly.
    - Add a function to check whether a Python object is `None`.
    - Remove uses of `py::none{}` in places where we don't hold the GIL
    (most of those were just to get the `Py_None` object that's inside
    `py::none`).


    ---------

    Co-authored-by: Vasil Pashov <[email protected]>

commit 396757028afbd460fd6325fd2403636ed8482d56
Author: Julien Jerphanion <[email protected]>
Date:   Thu Apr 17 11:39:55 2025 +0200

    Support MSVC 19.29 (#2332)

    Signed-off-by: Julien Jerphanion <[email protected]>

commit b89fc53dbd7cd1eee783fed1fba7b401d69b6ffd
Author: Georgi Petrov <[email protected]>
Date:   Wed Apr 16 15:35:56 2025 +0300

    Increase tolerance to arithmetic mismatches with Pandas with floats (#2333)

    #### Reference Issues/PRs

    https://github.com/man-group/ArcticDB/actions/runs/14487537861/job/40636907727?pr=2331

    #### What does this implement or fix?
    To resolve this type of flakiness:

    ``` python
    FAILED tests/hypothesis/arcticdb/test_resample.py::test_resample - AssertionError: Series are different

    Series values are different (100.0 %)
    [index]: [1969-12-31T23:59:01.000000000]
    [left]:  [-1706666.6666666667]
    [right]: [-1706325.3333333333]
    At positional index 0, first diff: -1706666.6666666667 != -1706325.3333333333
    Falsifying example: test_resample(
        df=
                                           col_float              col_int  col_uint
            1970-01-01 00:00:00.000000000        0.0  9223372036849590785         0
            1970-01-01 00:00:00.000000001        0.0                  512         0
            1970-01-01 00:00:00.000000002        0.0 -9223372036854710785         0
        ,
        rule='1min',
        origin='start',
        offset='1s',
    )

    You can reproduce this example by temporarily adding @reproduce_failure('6.72.4', b'AXicY2RgYGQAYxCCUEwMyAAkzVD/Hwg2PGIEq2ACqgASjBDR/0yMMFUwAAB9FAui') as a decorator on your test case
    ```
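    The mismatch comes from floating-point precision: the `col_int`
    values in the falsifying example sit just below `2^63`, where
    `double` can only represent every 1024th integer, so accumulating in
    a different order than Pandas yields slightly different means. A
    minimal sketch of the representability gap (the constant is taken
    from the falsifying example above):

    ```cpp
    #include <cassert>
    #include <cstdint>

    int main() {
        // One of the col_int values from the falsifying example.
        std::int64_t a = 9223372036849590785LL;

        // Representable doubles just below 2^63 are 1024 apart, so the
        // conversion rounds to the nearest representable value.
        double d = static_cast<double>(a);

        // The round-trip does not recover the original integer.
        assert(static_cast<std::int64_t>(d) != a);
        return 0;
    }
    ```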

    #### Any other comments?
    A similar fix was done here:
    https://github.com/man-group/ArcticDB/commit/fe9de294580526e921102fbdedda736f20596fc7

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

commit 30f4c48db0d742898f629d129b5d1caa83091662
Author: Alex Seaton <[email protected]>
Date:   Wed Apr 16 13:08:30 2025 +0100

    Symbol sizes API (#2266)

    Add Python APIs to get sizes of symbols, in a new `AdminTools` class.
    Add documentation for this feature to our website.

    You can access the new tools with:

    ```
    lib: Library
    lib.admin_tools(): AdminTools
    ```

    Refactor the existing symbol scanning APIs to a visitor pattern so they
    can all share as much of the implementation as possible.

    Monday: 8560764974

commit 6b3c593924808d33a39e275f921f613f77139d06
Author: Georgi Petrov <[email protected]>
Date:   Wed Apr 16 14:32:57 2025 +0300

    Prevent exceptions in ReliableStorageLockGuard destructor (#2331)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->

    #### What does this implement or fix?
    Exceptions (storage-related or otherwise) can sometimes occur when
    releasing the lock. This PR catches all exceptions in the
    `ReliableStorageLockGuard` destructor, mainly to prevent unnecessary
    seg faults in enterprise.

    #### Any other comments?

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

commit aa585fc0a5ae60f61f1752d78614e0951047d21e
Author: Julien Jerphanion <[email protected]>
Date:   Wed Apr 16 10:10:11 2025 +0200

    conda-build: Extend development environment for Windows (#2328)

    #### Reference Issues/PRs

    Extracted from https://github.com/man-group/ArcticDB/pull/2252.

    #### What does this implement or fix?

    #### Any other comments?

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

    Signed-off-by: Julien Jerphanion <[email protected]>

commit 42091dbe1ea4b7b827cad4f53b2ef099eb43b4fb
Author: Ognyan Stoimenov <[email protected]>
Date:   Tue Apr 15 18:13:47 2025 +0300

    Fix pr getting action (#2323)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->

    #### What does this implement or fix?
    https://github.com/VanOns/get-merged-pull-requests-action was updated
    to fix some issues but changed its API:
    * Accommodate the new API
    * Remove the previous workaround (now fixed)
    * Pin the action to 1.3.0 so no such breaks happen in the future
    * The changelog generator was not skipping release candidates when
    comparing versions; fixed now
    * Fix docs building permission

    #### Any other comments?

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

commit 311c1bf8099a491bf1dd85c09e83d640f9d6ce74
Author: Julien Jerphanion <[email protected]>
Date:   Tue Apr 15 17:13:05 2025 +0200

    ci: Benchmark workflow adaptations (#2327)

    #### Reference Issues/PRs

    #### What does this implement or fix?

    Fixes the import error, working around
    https://github.com/airspeed-velocity/asv/issues/1465.

    #### Any other comments?

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

    Signed-off-by: Julien Jerphanion <[email protected]>

commit 7b37536b67b8410d2d890b8ee8bf38b05181aa61
Author: Vasil Danielov Pashov <[email protected]>
Date:   Tue Apr 15 11:25:03 2025 +0300

    Refactor to_atom and to_ref to properly use forwarding references (#2321)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->

    #### What does this implement or fix?
    This solves two problems:
    - Code duplication. `to_atom` had three overloads (value / lvalue ref
    / rvalue ref) for the same thing; forwarding references were invented
    to solve exactly this problem.
    - Unnecessary copies. `to_atom` had an overload taking `VariantKey`
    by value. At some point some APIs changed and started returning
    `AtomKey` instead of `VariantKey`, and due to the excessive use of
    `auto` nobody noticed the difference. Thus we ended up calling
    `to_atom` on an atom key; that worked because a `VariantKey` can be
    constructed implicitly from an `AtomKey`, so we constructed a
    `VariantKey` from an `AtomKey` only to extract the `AtomKey` back out
    of it. Forwarding references do not allow implicit conversions, so
    the compiler pointed out every place in the code where this happened.
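    A minimal sketch of the pattern, with illustrative stand-ins for
    ArcticDB's key types (not the real definitions):

    ```cpp
    #include <cassert>
    #include <string>
    #include <type_traits>
    #include <utility>
    #include <variant>

    // Illustrative stand-ins for the key types.
    struct AtomKey { std::string id; };
    struct RefKey  { std::string id; };
    using VariantKey = std::variant<AtomKey, RefKey>;

    // One forwarding-reference template replaces the three
    // value/lvalue-ref/rvalue-ref overloads. Template argument deduction
    // performs no implicit conversions, so passing an AtomKey directly
    // fails to compile instead of silently constructing a temporary
    // VariantKey.
    template <typename T>
    AtomKey to_atom(T&& key) {
        static_assert(
            std::is_same_v<std::remove_cv_t<std::remove_reference_t<T>>, VariantKey>,
            "to_atom expects a VariantKey");
        return std::get<AtomKey>(std::forward<T>(key));
    }

    int main() {
        VariantKey vk = AtomKey{"sym"};
        AtomKey a = to_atom(std::move(vk));  // moves the AtomKey out of the variant
        assert(a.id == "sym");
        // AtomKey ak{"x"}; to_atom(ak);     // compile error: no implicit
        //                                   // AtomKey -> VariantKey conversion
        return 0;
    }
    ```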
    #### Any other comments?

    #### Checklist

    <details>
      <summary>
       Checklist for code changes...
      </summary>

    - [ ] Have you updated the relevant docstrings, documentation and
    copyright notice?
    - [ ] Is this contribution tested against [all ArcticDB's
    features](../docs/mkdocs/docs/technical/contributing.md)?
    - [ ] Do all exceptions introduced raise appropriate [error
    messages](https://docs.arcticdb.io/error_messages/)?
     - [ ] Are API changes highlighted in the PR description?
    - [ ] Is the PR labelled as enhancement or bug so it appears in
    autogenerated release notes?
    </details>

    <!--
    Thanks for contributing a Pull Request to ArcticDB! Please ensure you
    have taken a look at:
    - ArcticDB's Code of Conduct:
    https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
    - ArcticDB's Contribution Licensing:
    https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
    -->

commit 300e121e1be47ecfbabba78f077851a9c3b0772c
Author: grusev <[email protected]>
Date:   Fri Apr 11 14:07:36 2025 +0300

    Update s3.py moto*.create_fixture - add retry attempts (#2311)

    #### Reference Issues/PRs
    <!--Example: Fixes #1234. See also #3456.-->

    #### What does this implement or fix?

    Addresses a couple of flaky tests opened due to NFS or S3…