You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Refactor CollectionTest, ConfigureIndexTest, and IndexManager to improve integration test speed and reliability (#81)
## Problem
We've had continued issues with flakiness of integration tests, and
tests taking a while to run (25 - 30 minutes for the entire `pr`
workflow in some cases). A lot of this is waiting for indexes and
collections to be `Ready` instead of `Initializing`. Specifically, the
`CollectionTest` and `CollectionErrorTest` can take a really long time
due to setting up and waiting on multiple indexes and collections.
While working on the above issue, I also noticed test failures and
repeated issues due to `findIndexWithDimensionAndType`, and tests
running with indexes created by and belonging to other test runs. The
`isIndexReady` function also tends to be inefficient as the polling
isn't consistent, and we can end up waiting sometimes minutes longer
than we may need to.
With the new `Pinecone` and `Index`/`AsyncIndex` class structure in
place, we can refactor our integration tests around the new pattern.
## Solution
Ultimately, I would like to move our tests away from the
`createIndexIfNotExistsDataPlane` / `createIndexIfNotExistsControlPlane`
functions and their reliance on the `findIndexWithDimensionAndType`
function, which should be deprecated. We should come up with a pattern
for sharing resources across all tests in a run, and setting up /
tearing those down once. This can be worked on in subsequent PRs.
- Move `CollectionErrorTests` tests into `CollectionTest`. Share index
and collection setup `@BeforeAll` across collections tests. Wait less
time for the indexes created from the collection to be ready as this
specifically can take a number of minutes. Relax some of the assertions
on the `status` of created collections and indexes as I don't think we
need to be that thorough here.
- Update `createIndexIfNotExistsDataPlane` to return a tuple
(`AbstractMap.SimpleEntry<String, Pinecone>`) of the `indexName` and the
`Pinecone` client to make things a bit easier to work with.
- Update all of the `dataPlane/` tests to use
`pineconeClient.createIndexConnection()` and
`pineconeClient.createAsyncIndexConnection()` for managing data plane
operations.
- Deprecate `isIndexReady` in lieu of `waitUntilIndexIsReady`. The
polling is more consistent with this function, and it most likely saves
us time overall.
- Refactor `ConfigureIndexTest` to create its own index and clean up
after.
- Fix issue with `findIndexByDimensionAndType` calling `isIndexReady()`
on each index while iterating through the list.
- Fix issue in `IndexInterface.validateUpsertRequest` where we were
trying to call
`sparseValuesWithUnsignedIndices.getIndicesWithUnsigned32IntList()` and
`sparseValuesWithUnsignedIndices.getValuesList()` on a possible `null`
causing a `NullPointerException`.
- Talked with @ssmith-pc, and I think it's standard practice in tests to
not `try`/`catch` yourself unless you need to assert on the result. We
should be letting errors throw to the test runner and let it handle them
so we're not clobbering logs and stack traces. I've cleaned up
`try`/`catch` statements which don't seem to be needed.
- Running `gradle integrationTest --info` in `pr.yml` to get more
detailed log output in the console for better troubleshooting of ongoing
flapping.
- Adding `assertWithRetry` wrappers for specific actions which have been
troublesome. Adding `Thread.sleep()` to a few places to avoid hammering
an index too quickly / etc.
- Added a new `describeIndexStats()` overload to `IndexInterface` and
`Index` / `AsyncIndex` to allow calling without needing to explicitly
pass `null`.
I spent a lot of time running these locally and in CI to see how they
perform. Overall, it seems like these changes improve overall
reliability, although we still do see a few failures like the gRPC `no
healthy upstream` on data plane operations with fresh indexes.
The total amount of time it takes to run both sets of integration tests
has been cut significantly in most cases:
<img width="1130" alt="Screenshot 2024-03-13 at 6 09 36 PM"
src="https://github.com/pinecone-io/pinecone-java-client/assets/119623786/90f46e4a-4dfd-4638-9c17-4f6bfa9b9432">
Next steps would be to create a Junit extension and possible
`IndexManagerSingleton` to manage index resources across all tests
directly. This would help make things more predictable and reliable when
adding future tests. This would also allow us to handle concurrent
integration test runs under the same API key, which is currently very
difficult due to `findIndexByDimensionAndType`.
## Type of Change
- [X] Bug fix (non-breaking change which fixes an issue)
- [X] Infrastructure change (CI configs, etc)
## Test Plan
Run integration tests and compare over run length and
0 commit comments