Commit c79c2ee

Refactor CollectionTest, ConfigureIndexTest, and IndexManager to improve integration test speed and reliability (#81)
## Problem

We've had continued issues with flaky integration tests, and with tests taking a long time to run (25 - 30 minutes for the entire `pr` workflow in some cases). A lot of this time is spent waiting for indexes and collections to be `Ready` instead of `Initializing`. Specifically, `CollectionTest` and `CollectionErrorTest` can take a really long time because they set up and wait on multiple indexes and collections.

While working on the above issue, I also noticed test failures and repeated issues due to `findIndexWithDimensionAndType`, and tests running with indexes created by and belonging to other test runs. The `isIndexReady` function also tends to be inefficient because its polling isn't consistent, and we can sometimes end up waiting minutes longer than we need to.

With the new `Pinecone` and `Index`/`AsyncIndex` class structure in place, we can refactor our integration tests around the new pattern.

## Solution

Ultimately, I would like to move our tests away from the `createIndexIfNotExistsDataPlane` / `createIndexIfNotExistsControlPlane` functions and their reliance on the `findIndexWithDimensionAndType` function, which should be deprecated. We should come up with a pattern for sharing resources across all tests in a run, and setting those up / tearing them down once. This can be worked on in subsequent PRs.

- Move `CollectionErrorTests` tests into `CollectionTest`. Share index and collection setup via `@BeforeAll` across the collection tests. Wait less time for the indexes created from the collection to be ready, as this specifically can take a number of minutes. Relax some of the assertions on the `status` of created collections and indexes, as I don't think we need to be that thorough here.
- Update `createIndexIfNotExistsDataPlane` to return a tuple (`AbstractMap.SimpleEntry<String, Pinecone>`) of the `indexName` and the `Pinecone` client to make things a bit easier to work with.
- Update all of the `dataPlane/` tests to use `pineconeClient.createIndexConnection()` and `pineconeClient.createAsyncIndexConnection()` for managing data plane operations.
- Deprecate `isIndexReady` in favor of `waitUntilIndexIsReady`. The polling is more consistent with this function, and it most likely saves us time overall.
- Refactor `ConfigureIndexTest` to create its own index and clean up after itself.
- Fix issue with `findIndexWithDimensionAndType` calling `isIndexReady()` on each index while iterating through the list.
- Fix issue in `IndexInterface.validateUpsertRequest` where we were trying to call `sparseValuesWithUnsignedIndices.getIndicesWithUnsigned32IntList()` and `sparseValuesWithUnsignedIndices.getValuesList()` on a possible `null`, causing a `NullPointerException`.
- Talked with @ssmith-pc, and I think it's standard practice in tests to not `try`/`catch` yourself unless you need to assert on the result. We should let errors propagate to the test runner and let it handle them so we're not clobbering logs and stack traces. I've cleaned up `try`/`catch` statements which don't seem to be needed.
- Run `gradle integrationTest --info` in `pr.yml` to get more detailed log output in the console for better troubleshooting of ongoing flapping.
- Add `assertWithRetry` wrappers for specific actions which have been troublesome. Add `Thread.sleep()` in a few places to avoid hammering an index too quickly.
- Add a new `describeIndexStats()` overload to `IndexInterface` and `Index` / `AsyncIndex` to allow calling without needing to explicitly pass `null`.

I spent a lot of time running these locally and in CI to see how they perform. Overall, these changes seem to improve reliability, although we still see a few failures like the gRPC `no healthy upstream` error on data plane operations with fresh indexes.

The total amount of time it takes to run both sets of integration tests has been cut significantly in most cases: <img width="1130" alt="Screenshot 2024-03-13 at 6 09 36 PM" src="https://github.com/pinecone-io/pinecone-java-client/assets/119623786/90f46e4a-4dfd-4638-9c17-4f6bfa9b9432">

Next steps would be to create a JUnit extension and possibly an `IndexManagerSingleton` to manage index resources across all tests directly. This would help make things more predictable and reliable when adding future tests. This would also allow us to handle concurrent integration test runs under the same API key, which is currently very difficult due to `findIndexWithDimensionAndType`.

## Type of Change

- [X] Bug fix (non-breaking change which fixes an issue)
- [X] Infrastructure change (CI configs, etc)

## Test Plan

Run integration tests and compare over run length and
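
For readers unfamiliar with the pattern, the fixed-interval polling that the message attributes to `waitUntilIndexIsReady` can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this commit: `PollingSketch`, `pollUntil`, and its parameters are hypothetical names, and a `BooleanSupplier` stands in for the real `describeIndex(...).getStatus().getReady()` check.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of fixed-interval polling with a total time budget.
final class PollingSketch {

    /**
     * Checks isReady at a fixed interval until it returns true or the
     * time budget is used up. Returns true when ready, false on timeout.
     */
    static boolean pollUntil(BooleanSupplier isReady, long intervalMs, long totalMsToWait)
            throws InterruptedException {
        long waitedMs = 0;
        while (!isReady.getAsBoolean()) {
            if (waitedMs >= totalMsToWait) {
                return false; // budget spent; the caller decides whether to fail
            }
            Thread.sleep(intervalMs);
            waitedMs += intervalMs;
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in readiness check: reports ready on the third call.
        int[] checks = {0};
        boolean ready = pollUntil(() -> ++checks[0] >= 3, 10, 1000);
        System.out.println("ready=" + ready + ", checks=" + checks[0]);
    }
}
```

Returning a boolean rather than breaking out silently is a design choice of this sketch; the helper in the diff logs a warning and returns the last `IndexModel` instead.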
1 parent 7536ac0 commit c79c2ee

16 files changed, +558 -648 lines changed

.github/workflows/pr.yml

+1-1
@@ -80,7 +80,7 @@ jobs:
         run: gradle clean build
 
       - name: Run integration tests
-        run: gradle integrationTest
+        run: gradle integrationTest --info
         env:
           PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
           PINECONE_ENVIRONMENT: ${{ secrets.PINECONE_ENVIRONMENT }}

src/integration/java/io/pinecone/helpers/BuildUpsertRequest.java

+5-10
@@ -6,6 +6,7 @@
 import io.pinecone.proto.SparseValues;
 import io.pinecone.proto.UpsertRequest;
 import io.pinecone.proto.Vector;
+import io.pinecone.unsigned_indices_model.VectorWithUnsignedIndices;
 
 import java.util.*;
 
@@ -135,21 +136,15 @@ public static UpsertRequest buildRequiredUpsertRequest(List<String> upsertIds, S
                 .build();
     }
 
-    public static UpsertRequest buildRequiredUpsertRequestByDimension(List<String> upsertIds, int dimension, String namespace) {
+    public static List<VectorWithUnsignedIndices> buildRequiredUpsertRequestByDimension(List<String> upsertIds, int dimension) {
         if (upsertIds.isEmpty()) upsertIds = Arrays.asList("v1", "v2", "v3");
 
-        List<Vector> upsertVectors = new ArrayList<>();
+        List<VectorWithUnsignedIndices> upsertVectors = new ArrayList<>();
         for (String upsertId : upsertIds) {
-            upsertVectors.add(Vector.newBuilder()
-                    .addAllValues(generateVectorValuesByDimension(dimension))
-                    .setId(upsertId)
-                    .build());
+            upsertVectors.add(new VectorWithUnsignedIndices(upsertId, generateVectorValuesByDimension(dimension)));
         }
 
-        return UpsertRequest.newBuilder()
-                .addAllVectors(upsertVectors)
-                .setNamespace(namespace)
-                .build();
+        return upsertVectors;
     }
 
     public static UpsertRequest buildOptionalUpsertRequest() {

src/integration/java/io/pinecone/helpers/IndexManager.java

+40-53
@@ -2,70 +2,50 @@
 
 import io.pinecone.clients.Index;
 import io.pinecone.clients.Pinecone;
-import io.pinecone.configs.PineconeConfig;
-import io.pinecone.configs.PineconeConnection;
 import io.pinecone.exceptions.PineconeException;
 import org.openapitools.client.model.*;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
+import java.util.AbstractMap;
 import java.util.List;
 
-import static io.pinecone.helpers.AssertRetry.assertWithRetry;
 import static org.junit.jupiter.api.Assertions.assertEquals;
 import static org.junit.jupiter.api.Assertions.fail;
 
 public class IndexManager {
     private static final Logger logger = LoggerFactory.getLogger(IndexManager.class);
 
-    public static PineconeConnection createIndexIfNotExistsDataPlane(int dimension, String indexType) throws IOException, InterruptedException {
+    public static AbstractMap.SimpleEntry<String, Pinecone> createIndexIfNotExistsDataPlane(int dimension, String indexType) throws IOException, InterruptedException {
         String apiKey = System.getenv("PINECONE_API_KEY");
         Pinecone pinecone = new Pinecone(apiKey);
 
         String indexName = findIndexWithDimensionAndType(pinecone, dimension, indexType);
-        if (indexName.isEmpty()) indexName = createNewIndex(pinecone, dimension, indexType);
+        if (indexName.isEmpty()) indexName = createNewIndex(pinecone, dimension, indexType, true);
 
-        // Do not proceed until the newly created index is ready
-        isIndexReady(indexName, pinecone);
-
-        // Adding to test PineconeConnection(pineconeConfig, host) constructor
-        String host = pinecone.describeIndex(indexName).getHost();
-        PineconeConfig config = new PineconeConfig(apiKey);
-        config.setHost(host);
-        return new PineconeConnection(config);
-    }
-
-    public static String createIndexIfNotExistsControlPlane(Pinecone pinecone, int dimension, String indexType) throws IOException, InterruptedException {
-        String indexName = findIndexWithDimensionAndType(pinecone, dimension, indexType);
-
-        return (indexName.isEmpty()) ? createNewIndex(pinecone, dimension, indexType) : indexName;
+        return new AbstractMap.SimpleEntry<>(indexName, pinecone);
     }
 
-    public static String findIndexWithDimensionAndType(Pinecone pinecone, int dimension, String indexType)
-            throws InterruptedException {
+    public static String findIndexWithDimensionAndType(Pinecone pinecone, int dimension, String indexType) {
         String indexName = "";
-        int i = 0;
         List<IndexModel> indexModels = pinecone.listIndexes().getIndexes();
         if(indexModels == null) {
             return indexName;
         }
-        while (i < indexModels.size()) {
-            IndexModel indexModel = isIndexReady(indexModels.get(i).getName(), pinecone);
+
+        for (IndexModel indexModel : indexModels) {
             if (indexModel.getDimension() == dimension
-                    && ((indexType.equalsIgnoreCase(IndexModelSpec.SERIALIZED_NAME_POD)
-                    && indexModel.getSpec().getPod() != null
-                    && indexModel.getSpec().getPod().getReplicas() == 1
-                    && indexModel.getSpec().getPod().getPodType().equalsIgnoreCase("p1.x1"))
-                    || (indexType.equalsIgnoreCase(IndexModelSpec.SERIALIZED_NAME_SERVERLESS)))) {
+                    && (indexType.equalsIgnoreCase(IndexModelSpec.SERIALIZED_NAME_POD) && indexModel.getSpec().getPod() != null)
+                    || (indexType.equalsIgnoreCase(IndexModelSpec.SERIALIZED_NAME_SERVERLESS) && indexModel.getSpec().getServerless() != null)
+            ) {
                 return indexModel.getName();
             }
-            i++;
         }
         return indexName;
     }
 
-    public static String createNewIndex(Pinecone pinecone, int dimension, String indexType) {
+    public static String createNewIndex(Pinecone pinecone, int dimension, String indexType, boolean waitUntilIndexIsReady) throws InterruptedException {
         String indexName = RandomStringBuilder.build("index-name", 8);
         String environment = System.getenv("PINECONE_ENVIRONMENT");
         CreateIndexRequestSpec createIndexRequestSpec;
@@ -74,7 +54,9 @@ public static String createNewIndex(Pinecone pinecone, int dimension, String ind
             CreateIndexRequestSpecPod podSpec = new CreateIndexRequestSpecPod().environment(environment).podType("p1.x1");
             createIndexRequestSpec = new CreateIndexRequestSpec().pod(podSpec);
         } else {
-            ServerlessSpec serverlessSpec = new ServerlessSpec().cloud(ServerlessSpec.CloudEnum.AWS).region(environment);
+            // Serverless currently has limited availability in specific regions, hardcode us-west-2 for now
+            ServerlessSpec serverlessSpec =
+                    new ServerlessSpec().cloud(ServerlessSpec.CloudEnum.AWS).region("us-west-2");
             createIndexRequestSpec = new CreateIndexRequestSpec().serverless(serverlessSpec);
         }
 
@@ -85,46 +67,61 @@ public static String createNewIndex(Pinecone pinecone, int dimension, String ind
                 .spec(createIndexRequestSpec);
         pinecone.createIndex(createIndexRequest);
 
+        if (waitUntilIndexIsReady) {
+            waitUntilIndexIsReady(pinecone, indexName);
+        }
+
         return indexName;
     }
 
     public static IndexModel waitUntilIndexIsReady(Pinecone pinecone, String indexName, Integer totalMsToWait) throws InterruptedException {
         IndexModel index = pinecone.describeIndex(indexName);
         int waitedTimeMs = 0;
-        int intervalMs = 1500;
+        int intervalMs = 2000;
 
         while (!index.getStatus().getReady()) {
             index = pinecone.describeIndex(indexName);
             if (waitedTimeMs >= totalMsToWait) {
-                logger.info("Index " + indexName + " not ready after " + waitedTimeMs + "ms");
+                logger.info("WARNING: Index " + indexName + " not ready after " + waitedTimeMs + "ms");
                 break;
             }
             if (index.getStatus().getReady()) {
                 logger.info("Index " + indexName + " is ready after " + waitedTimeMs + "ms");
+                // Wait one final time before we start connecting and operating on the index
+                Thread.sleep(10000);
                 break;
             }
             Thread.sleep(intervalMs);
+            logger.info("Waited " + waitedTimeMs + "ms for " + indexName + " to get ready");
             waitedTimeMs += intervalMs;
         }
         return index;
     }
 
     public static IndexModel waitUntilIndexIsReady(Pinecone pinecone, String indexName) throws InterruptedException {
-        return waitUntilIndexIsReady(pinecone, indexName, 120000);
+        return waitUntilIndexIsReady(pinecone, indexName, 200000);
     }
 
-    public static PineconeConnection createNewIndexAndConnect(Pinecone pinecone, String indexName, int dimension, IndexMetric metric, CreateIndexRequestSpec spec) throws InterruptedException, PineconeException {
-        String apiKey = System.getenv("PINECONE_API_KEY");
+    public static Pinecone createNewIndex(Pinecone pinecone, String indexName, int dimension, IndexMetric metric, CreateIndexRequestSpec spec, boolean waitUntilIndexIsReady) throws InterruptedException, PineconeException {
+        CreateIndexRequest createIndexRequest = new CreateIndexRequest().name(indexName).dimension(dimension).metric(metric).spec(spec);
+        pinecone.createIndex(createIndexRequest);
+
+        if (waitUntilIndexIsReady) {
+            waitUntilIndexIsReady(pinecone, indexName);
+        }
+        return pinecone;
+    }
+
+    public static Index createNewIndexAndConnect(Pinecone pinecone, String indexName, int dimension, IndexMetric metric, CreateIndexRequestSpec spec) throws InterruptedException, PineconeException {
         CreateIndexRequest createIndexRequest = new CreateIndexRequest().name(indexName).dimension(dimension).metric(metric).spec(spec);
         pinecone.createIndex(createIndexRequest);
 
         // Wait until index is ready
-        waitUntilIndexIsReady(pinecone, indexName, 200000);
+        waitUntilIndexIsReady(pinecone, indexName);
         // wait a bit more before we connect...
-        Thread.sleep(15000);
+        Thread.sleep(5000);
 
-        PineconeConfig config = new PineconeConfig(apiKey);
-        return new PineconeConnection(config, indexName);
+        return pinecone.createIndexConnection(indexName);
     }
 
     public static CollectionModel createCollection(Pinecone pinecone, String collectionName, String indexName, boolean waitUntilReady) throws InterruptedException {
@@ -138,7 +135,8 @@ public static CollectionModel createCollection(Pinecone pinecone, String collect
             int timeWaited = 0;
             CollectionModel.StatusEnum collectionReady = collection.getStatus();
             while (collectionReady != CollectionModel.StatusEnum.READY && timeWaited < 120000) {
-                logger.info("Waiting for collection " + collectionName + " to be ready. Waited " + timeWaited + " milliseconds...");
+                logger.info("Waiting for collection " + collectionName + " to be ready. Waited " + timeWaited + " " +
+                        "milliseconds...");
                 Thread.sleep(5000);
                 timeWaited += 5000;
                 collection = pinecone.describeCollection(collectionName);
@@ -152,15 +150,4 @@ public static CollectionModel createCollection(Pinecone pinecone, String collect
 
         return collection;
     }
-
-    public static IndexModel isIndexReady(String indexName, Pinecone pinecone)
-            throws InterruptedException {
-        final IndexModel[] indexModels = new IndexModel[1];
-        assertWithRetry(() -> {
-            indexModels[0] = pinecone.describeIndex(indexName);
-            assert (indexModels[0].getStatus().getReady());
-        }, 4);
-
-        return indexModels[0];
-    }
 }
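
One detail in the new `findIndexWithDimensionAndType` condition may be worth a second look. In Java, `&&` binds tighter than `||`, so as written the dimension check appears to apply only to the pod branch, and a serverless index could match regardless of its dimension. The sketch below illustrates the difference with hypothetical stand-ins (`FakeIndex`, `matchesAsWritten`, `matchesGrouped`) rather than the real `IndexModel` API. The literal strings `"pod"` and `"serverless"` are assumed values for the `SERIALIZED_NAME_*` constants.

```java
// Hypothetical sketch; not part of this commit.
final class PrecedenceSketch {

    /** Stand-in for the fields the real check reads from IndexModel. */
    static final class FakeIndex {
        final int dimension;
        final boolean hasPodSpec;
        final boolean hasServerlessSpec;

        FakeIndex(int dimension, boolean hasPodSpec, boolean hasServerlessSpec) {
            this.dimension = dimension;
            this.hasPodSpec = hasPodSpec;
            this.hasServerlessSpec = hasServerlessSpec;
        }
    }

    // Same shape as the condition in the diff: a && (b) || (c), parsed as (a && b) || c.
    static boolean matchesAsWritten(FakeIndex index, int dimension, String indexType) {
        return index.dimension == dimension
                && (indexType.equalsIgnoreCase("pod") && index.hasPodSpec)
                || (indexType.equalsIgnoreCase("serverless") && index.hasServerlessSpec);
    }

    // Extra parentheses make the dimension check apply to both branches: a && (b || c).
    static boolean matchesGrouped(FakeIndex index, int dimension, String indexType) {
        return index.dimension == dimension
                && ((indexType.equalsIgnoreCase("pod") && index.hasPodSpec)
                || (indexType.equalsIgnoreCase("serverless") && index.hasServerlessSpec));
    }

    public static void main(String[] args) {
        FakeIndex serverless1536 = new FakeIndex(1536, false, true);
        // Ask for a 3-dimensional serverless index; the stored index has 1536 dimensions.
        System.out.println(matchesAsWritten(serverless1536, 3, "serverless")); // prints "true"
        System.out.println(matchesGrouped(serverless1536, 3, "serverless"));   // prints "false"
    }
}
```

Whether the grouped form is what the author intended is an assumption; it matches the structure of the condition that this hunk removes.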
