@@ -167,6 +167,42 @@ _Click each to expand._
</details>

+ <details>
+   <summary><b>Multi-vectors</b></summary>
+
+ ```python
+ <pyspark.sql.DataFrame>
+   .write
+   .format("io.qdrant.spark.Qdrant")
+   .option("qdrant_url", "<QDRANT_GRPC_URL>")
+   .option("collection_name", "<QDRANT_COLLECTION_NAME>")
+   .option("multi_vector_fields", "<COLUMN_NAME>")
+   .option("multi_vector_names", "<MULTI_VECTOR_NAME>")
+   .option("schema", <pyspark.sql.DataFrame>.schema.json())
+   .mode("append")
+   .save()
+ ```
+
+ </details>
+
+ <details>
+   <summary><b>Multiple Multi-vectors</b></summary>
+
+ ```python
+ <pyspark.sql.DataFrame>
+   .write
+   .format("io.qdrant.spark.Qdrant")
+   .option("qdrant_url", "<QDRANT_GRPC_URL>")
+   .option("collection_name", "<QDRANT_COLLECTION_NAME>")
+   .option("multi_vector_fields", "<COLUMN_NAME>,<ANOTHER_COLUMN_NAME>")
+   .option("multi_vector_names", "<MULTI_VECTOR_NAME>,<ANOTHER_MULTI_VECTOR_NAME>")
+   .option("schema", <pyspark.sql.DataFrame>.schema.json())
+   .mode("append")
+   .save()
+ ```
+
+ </details>
+
<details>
  <summary><b>No vectors - Entire dataframe is stored as payload</b></summary>
@@ -202,23 +238,25 @@ The appropriate Spark data types are mapped to the Qdrant payload based on the p
## Options and Spark types

- | Option                       | Description                                                                          | Column DataType               | Required |
- | :--------------------------- | :----------------------------------------------------------------------------------- | :---------------------------- | :------- |
- | `qdrant_url`                 | GRPC URL of the Qdrant instance. Eg: <http://localhost:6334>                         | -                             | ✅       |
- | `collection_name`            | Name of the collection to write data into                                            | -                             | ✅       |
- | `schema`                     | JSON string of the dataframe schema                                                  | -                             | ✅       |
- | `embedding_field`            | Name of the column holding the embeddings (Deprecated - Use `vector_fields` instead) | `ArrayType(FloatType)`        | ❌       |
- | `id_field`                   | Name of the column holding the point IDs. Default: Random UUID                       | `StringType` or `IntegerType` | ❌       |
- | `batch_size`                 | Max size of the upload batch. Default: 64                                            | -                             | ❌       |
- | `retries`                    | Number of upload retries. Default: 3                                                 | -                             | ❌       |
- | `api_key`                    | Qdrant API key for authentication                                                    | -                             | ❌       |
- | `vector_name`                | Name of the vector in the collection.                                                | -                             | ❌       |
- | `vector_fields`              | Comma-separated names of columns holding the vectors.                                | `ArrayType(FloatType)`        | ❌       |
- | `vector_names`               | Comma-separated names of vectors in the collection.                                  | -                             | ❌       |
- | `sparse_vector_index_fields` | Comma-separated names of columns holding the sparse vector indices.                  | `ArrayType(IntegerType)`      | ❌       |
- | `sparse_vector_value_fields` | Comma-separated names of columns holding the sparse vector values.                   | `ArrayType(FloatType)`        | ❌       |
- | `sparse_vector_names`        | Comma-separated names of the sparse vectors in the collection.                       | -                             | ❌       |
- | `shard_key_selector`         | Comma-separated names of custom shard keys to use during upsert.                     | -                             | ❌       |
+ | Option                       | Description                                                                          | Column DataType                   | Required |
+ | :--------------------------- | :----------------------------------------------------------------------------------- | :-------------------------------- | :------- |
+ | `qdrant_url`                 | gRPC URL of the Qdrant instance. Eg: <http://localhost:6334>                         | -                                 | ✅       |
+ | `collection_name`            | Name of the collection to write data into                                            | -                                 | ✅       |
+ | `schema`                     | JSON string of the dataframe schema                                                  | -                                 | ✅       |
+ | `embedding_field`            | Name of the column holding the embeddings (Deprecated - Use `vector_fields` instead) | `ArrayType(FloatType)`            | ❌       |
+ | `id_field`                   | Name of the column holding the point IDs. Default: Random UUID                       | `StringType` or `IntegerType`     | ❌       |
+ | `batch_size`                 | Max size of the upload batch. Default: 64                                            | -                                 | ❌       |
+ | `retries`                    | Number of upload retries. Default: 3                                                 | -                                 | ❌       |
+ | `api_key`                    | Qdrant API key for authentication                                                    | -                                 | ❌       |
+ | `vector_name`                | Name of the vector in the collection.                                                | -                                 | ❌       |
+ | `vector_fields`              | Comma-separated names of columns holding the vectors.                                | `ArrayType(FloatType)`            | ❌       |
+ | `vector_names`               | Comma-separated names of vectors in the collection.                                  | -                                 | ❌       |
+ | `sparse_vector_index_fields` | Comma-separated names of columns holding the sparse vector indices.                  | `ArrayType(IntegerType)`          | ❌       |
+ | `sparse_vector_value_fields` | Comma-separated names of columns holding the sparse vector values.                   | `ArrayType(FloatType)`            | ❌       |
+ | `sparse_vector_names`        | Comma-separated names of the sparse vectors in the collection.                       | -                                 | ❌       |
+ | `multi_vector_fields`        | Comma-separated names of columns holding the multi-vector values.                    | `ArrayType(ArrayType(FloatType))` | ❌       |
+ | `multi_vector_names`         | Comma-separated names of the multi-vectors in the collection.                        | -                                 | ❌       |
+ | `shard_key_selector`         | Comma-separated names of custom shard keys to use during upsert.                     | -                                 | ❌       |
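
The two new options take parallel comma-separated lists. A minimal sketch of assembling them, assuming the entries of `multi_vector_fields` and `multi_vector_names` pair up by position (as the "Multiple Multi-vectors" example in this change suggests). The helper `multi_vector_options` and the column and vector names are hypothetical and not part of the connector:

```python
def multi_vector_options(columns_to_names):
    """Build the two option values from a {column name: multi-vector name} mapping.

    Hypothetical helper, not part of the Qdrant Spark connector. It assumes the
    two comma-separated lists are matched by position.
    """
    if not columns_to_names:
        raise ValueError("at least one multi-vector column is required")
    for column, name in columns_to_names.items():
        # A comma inside either value would corrupt the comma-separated list.
        if "," in column or "," in name:
            raise ValueError(f"commas are not allowed: {column!r} -> {name!r}")
    # dicts preserve insertion order, so keys and values stay aligned.
    return {
        "multi_vector_fields": ",".join(columns_to_names.keys()),
        "multi_vector_names": ",".join(columns_to_names.values()),
    }


# Illustrative column and vector names only.
options = multi_vector_options({"title_tokens": "title", "body_tokens": "body"})
print(options)
```

The resulting dictionary could then be passed to the writer with PySpark's `DataFrameWriter.options(**options)` alongside the required `qdrant_url`, `collection_name`, and `schema` options.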
## LICENSE