@Fujio-Turner

Summary

Add support for vector search indexes in Couchbase Lite React Native.

New Files

  • cblite/src/vector-index.ts - Vector index classes and enums

Features

  • VectorIndex class for QueryBuilder-style index creation
  • VectorIndexConfiguration for configuration-based index creation
  • DistanceMetric enum (cosine, euclidean, euclideanSquared, dot)
  • VectorEncoding class (none, scalarQuantizer, productQuantizer)
  • ScalarQuantizerType enum (SQ4, SQ6, SQ8)

Usage

import { VectorIndexConfiguration, DistanceMetric } from 'cbl-reactnative';

const config = new VectorIndexConfiguration('embedding', 512, 100);
config.metric = DistanceMetric.COSINE;
await collection.createIndex('embedding_idx', config);

This enables APPROX_VECTOR_DISTANCE() queries for similarity search.
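A query against such an index might look like the sketch below. It assumes the Couchbase Lite SQL++ form `APPROX_VECTOR_DISTANCE(vector-expr, target-vector)` used in an `ORDER BY` with a `LIMIT`. The helper name `buildNearestQuery`, the collection name `docs`, and the `$vector` parameter name are hypothetical and not part of this PR.

```typescript
// Hypothetical helper (not part of this PR): builds a SQL++ string that ranks
// documents by approximate distance to a query vector bound as $vector.
function buildNearestQuery(collection: string, field: string, limit: number): string {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error('limit must be a positive integer');
  }
  return (
    `SELECT META().id FROM ${collection} ` +
    `ORDER BY APPROX_VECTOR_DISTANCE(${field}, $vector) ` +
    `LIMIT ${limit}`
  );
}

// The field passed here should match the expression the index was created on.
const nearestSql = buildNearestQuery('docs', 'embedding', 10);
console.log(nearestSql);
```

The resulting string would then be handed to whatever query API the app already uses, with the query embedding supplied as the `$vector` parameter.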

Related: Couchbase-Ecosystem/cbl-reactnative#86

@gemini-code-assist

Summary of Changes

Hello @Fujio-Turner, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds vector search support to Couchbase Lite React Native. It introduces classes and enums for defining vector indexes, with control over the distance metric, the vector encoding, and related parameters. These indexes enable approximate nearest neighbor (ANN) queries for similarity search.

Highlights

  • Vector Search Support: Introduces full support for vector search indexes in Couchbase Lite React Native, enabling approximate nearest neighbor (ANN) queries.
  • New Indexing Classes: Adds VectorIndex for QueryBuilder-style index creation and VectorIndexConfiguration for configuration-based index definition.
  • Distance Metrics: Includes a DistanceMetric enum with options like Euclidean, Squared Euclidean, Cosine, and Dot product for similarity calculations.
  • Vector Encoding and Compression: Provides a VectorEncoding class and ScalarQuantizerType enum to support various vector compression techniques, including scalar and product quantization.
  • SQL++ Functionality: Enables the use of the APPROX_VECTOR_DISTANCE() SQL++ function for performing similarity searches.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for vector search indexes by adding VectorIndex and VectorIndexConfiguration. The implementation is well-structured, but there are a few critical issues to address. The scalar quantization logic is currently non-functional as it ignores the quantization level parameter. There is also significant code duplication between VectorIndex and VectorIndexConfiguration which has led to inconsistencies like missing validation. Finally, a magic number is used for the index type instead of updating the IndexType enum. Addressing these points will improve the correctness and maintainability of the new feature.

Comment on lines +82 to +84
  static scalarQuantizer(type: ScalarQuantizerType = ScalarQuantizerType.SQ8): VectorEncoding {
    return new VectorEncoding('SQ', undefined, undefined);
  }

critical

The type parameter is unused. This means the specific scalar quantization level (SQ4, SQ6, SQ8) is lost, and a generic 'SQ' type is used instead. This prevents users from properly configuring the index encoding. This method should use the type parameter to create the VectorEncoding. Note that this change will also require an update to the toJson() method to handle the new types.

  static scalarQuantizer(type: ScalarQuantizerType = ScalarQuantizerType.SQ8): VectorEncoding {
    if (type === ScalarQuantizerType.NONE) {
      return VectorEncoding.none();
    }
    return new VectorEncoding(type);
  }

Comment on lines +97 to +109
  toJson(): any {
    if (this._type === 'none') {
      return { type: 'none' };
    } else if (this._type === 'SQ') {
      return { type: 'SQ' };
    } else {
      return {
        type: 'PQ',
        bits: this._bits,
        subquantizers: this._subquantizers,
      };
    }
  }

critical

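Following the required change in scalarQuantizer, this toJson method will no longer serialize scalar encodings correctly: this._type will be SQ4, SQ6, or SQ8 rather than 'SQ', so it matches neither of the first two branches and falls through to the else branch, which emits a 'PQ' encoding with undefined bits and subquantizers. The method needs to be updated to handle the specific scalar quantization types.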

  toJson(): any {
    if (this._type === 'none') {
      return { type: 'none' };
    } else if (this._type.startsWith('SQ')) {
      return { type: this._type };
    } else {
      return {
        type: 'PQ',
        bits: this._bits,
        subquantizers: this._subquantizers,
      };
    }
  }
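Taken together, the two suggestions above could be sketched as one self-contained class. This is an illustration only: the enum mirrors the one quoted in this review, while the private constructor, the field names, and the `productQuantizer` signature are assumptions, since the PR's own class body is not quoted in full here.

```typescript
// Sketch only: enum values copied from the review; class internals are assumed.
enum ScalarQuantizerType {
  NONE = 'none',
  SQ4 = 'SQ4',
  SQ6 = 'SQ6',
  SQ8 = 'SQ8',
}

class VectorEncoding {
  private constructor(
    private readonly _type: string,
    private readonly _bits?: number,
    private readonly _subquantizers?: number
  ) {}

  static none(): VectorEncoding {
    return new VectorEncoding('none');
  }

  // Keeps the requested level instead of collapsing it to a generic 'SQ'.
  static scalarQuantizer(type: ScalarQuantizerType = ScalarQuantizerType.SQ8): VectorEncoding {
    if (type === ScalarQuantizerType.NONE) {
      return VectorEncoding.none();
    }
    return new VectorEncoding(type);
  }

  // Assumed signature; the PR's productQuantizer is not quoted in the thread.
  static productQuantizer(subquantizers: number, bits: number): VectorEncoding {
    return new VectorEncoding('PQ', bits, subquantizers);
  }

  toJson(): { type: string; bits?: number; subquantizers?: number } {
    if (this._type === 'none') {
      return { type: 'none' };
    }
    if (this._type.startsWith('SQ')) {
      return { type: this._type }; // 'SQ4', 'SQ6', or 'SQ8'
    }
    return { type: 'PQ', bits: this._bits, subquantizers: this._subquantizers };
  }
}

console.log(VectorEncoding.scalarQuantizer(ScalarQuantizerType.SQ4).toJson());
console.log(VectorEncoding.scalarQuantizer().toJson());
```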

Comment on lines +302 to +305
  type(): IndexType {
    // Vector = 3 in IndexType enum
    return 3 as IndexType;
  }

high

The type() method returns a magic number 3 cast to IndexType. The comment indicates this should correspond to a Vector index type, but the IndexType enum in abstract-index.ts has not been updated to include a Vector member. Using a magic number is brittle and defeats the purpose of the enum. The IndexType enum should be updated to include Vector = 3. While abstract-index.ts may not be part of this PR, this is an important issue to address for code correctness and maintainability.
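A minimal sketch of the enum-based alternative follows. Only `Vector = 3` comes from this review. The other enum members are placeholders for whatever abstract-index.ts already defines, and `AbstractIndex` is reduced to the one method needed here.

```typescript
// Assumption: members 0-2 are placeholders; only Vector = 3 is from the review.
enum IndexType {
  PlaceholderA = 0,
  PlaceholderB = 1,
  PlaceholderC = 2,
  Vector = 3,
}

// Reduced stand-in for the real AbstractIndex base class.
abstract class AbstractIndex {
  abstract type(): IndexType;
}

class VectorIndexSketch extends AbstractIndex {
  type(): IndexType {
    // Named member replaces `3 as IndexType`, so a renumbering of the enum
    // is caught by the compiler instead of silently changing behavior.
    return IndexType.Vector;
  }
}

console.log(IndexType[new VectorIndexSketch().type()]);
```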

Comment on lines +44 to +53
export enum ScalarQuantizerType {
  /** No quantization - full precision */
  NONE = 'none',
  /** 4-bit quantization */
  SQ4 = 'SQ4',
  /** 6-bit quantization */
  SQ6 = 'SQ6',
  /** 8-bit quantization (recommended) */
  SQ8 = 'SQ8',
}

medium

The ScalarQuantizerType enum includes a NONE member. This is confusing because there is also a VectorEncoding.none() static method, suggesting two ways to specify no encoding. To improve API clarity, consider removing NONE from this enum and guiding users to use VectorEncoding.none() exclusively for uncompressed vectors.

* await collection.createIndex('my_vector_idx', index);
* ```
*/
export class VectorIndex extends AbstractIndex {

medium

The VectorIndex class duplicates a significant amount of code from VectorIndexConfiguration, including properties, default values, and the toJson() method. This duplication can lead to maintenance issues. For instance, the constructor of VectorIndexConfiguration validates dimensions and centroids, but this validation is missing in the VectorIndex constructor. To improve maintainability and ensure consistency, consider refactoring to eliminate this duplication. One approach is to use composition by creating a shared helper class that holds the common configuration, validation, and serialization logic.
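The composition approach could look roughly like the sketch below. All class names here (`VectorIndexOptions`, `VectorIndexViaHelper`, `VectorIndexConfigurationViaHelper`) are hypothetical, the constructor argument order (expression, dimensions, centroids) is inferred from the usage example in the PR description, and the validation rules are illustrative because the real bounds are not quoted in this review.

```typescript
// Hypothetical helper (not in the PR): one home for validation and
// serialization shared by both index classes.
class VectorIndexOptions {
  constructor(
    readonly expression: string,
    readonly dimensions: number,
    readonly centroids: number
  ) {
    // Illustrative checks only; the actual limits are not shown in the thread.
    if (!Number.isInteger(dimensions) || dimensions <= 0) {
      throw new Error('dimensions must be a positive integer');
    }
    if (!Number.isInteger(centroids) || centroids <= 0) {
      throw new Error('centroids must be a positive integer');
    }
  }

  toJson(): Record<string, unknown> {
    return {
      expression: this.expression,
      dimensions: this.dimensions,
      centroids: this.centroids,
    };
  }
}

// Both public classes delegate, so neither can skip validation.
class VectorIndexViaHelper {
  private readonly options: VectorIndexOptions;
  constructor(expression: string, dimensions: number, centroids: number) {
    this.options = new VectorIndexOptions(expression, dimensions, centroids);
  }
  toJson(): Record<string, unknown> {
    return this.options.toJson();
  }
}

class VectorIndexConfigurationViaHelper {
  private readonly options: VectorIndexOptions;
  constructor(expression: string, dimensions: number, centroids: number) {
    this.options = new VectorIndexOptions(expression, dimensions, centroids);
  }
  toJson(): Record<string, unknown> {
    return this.options.toJson();
  }
}
```

With this shape, a rule added to the helper applies to both entry points at once, which addresses the missing-validation inconsistency described above.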

- Add vector-index.ts export to index.ts
- Add Vector enum to IndexType
- Add vectorIndex() method to IndexBuilder