Skip to content

add graph-based index merging interface #318

Open
@inabao

Description

  • algorithm
  • perf ref

Background
Currently, the cost of building graphs is very high, and to ensure that the insertion latency and fault recovery are acceptable, we typically want to keep the indexes that support real-time writes not too large. This leads to the need to frequently generate new small indexes, and we need to periodically rebuild indexes in the background, which significantly increases the indexing overhead.

Describe the solution you'd like
We hope to add an index merging feature to provide fast merging capabilities for indexes. At the same time, the cost of merging indexes (reusing the original graph structure) is much less than rebuilding the index.

Interface
Here is a simple interface that takes a new index, parameters, and a list of indexes to be merged:

    // Engine
    /**
     * @brief Merges multiple graph indexes into a single index.
     *
     * This function takes a specified `name` and `parameters`, along with a vector of
     * `sub_indexes`, and attempts to merge them into a single graph index. The result is
     * either a shared pointer to the newly merged `Index` or an `Error` object that describes
     * any failures encountered during the merging process.
     *
     * @param name The name assigned to the merged graph index, which will represent the
     *              combined structure of the provided sub-indexes.
     * @param parameters A JSON-like string containing various parameters that dictate 
     *                   the merging behavior and properties of the resulting index.
     * @param sub_indexes A vector of shared pointers to the `Index` objects that are to be merged.
     * @return tl::expected<std::shared_ptr<Index>, Error> An expected value that contains either
     * a shared pointer to the successfully merged `Index` or an `Error` detailing
     * why the merge operation failed.
     */
    tl::expected<std::shared_ptr<Index>, Error>
    MergeGraphIndex(const std::string& name,
                    const std::string& parameters,
                    const std::vector<std::shared_ptr<Index>>& sub_indexes);

Alternatively, we can implement an interface on the Index.

    // Index
    /**
     * @brief Merges multiple graph indexes into a single index.
     *
     * This function takes a vector of `sub_indexes`, and attempts to merge them into a single 
     * graph index. The result is either a shared pointer to the newly merged `Index` or an `Error` 
     * object that describes any failures encountered during the merging process.
     * 
     * @param sub_indexes A vector of shared pointers to the `Index` objects that are to be merged.
     * @return tl::expected<std::shared_ptr<Index>, Error> An expected value that contains either
     * a shared pointer to the successfully merged `Index` or an `Error` detailing
     * why the merge operation failed.
     */
    tl::expected<std::shared_ptr<Index>, Error>
    Merge(const std::vector<std::shared_ptr<Index>>& sub_indexes);

Implementation
The specific implementation will be updated later.

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions