Very slow deletion performance of removeMany with HNSW vector index #710

Open
@Ohrest88

First of all, thank you for this great project!
I searched the existing issues but couldn't find one related to this.

Description

In my Flutter application, I have an ObjectBox entity defined as:

@Entity()
class DocumentSection {
  @Id()
  int id = 0;

  final document = ToOne<Document>();
  String content;
  
  @Property(type: PropertyType.int)
  int pageNumber;
  
  @HnswIndex(
    dimensions: 500,
    distanceType: VectorDistanceType.cosine
  )
  @Property(type: PropertyType.floatVector)
  List<double>? embedding;

  @Property(type: PropertyType.int)
  int originalId = 0;

  DocumentSection({
    this.content = '',
    this.embedding,
    this.pageNumber = 0,
  });
}

It's for a semantic search use case. The ObjectBox DB has 109,000 DocumentSection entries (and therefore 109,000 vectors).

While vector search is remarkably fast with that number of vectors (for example, nearestNeighborsF32 returns the 20 nearest embeddings in less than 1 second), deleting entries is very slow:

It takes about 264 seconds (4.4 minutes) to delete 22,085 entries (out of 109,000 in total).
Could this be caused by maintenance of the HNSW vector index during the removeMany operation?

This is the code I'm using to delete entries:

  void _deleteDocument(Document document) {
    try {
      debugPrint('Starting deletion of document: ${document.filename} (ID: ${document.id})');
      final startTime = DateTime.now();

      widget.store.runInTransaction(TxMode.write, () {
        debugPrint('Starting transaction...');
        
        // Query sections
        debugPrint('Querying sections...');
        final queryStart = DateTime.now();
        final query = widget.sectionBox
            .query(DocumentSection_.document.equals(document.id))
            .build();
            
        final sectionCount = query.count();
        final queryDuration = DateTime.now().difference(queryStart);
        debugPrint('Found $sectionCount sections to delete (query took ${queryDuration.inMilliseconds}ms)');

        // Get IDs
        debugPrint('Getting section IDs...');
        final getIdsStart = DateTime.now();
        final ids = query.findIds();
        final getIdsDuration = DateTime.now().difference(getIdsStart);
        debugPrint('Got ${ids.length} section IDs (took ${getIdsDuration.inMilliseconds}ms)');
        
        query.close();

        // Delete sections using removeMany
        debugPrint('Starting batch section deletion...');
        final deleteStart = DateTime.now();
        final removedCount = widget.sectionBox.removeMany(ids);
        final deleteDuration = DateTime.now().difference(deleteStart);
        debugPrint('Sections deleted: $removedCount (took ${deleteDuration.inMilliseconds}ms)');
        
        // Delete document
        debugPrint('Deleting document...');
        final docDeleteStart = DateTime.now();
        widget.documentBox.remove(document.id);
        final docDeleteDuration = DateTime.now().difference(docDeleteStart);
        debugPrint('Document deleted (took ${docDeleteDuration.inMilliseconds}ms)');
      });
      
      final totalDuration = DateTime.now().difference(startTime);
      debugPrint('Total deletion process took ${totalDuration.inMilliseconds}ms');

      ScaffoldMessenger.of(context).showSnackBar(
        SnackBar(content: Text('Deleted ${document.filename}')),
      );
      
      // Refresh the data
      _loadData();
    } catch (e) {
      debugPrint('Error deleting document: $e');
      ScaffoldMessenger.of(context).showSnackBar(
        SnackBar(content: Text('Error deleting document: $e')),
      );
    }
  }

The above code produces these logs:

flutter: Starting deletion of document: Test.pdf (ID: 23)
flutter: Starting transaction...
flutter: Querying sections...
flutter: Found 22085 sections to delete (query took 16ms)
flutter: Getting section IDs...
flutter: Got 22085 section IDs (took 2ms)
flutter: Starting batch section deletion...
flutter: Sections deleted: 22085 (took 264141ms)
flutter: Deleting document...
flutter: Document deleted (took 0ms)
flutter: Total deletion process took 264192ms
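
For scale, the figures in the log above work out to roughly 12 ms per removed section. A small sketch of that arithmetic in Dart (msPerRemoval is a name made up for illustration, not part of ObjectBox):

```dart
/// Hypothetical helper: average cost per removed object, in milliseconds.
double msPerRemoval(int totalMs, int removedCount) => totalMs / removedCount;

void main() {
  // Both figures are taken from the removeMany log line above.
  final perEntry = msPerRemoval(264141, 22085);
  print('${perEntry.toStringAsFixed(2)} ms per removed section');

  // The same number expressed as removals per second.
  print('${(1000 / perEntry).toStringAsFixed(0)} removals per second');
}
```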

Specifically, this line appears to be the bottleneck:

final removedCount = widget.sectionBox.removeMany(ids);

Could the slowdown be related to HNSW index maintenance during deletion, given that all other operations (querying, getting IDs) are very fast?
Is there a known solution for this issue?
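
To narrow this down, one thing that could be tried is removing the IDs in smaller batches and timing each batch, to see whether the per-batch cost stays flat or changes as the index shrinks. A minimal sketch follows. This is only a diagnostic, not a known fix, and timeBatchedRemoval, batchSize and removeBatch are names made up for illustration. In the app, removeBatch would presumably wrap widget.sectionBox.removeMany; here a stand-in is used so the snippet runs on its own:

```dart
/// Splits [ids] into batches of at most [batchSize], calls [removeBatch]
/// on each batch and returns the elapsed milliseconds per batch.
List<int> timeBatchedRemoval(
  List<int> ids,
  int batchSize,
  int Function(List<int> batch) removeBatch,
) {
  if (batchSize <= 0) {
    throw ArgumentError.value(batchSize, 'batchSize', 'must be positive');
  }
  final timings = <int>[];
  final stopwatch = Stopwatch();
  for (var start = 0; start < ids.length; start += batchSize) {
    final end =
        start + batchSize < ids.length ? start + batchSize : ids.length;
    final batch = ids.sublist(start, end);
    stopwatch
      ..reset()
      ..start();
    removeBatch(batch);
    stopwatch.stop();
    timings.add(stopwatch.elapsedMilliseconds);
  }
  return timings;
}

void main() {
  // Stand-in for the real call. In the app this would be something like
  // (batch) => widget.sectionBox.removeMany(batch) inside a write transaction.
  var removed = 0;
  final timings = timeBatchedRemoval(
    List.generate(10, (i) => i + 1),
    4,
    (batch) {
      removed += batch.length;
      return batch.length;
    },
  );
  print('Removed $removed IDs in ${timings.length} batches: $timings');
}
```

If the per-batch times are roughly constant, the cost is proportional to the number of removed vectors. If they change noticeably from batch to batch, that would be a useful detail to report.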

Environment:

ObjectBox version: 4.1.0
Flutter: 3.29.0
Platform tested on: Linux (Ubuntu 24.04.1 LTS)

Metadata

Labels: enhancement