Bug: concurrent AsyncMemory writes corrupt Qdrant HNSW index ('index N is out of bounds') #4892

## Summary

When multiple coroutines call `AsyncMemory.add()`, `AsyncMemory.update()`, or `AsyncMemory.delete()` concurrently (e.g. an async agent processing events in parallel), the Qdrant HNSW index becomes corrupted, producing:

```
IndexError: index N is out of bounds for axis 0 with size M
```

Subsequent reads (`search()`, `get()`) either return wrong results or raise the same error until the process is restarted and the index is rebuilt.

## Root cause

`AsyncMemory`'s write methods dispatch the synchronous `vector_store.insert`, `vector_store.update`, and `vector_store.delete` calls via `asyncio.to_thread()`. When multiple callers hit these paths concurrently, the thread pool runs several write tasks in parallel.

Qdrant's `upsert` (and the underlying HNSW graph builder) is not re-entrant — it can return before the internal graph construction is complete. A second writer arriving while the first graph update is still in-flight reads a partially-built HNSW structure, which triggers an out-of-bounds index access.
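The overlap described above can be illustrated without Qdrant. The following is a minimal sketch, assuming a hypothetical `ToyStore` whose `insert` only records how many callers are inside it at the same time. It says nothing about Qdrant internals; it only shows that synchronous calls dispatched through `asyncio.to_thread()` and awaited with `asyncio.gather()` run on worker threads simultaneously.

```python
import asyncio
import threading
import time


class ToyStore:
    """Hypothetical stand-in for a store whose writes must not overlap (not Qdrant)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the counters only, not the "write"
        self.active = 0
        self.max_active = 0

    def insert(self, item):
        with self._guard:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.05)  # simulate a slow, multi-step index update
        with self._guard:
            self.active -= 1


async def run_unlocked(n=8):
    store = ToyStore()
    # Same dispatch pattern as the write paths described above: one thread per call.
    await asyncio.gather(*(asyncio.to_thread(store.insert, i) for i in range(n)))
    return store.max_active


max_overlap = asyncio.run(run_unlocked())
print("max writers inside insert() at once:", max_overlap)
```

A value greater than 1 means at least two writers were inside `insert` at the same moment, which is the condition the root-cause analysis relies on.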

## Reproduction

```python
import asyncio
from mem0 import AsyncMemory

m = AsyncMemory()  # default Qdrant embedded

async def writer(i):
    await m.add(f"fact {i}", user_id="test")

async def main():
    # 20 concurrent writers → reproducible corruption
    await asyncio.gather(*[writer(i) for i in range(20)])
    results = await m.search("fact", filters={"user_id": "test"})
    print(results)

asyncio.run(main())
```

## Fix

Add `self._write_lock = asyncio.Lock()` in `AsyncMemory.__init__` and acquire it around the three vector-store write sites:

- Phase 6 batch insert in `_add_to_vector_store`
- `vector_store.update` in `_update_memory`
- `vector_store.delete` in `_delete_memory`

Reads (`search`, `get`, `get_all`), embeddings, and LLM calls are unaffected — only the actual write to the vector store is serialised.
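The locking pattern can be sketched in isolation. This is a minimal illustration, not the code from the PR: `ToyStore` and `LockedWriter` are hypothetical names, and only the `_write_lock` attribute mirrors the proposal. The lock is held across the `asyncio.to_thread()` call, so at most one write is inside the store at a time while other coroutines remain free to run.

```python
import asyncio
import threading
import time


class ToyStore:
    """Hypothetical stand-in for a store whose writes must not overlap (not Qdrant)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the counters only
        self.active = 0
        self.max_active = 0
        self.writes = 0

    def insert(self, item):
        with self._guard:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # simulate a slow index update
        with self._guard:
            self.active -= 1
            self.writes += 1


class LockedWriter:
    """Hypothetical wrapper showing the proposed `_write_lock` pattern."""

    def __init__(self, store):
        self.store = store
        self._write_lock = asyncio.Lock()

    async def insert(self, item):
        # Only the store write is serialised; work done before this point
        # (embeddings, LLM calls) would still run concurrently.
        async with self._write_lock:
            await asyncio.to_thread(self.store.insert, item)


async def run_locked(n=8):
    store = ToyStore()
    writer = LockedWriter(store)  # created inside the running event loop
    await asyncio.gather(*(writer.insert(i) for i in range(n)))
    return store.max_active, store.writes


locked_overlap, total_writes = asyncio.run(run_locked())
print("max writers at once:", locked_overlap, "| total writes:", total_writes)
```

Note that an `asyncio.Lock` only coordinates coroutines on a single event loop. Writers running on another loop, in another thread, or in another process that share the same store would not be covered by it.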

A PR with this fix is attached.

## Discovery

This bug was identified and the fix developed collaboratively by @MattGyver and **Cipher** (an AI agent running on top of this stack). The corruption was observed in sustained production workloads with 20+ simultaneous `add()` calls over several months before the root cause was pinpointed.
