Bug Report: asyncio.run() incompatibility in Flask/threaded environments (Python 3.13)
Package: reme
Observed version: 0.3.0.5 (or latest — checking pip show reme)
Python version: 3.13.x
Environment: Flask threaded web server (Werkzeug), macOS
Summary
When ReMe is used as a singleton in a Flask application and asyncio.run() is called per-request to invoke summarize_memory() / retrieve_memory(), approximately 46% of calls fail with either:
RuntimeError: This event loop is already running — raised by asyncio.run() in Flask worker threads under Python 3.13
KeyError: 'default' — in base_op.py:136 at self.service_context.llms[self._llm], indicating the LLM was not found in the service context
Both errors surface as unhandled exceptions from asyncio.run(reme.summarize_memory(...)).
Reproduction Steps
pythonCopyCopied!
from flask import Flask, request, jsonify
import asyncio
import threading
from reme import ReMe
app = Flask(name)
reme_instance = None
lock = threading.Lock()
def get_reme():
global reme_instance
if reme_instance is not None:
return reme_instance
with lock:
if reme_instance is not None:
return reme_instance
reme_instance = ReMe(
working_dir="/tmp/reme-test",
default_llm_config={
"backend": "litellm",
"model_name": "google/gemini-2.0-flash-001",
"api_key": "YOUR_KEY",
"base_url": "https://openrouter.ai/api/v1",
},
enable_logo=False,
log_to_console=False,
)
asyncio.run(reme_instance.start()) # ← Initializes in event loop A
return reme_instance
@app.route("/summarize", methods=["POST"])
def summarize():
reme = get_reme()
# ← Each request creates event loop B, C, D... in different threads
# ← Fails ~46% of the time with RuntimeError or KeyError: 'default'
raw = asyncio.run(reme.summarize_memory(
messages=request.json["messages"],
task_name="test",
return_dict=True,
))
return jsonify({"ok": True})
if name == "main":
app.run(threaded=True, port=5000)
Send 10 concurrent POST requests. Approximately 4-5 will fail.
Error 1: RuntimeError: This event loop is already running
When: Python 3.13, Flask threaded mode, certain worker threads have pre-existing asyncio state.
CopyCopied!
RuntimeError: This event loop is already running
File "flask_app.py", line N, in summarize
raw = asyncio.run(reme.summarize_memory(...))
asyncio.run() docs say: "If there's a running event loop in the current thread, this function will raise RuntimeError." Python 3.13 changed event loop initialization behavior, making it more likely that threads have a loop in a non-None state.
Error 2: KeyError: 'default'
When: Python 3.13, after the initialization event loop (loop A) is closed by asyncio.run().
Full traceback:
CopyCopied!
File ".../reme/core/op/base_react.py", line 159, in execute
File ".../reme/memory/vector_based/reme_summarizer.py", line 19, in build_messages
File ".../reme/memory/vector_based/base_memory_agent.py", line 44, in author
File ".../reme/core/op/base_op.py", line 136, in llm
self._llm = self.service_context.llms[self._llm]
KeyError: 'default'
start() populates service_context.llms['default'] in event loop A. When loop A is closed by asyncio.run(), subsequent calls from event loop B encounter a state mismatch. The _handle_failure in base_op.py catches this but returns an empty result silently.
Impact
46% of summarize_memory() calls fail in production Flask deployments
_handle_failure catches KeyError: 'default' silently (returns empty result, not an exception) — callers get no indication of failure unless they check the answer field
The remaining ~54% succeed because they happen to avoid the event loop conflict
Workaround
Replace asyncio.run() with a persistent event loop using asyncio.run_coroutine_threadsafe():
pythonCopyCopied!
import asyncio
import threading
_loop = None
_loop_thread = None
_loop_lock = threading.Lock()
def get_loop():
global _loop, _loop_thread
with _loop_lock:
if _loop is None or _loop.is_closed():
_loop = asyncio.new_event_loop()
_loop_thread = threading.Thread(
target=_loop.run_forever, daemon=True
)
_loop_thread.start()
return _loop
def run_async(coro, timeout=60.0):
future = asyncio.run_coroutine_threadsafe(coro, get_loop())
return future.result(timeout=timeout)
Use run_async() instead of asyncio.run() everywhere:
run_async(reme_instance.start())
raw = run_async(reme_instance.summarize_memory(...))
Suggested Fix
ReMe should provide a thread-safe synchronous wrapper for use in non-async (e.g., Flask/Django) environments. Options:
Option A: Built-in sync wrapper methods
pythonCopyCopied!
ReMe adds:
reme.summarize_memory_sync(messages=..., task_name=..., return_dict=True)
reme.retrieve_memory_sync(query=..., return_dict=True)
Internally uses the persistent-loop pattern above.
Option B: Context manager for Flask integration
pythonCopyCopied!
with ReMe.sync_context() as reme:
result = reme.summarize_memory(...) # runs synchronously via persistent loop
Option C: Documentation
At minimum, document the incompatibility with asyncio.run() in threaded WSGI servers and provide the run_coroutine_threadsafe workaround in the README.
Additional Notes
ReMe's _handle_failure silently swallows KeyError: 'default' and returns an empty result. This makes the error extremely hard to diagnose without reading source code. Consider propagating this as a proper exception (e.g., ReMeNotInitializedError) so callers can detect and handle it.
The service_context.llms['default'] dict lookup happens in base_op.py:136 via the lazy llm property. Adding a guard (if not self.service_context.llms: raise RuntimeError("ReMe.start() was not called or failed")) would produce a clear, actionable error message.
Environment
Python: 3.13.x (macOS arm64)
reme: 0.3.0.5
flask: (any threaded mode)
litellm: (any version)
Tested on macOS 25.3.0 (Darwin), Apple Silicon.
Bug Report: asyncio.run() incompatibility in Flask/threaded environments (Python 3.13)
Package: reme
Observed version: 0.3.0.5 (or latest — checking pip show reme)
Python version: 3.13.x
Environment: Flask threaded web server (Werkzeug), macOS
Summary
When ReMe is used as a singleton in a Flask application and asyncio.run() is called per-request to invoke summarize_memory() / retrieve_memory(), approximately 46% of calls fail with either:
RuntimeError: This event loop is already running — raised by asyncio.run() in Flask worker threads under Python 3.13
KeyError: 'default' — in base_op.py:136 at self.service_context.llms[self._llm], indicating the LLM was not found in the service context
Both errors surface as unhandled exceptions from asyncio.run(reme.summarize_memory(...)).
Reproduction Steps
pythonCopyCopied!
from flask import Flask, request, jsonify
import asyncio
import threading
from reme import ReMe
app = Flask(name)
reme_instance = None
lock = threading.Lock()
def get_reme():
global reme_instance
if reme_instance is not None:
return reme_instance
with lock:
if reme_instance is not None:
return reme_instance
reme_instance = ReMe(
working_dir="/tmp/reme-test",
default_llm_config={
"backend": "litellm",
"model_name": "google/gemini-2.0-flash-001",
"api_key": "YOUR_KEY",
"base_url": "https://openrouter.ai/api/v1",
},
enable_logo=False,
log_to_console=False,
)
asyncio.run(reme_instance.start()) # ← Initializes in event loop A
return reme_instance
@app.route("/summarize", methods=["POST"])
def summarize():
reme = get_reme()
# ← Each request creates event loop B, C, D... in different threads
# ← Fails ~46% of the time with RuntimeError or KeyError: 'default'
raw = asyncio.run(reme.summarize_memory(
messages=request.json["messages"],
task_name="test",
return_dict=True,
))
return jsonify({"ok": True})
if name == "main":
app.run(threaded=True, port=5000)
Send 10 concurrent POST requests. Approximately 4-5 will fail.
Error 1: RuntimeError: This event loop is already running
When: Python 3.13, Flask threaded mode, certain worker threads have pre-existing asyncio state.
CopyCopied!
RuntimeError: This event loop is already running
File "flask_app.py", line N, in summarize
raw = asyncio.run(reme.summarize_memory(...))
asyncio.run() docs say: "If there's a running event loop in the current thread, this function will raise RuntimeError." Python 3.13 changed event loop initialization behavior, making it more likely that threads have a loop in a non-None state.
Error 2: KeyError: 'default'
When: Python 3.13, after the initialization event loop (loop A) is closed by asyncio.run().
Full traceback:
CopyCopied!
File ".../reme/core/op/base_react.py", line 159, in execute
File ".../reme/memory/vector_based/reme_summarizer.py", line 19, in build_messages
File ".../reme/memory/vector_based/base_memory_agent.py", line 44, in author
File ".../reme/core/op/base_op.py", line 136, in llm
self._llm = self.service_context.llms[self._llm]
KeyError: 'default'
start() populates service_context.llms['default'] in event loop A. When loop A is closed by asyncio.run(), subsequent calls from event loop B encounter a state mismatch. The _handle_failure in base_op.py catches this but returns an empty result silently.
Impact
46% of summarize_memory() calls fail in production Flask deployments
_handle_failure catches KeyError: 'default' silently (returns empty result, not an exception) — callers get no indication of failure unless they check the answer field
The remaining ~54% succeed because they happen to avoid the event loop conflict
Workaround
Replace asyncio.run() with a persistent event loop using asyncio.run_coroutine_threadsafe():
pythonCopyCopied!
import asyncio
import threading
_loop = None
_loop_thread = None
_loop_lock = threading.Lock()
def get_loop():
global _loop, _loop_thread
with _loop_lock:
if _loop is None or _loop.is_closed():
_loop = asyncio.new_event_loop()
_loop_thread = threading.Thread(
target=_loop.run_forever, daemon=True
)
_loop_thread.start()
return _loop
def run_async(coro, timeout=60.0):
future = asyncio.run_coroutine_threadsafe(coro, get_loop())
return future.result(timeout=timeout)
Use run_async() instead of asyncio.run() everywhere:
run_async(reme_instance.start())
raw = run_async(reme_instance.summarize_memory(...))
Suggested Fix
ReMe should provide a thread-safe synchronous wrapper for use in non-async (e.g., Flask/Django) environments. Options:
Option A: Built-in sync wrapper methods
pythonCopyCopied!
ReMe adds:
reme.summarize_memory_sync(messages=..., task_name=..., return_dict=True)
reme.retrieve_memory_sync(query=..., return_dict=True)
Internally uses the persistent-loop pattern above.
Option B: Context manager for Flask integration
pythonCopyCopied!
with ReMe.sync_context() as reme:
result = reme.summarize_memory(...) # runs synchronously via persistent loop
Option C: Documentation
At minimum, document the incompatibility with asyncio.run() in threaded WSGI servers and provide the run_coroutine_threadsafe workaround in the README.
Additional Notes
ReMe's _handle_failure silently swallows KeyError: 'default' and returns an empty result. This makes the error extremely hard to diagnose without reading source code. Consider propagating this as a proper exception (e.g., ReMeNotInitializedError) so callers can detect and handle it.
The service_context.llms['default'] dict lookup happens in base_op.py:136 via the lazy llm property. Adding a guard (if not self.service_context.llms: raise RuntimeError("ReMe.start() was not called or failed")) would produce a clear, actionable error message.
Environment
Python: 3.13.x (macOS arm64)
reme: 0.3.0.5
flask: (any threaded mode)
litellm: (any version)
Tested on macOS 25.3.0 (Darwin), Apple Silicon.