-
Notifications
You must be signed in to change notification settings - Fork 5
Description
Overview
The Admin API generations list endpoint is slow in production because it returns all generations in a single request.
Observed latency is ~4.5s for ~263 generations.
In addition, there are two other admin APIs that are also slow, specifically /runs and /prompts
Reproduction
time curl -X GET "https://admin-api.mcbench.ai/api/generation" \
-H "Authorization: Bearer YOUR_TOKEN_HERE" \
-H "Content-Type: application/json"To obtain the bearer, go to mcbench.ai, to go chrome dev tools, run localStorage.getItem('token') and copy the JWT.
You'll get a response in around 4.5 seconds.
You can also see this visually by navigating to https://mcbench.ai/generations
Proposed direction
I haven't verified it 100% but it looks like an issue with N+1 queries
generations = db.scalars(
select(Generation).order_by(Generation.created.desc())
).all()
payload = {
"data": [
generation.to_dict(include_runs=False, include_stats=True)
for generation in generations
],
"total": len(generations),
}
That list comprehension is N+1
Maybe we need to create some joins / groups by or an index too to make the join a bit quicker?
Below was a suggestion from Claude.
if generation_ids:
run_counts = db.execute(
select(
schema.specification.run.c.generation_id,
func.count(1).label("run_count"),
)
.where(schema.specification.run.c.generation_id.in_(generation_ids))
.group_by(schema.specification.run.c.generation_id)
).all()
Not a pro at system design