Conversation storage: schema extension and per-request scoping

### Problem

The `ConversationStorage` interface currently has a fixed schema (`id`, `created_at`, `metadata`) and no awareness of per-request context. Consumers that need tenant isolation (e.g., scoping conversations to an `organization_id`) are forced into workarounds:

- **Stuffing system fields into metadata** — pollutes the end-user-facing metadata with internal concerns (`organization_id` is not consumer data)
- **Wrapping storage with `AsyncLocalStorage`** — adds complexity and a non-standard pattern
- **Hacking requests in `onRequest`** — reconstructing `Request` objects to inject/strip query params and body fields

The core issue is twofold:
1. **No schema extension** — consumers can't add custom columns to conversation tables
2. **No per-request context in storage** — the storage is a singleton with no access to the request's `state` bag

### Use Case

A multi-tenant platform where each organization has its own conversations stored in the same GreptimeDB instance. Every write must include `organization_id`, and every read must filter by it. This isolation must be enforced at the storage boundary rather than at the HTTP level, so that no individual call site can forget it.

Future consumers may need additional scoping fields (e.g., `project_id`, `environment`), so the mechanism should be general-purpose rather than hardcoded to a single field.

---

## Proposed Approaches

Four approaches were evaluated. All share the same `additionalFields` mechanism for schema extension (DDL) but differ in how runtime behavior is handled.

### `additionalFields` (shared across all approaches)

Following better-auth's pattern, `additionalFields` declares the schema — what the field is, not how it's stored. Database-level concerns (primary keys, indexes, partitioning) are handled internally by each dialect's migration logic.

```typescript
additionalFields: {
  conversations: {
    organization_id: {
      type: "string",
      required: true,
    },
  },
}
```

On `migrate()`, the dialect adds the column to the DDL. For GreptimeDB, the dialect can automatically include `required` additional fields in the primary key to optimise partition/search performance — the consumer doesn't need to specify this.
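To make the DDL step concrete, here is a minimal sketch of how a dialect might turn `additionalFields` into column definitions and primary-key members. The type names, the `planColumns` helper, and the SQL type mapping are assumptions made for illustration; the proposal does not specify them.

```typescript
// Hypothetical shapes: the gateway's real types are not shown in this proposal.
type FieldType = "string" | "number" | "boolean";

interface AdditionalField {
  type: FieldType;
  required?: boolean;
}

type AdditionalFields = Record<string, Record<string, AdditionalField>>;

// Assumed mapping from declared field types to SQL column types.
const SQL_TYPES: Record<FieldType, string> = {
  string: "STRING",
  number: "DOUBLE",
  boolean: "BOOLEAN",
};

// Builds the extra column definitions and primary-key members for one table.
function planColumns(fields: AdditionalFields, table: string) {
  const declared = fields[table] || {};
  const columns: string[] = [];
  const primaryKey: string[] = [];
  for (const name of Object.keys(declared)) {
    const field = declared[name];
    const nullability = field.required ? " NOT NULL" : "";
    columns.push(`${name} ${SQL_TYPES[field.type]}${nullability}`);
    // Dialect decision described above: required fields join the primary key.
    if (field.required) primaryKey.push(name);
  }
  return { columns, primaryKey };
}

const plan = planColumns(
  { conversations: { organization_id: { type: "string", required: true } } },
  "conversations",
);
console.log(plan.columns, plan.primaryKey);
```

The consumer only declares `type` and `required`; whether a field joins the primary key stays a dialect decision inside the migration logic.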

---

### Approach 1: Declarative Fields with `resolve`

Extend `additionalFields` with a `resolve` function that maps `state` to a value. The gateway auto-injects on writes and auto-filters on reads.

```typescript
gateway({
  storage: {
    dialect: GrepTimeDialect(client),
    additionalFields: {
      conversations: {
        organization_id: {
          type: "string",
          required: true,
          resolve: (state) => state.organizationId as string,
        },
      },
    },
  },
});
```

The gateway handles everything internally:
- **DDL**: Adds the column on `migrate()`
- **Writes** (create, update): Calls `resolve(state)` and injects the value
- **Reads and deletes** (list, get, delete): Calls `resolve(state)` and adds the value to the `WHERE` clause
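The internal handling listed above could look roughly like the following. This is a sketch under assumptions: `resolveScope`, `scopeWrite`, and `scopeWhere` are hypothetical helper names, and failing when a required value is missing is one possible policy, not something the proposal specifies.

```typescript
// Hypothetical types for illustration; the gateway's real types are not shown in this proposal.
type State = Record<string, unknown>;
type Row = Record<string, unknown>;

interface ResolvedField {
  type: "string";
  required?: boolean;
  resolve: (state: State) => unknown;
}

// Evaluates every `resolve` once per request and fails closed on missing required values.
function resolveScope(fields: Record<string, ResolvedField>, state: State): Row {
  const scope: Row = {};
  for (const name of Object.keys(fields)) {
    const value = fields[name].resolve(state);
    if (fields[name].required && (value === undefined || value === null)) {
      throw new Error(`Missing required storage field: ${name}`);
    }
    scope[name] = value;
  }
  return scope;
}

// Writes: resolved values are spread last, so caller-supplied data cannot override them.
function scopeWrite(data: Row, scope: Row): Row {
  return { ...data, ...scope };
}

// Reads and deletes: resolved values become equality filters in the WHERE clause.
function scopeWhere(where: Row, scope: Row): Row {
  return { ...where, ...scope };
}

const fields: Record<string, ResolvedField> = {
  organization_id: {
    type: "string",
    required: true,
    resolve: (state) => state.organizationId,
  },
};

const scope = resolveScope(fields, { organizationId: "org_1" });
const written = scopeWrite({ id: "c1", organization_id: "spoofed" }, scope);
const filter = scopeWhere({ id: "c1" }, scope);
```

In this sketch the `"spoofed"` value is overwritten on write, and a request without `organizationId` in its state throws instead of running an unscoped query.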

**Pros:**
- Simplest consumer API — declare once, isolation is automatic everywhere
- Impossible to forget a filter — enforced on every operation by the gateway
- Schema and binding are co-located in one declaration
- Zero boilerplate

**Cons:**
- Rigid — every field follows the same "inject on write, filter on read" pattern
- No conditional logic per operation (e.g., different behavior for create vs. update)
- No post-processing or after-read transformation
- The gateway takes on more responsibility internally

---

### Approach 2: `additionalFields` + Storage Hooks (by phase)

Schema extension for DDL, with explicit named hooks split by read/write phase.

```typescript
gateway({
  storage: {
    dialect: GrepTimeDialect(client),
    additionalFields: {
      conversations: {
        organization_id: { type: "string", required: true },
      },
    },
    hooks: {
      onBeforeWrite: ({ operation, resource, data, state }) => {
        return { ...data, organization_id: state.organizationId };
      },
      onBeforeRead: ({ operation, resource, query, state }) => {
        return { ...query, organization_id: state.organizationId };
      },
      onAfterRead: ({ operation, resource, result, state }) => {
        // strip internal fields, enrich, audit, etc.
        return result;
      },
    },
  },
});
```
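For orientation, this is one way the gateway could run the read-phase hooks around a `list` call. Only the hook names come from the configuration above; `listWithHooks`, `listRows`, and the in-memory row array are hypothetical stand-ins, and the real storage calls would presumably be asynchronous.

```typescript
// Hypothetical shapes for illustration; only the hook names come from the proposal.
type State = Record<string, unknown>;
type Row = Record<string, unknown>;

interface ReadHooks {
  onBeforeRead?: (ctx: { operation: string; resource: string; query: Row; state: State }) => Row;
  onAfterRead?: (ctx: { operation: string; resource: string; result: Row[]; state: State }) => Row[];
}

// In-memory stand-in for the dialect: equality match on every key of `query`.
function listRows(rows: Row[], query: Row): Row[] {
  return rows.filter((row) => Object.keys(query).every((key) => row[key] === query[key]));
}

// One way the gateway could run a `list`: before-hook, storage call, after-hook.
function listWithHooks(rows: Row[], query: Row, state: State, hooks: ReadHooks): Row[] {
  const base = { operation: "list", resource: "conversation", state };
  const scoped = hooks.onBeforeRead ? hooks.onBeforeRead({ ...base, query }) : query;
  const result = listRows(rows, scoped);
  return hooks.onAfterRead ? hooks.onAfterRead({ ...base, result }) : result;
}

const hooks: ReadHooks = {
  onBeforeRead: ({ query, state }) => ({ ...query, organization_id: state.organizationId }),
  // Strip the internal column before results reach the consumer.
  onAfterRead: ({ result }) =>
    result.map((row) => {
      const copy = { ...row };
      delete copy.organization_id;
      return copy;
    }),
};

const rows: Row[] = [
  { id: "c1", organization_id: "org_1" },
  { id: "c2", organization_id: "org_2" },
];
const visible = listWithHooks(rows, {}, { organizationId: "org_1" }, hooks);
```

Calling `listWithHooks` with an empty hooks object returns both rows, which shows that isolation in this approach depends entirely on the consumer's hooks.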

**Pros:**
- Clean separation — `additionalFields` handles DDL, hooks handle behavior
- Flexible — conditional logic per operation/resource, post-processing via `onAfterRead`
- Typed, named hooks are easy to reason about

**Cons:**
- Consumer writes explicit inject/filter logic — more boilerplate than Approach 1
- Not auto-enforced — consumer can forget to handle an operation/resource combination
- Two hooks (`onBeforeWrite` + `onBeforeRead`) must stay in sync — source of drift
- Single hook per slot (not composable like middleware)

---

### Approach 3: `additionalFields` + Storage Middleware

Schema extension for DDL, with a generic middleware chain that wraps every storage operation.

```typescript
gateway({
  storage: {
    dialect: GrepTimeDialect(client),
    additionalFields: {
      conversations: {
        organization_id: { type: "string", required: true },
      },
    },
    middleware: async ({ operation, resource, args, state, next }) => {
      const orgId = state.organizationId as string;

      if (resource === "conversation") {
        if (operation === "create" || operation === "update") {
          args.data = { ...args.data, organization_id: orgId };
        }
        if (operation === "list" || operation === "get" || operation === "delete") {
          args.where = { ...args.where, organization_id: orgId };
        }
      }

      return next();
    },
  },
});
```

Multiple middlewares can be composed as an array:

```typescript
middleware: [tenantIsolation, auditLog, rateLimiter]
```
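A minimal sketch of how such an array could be chained in the onion model follows. The `compose` helper and the `StorageCall` shape are assumptions, not part of the proposal, and the middlewares are synchronous here to keep the example short.

```typescript
// Hypothetical shapes for illustration; not the gateway's actual API.
interface StorageCall {
  operation: string;
  resource: string;
  args: Record<string, unknown>;
  state: Record<string, unknown>;
}

type Middleware = (call: StorageCall & { next: () => unknown }) => unknown;

// Chains middlewares so the first one is outermost; `terminal` is the real storage call.
function compose(middlewares: Middleware[], terminal: (call: StorageCall) => unknown) {
  return (call: StorageCall): unknown => {
    const dispatch = (index: number): unknown => {
      if (index === middlewares.length) return terminal(call);
      return middlewares[index]({ ...call, next: () => dispatch(index + 1) });
    };
    return dispatch(0);
  };
}

const trace: string[] = [];

const tenantIsolation: Middleware = ({ args, state, next }) => {
  // `args` is shared by reference, so this mutation is visible downstream.
  args.where = { organization_id: state.organizationId };
  trace.push("tenant:before");
  const result = next();
  trace.push("tenant:after");
  return result;
};

const auditLog: Middleware = ({ operation, next }) => {
  trace.push(`audit:${operation}`);
  return next();
};

const run = compose([tenantIsolation, auditLog], (call) => {
  trace.push("storage");
  return call.args.where;
});

const out = run({
  operation: "list",
  resource: "conversation",
  args: {},
  state: { organizationId: "org_1" },
});
console.log(trace.join(" > "), out);
```

Return values pass through `compose` untouched, so middlewares that return promises would chain the same way, although their "after" logic would need to `await next()` first.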

**Pros:**
- Maximum flexibility — full control over every operation with before/after semantics
- Composable — multiple middlewares chain via `next()` (onion model)
- Single extension point for all behaviors

**Cons:**
- Most complex to implement in the gateway
- `args` is untyped per-operation — consumer must know each operation's arg shape
- Most boilerplate for simple cases
- Middleware ordering adds cognitive overhead
- Not auto-enforced — same risk of forgetting a filter as Approach 2

---

### Approach 4: `additionalFields` + Operation Hooks (Prisma-style) ⭐

Schema extension for DDL, with one hook per storage operation (`create`, `update`, `delete`, `list`, `get`). Each hook receives operation-specific args (`data`, `id`, `params`), a `query` function to call the underlying storage (Prisma naming), and a nested `context` object with request-level data like `state`. Calling `query` returns the result, so the hook can modify inputs before and transform outputs after.

```typescript
gateway({
  storage: {
    dialect: GrepTimeDialect(client),
    additionalFields: {
      conversations: {
        organization_id: { type: "string", required: true },
      },
    },
    hooks: {
      create: ({ data, context, query }) => {
        return query({ ...data, organization_id: context.state.organizationId });
      },
      update: ({ id, data, context, query }) => {
        return query(id, { ...data, organization_id: context.state.organizationId });
      },
      delete: ({ id, context, query }) => {
        return query(id, { where: { organization_id: context.state.organizationId } });
      },
      list: ({ params, context, query }) => {
        return query({
          ...params,
          where: { ...params.where, organization_id: context.state.organizationId },
        });
      },
      get: async ({ id, context, query }) => {
        const result = await query(id, {
          where: { organization_id: context.state.organizationId },
        });
        // can sanitize/strip internal fields before returning
        return result;
      },
    },
  },
});
```

Each hook receives:
- **`resource`** — `"conversation"` or `"item"`
- **`context`** — nested request-level context (`request`, `state`, etc.), kept separate from storage-level args (better-auth pattern)
- **`query`** — executes the underlying storage operation; enables post-processing of results
- **Operation-specific args** — `data` for writes, `id` for single-entity operations, `params` for listing
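To illustrate how `query` ties a hook to the underlying operation, here is a reduced sketch covering only `create` and `get`. `MemoryStore` and `withHooks` are hypothetical, and the store is synchronous for brevity, whereas the proposal's `get` hook is `async`.

```typescript
// Hypothetical shapes for illustration; only the hook argument names come from the proposal.
type State = Record<string, unknown>;
type Row = Record<string, unknown>;
interface HookContext { state: State; }
interface GetOptions { where?: Row; }

interface OperationHooks {
  create?: (input: {
    resource: string; data: Row; context: HookContext;
    query: (data: Row) => Row;
  }) => Row;
  get?: (input: {
    resource: string; id: string; context: HookContext;
    query: (id: string, options?: GetOptions) => Row | undefined;
  }) => Row | undefined;
}

// Synchronous in-memory stand-in for the dialect, to keep the sketch short.
class MemoryStore {
  private rows: Row[] = [];
  create(data: Row): Row {
    this.rows.push(data);
    return data;
  }
  get(id: string, options?: GetOptions): Row | undefined {
    const where = (options && options.where) || {};
    return this.rows.filter(
      (row) => row.id === id && Object.keys(where).every((key) => row[key] === where[key]),
    )[0];
  }
}

// Binds hooks to one request: each hook gets `query`, a handle on the raw operation.
function withHooks(store: MemoryStore, hooks: OperationHooks, context: HookContext) {
  const resource = "conversation";
  return {
    create(data: Row): Row {
      const query = (next: Row) => store.create(next);
      return hooks.create ? hooks.create({ resource, data, context, query }) : query(data);
    },
    get(id: string): Row | undefined {
      const query = (nextId: string, options?: GetOptions) => store.get(nextId, options);
      return hooks.get ? hooks.get({ resource, id, context, query }) : query(id);
    },
  };
}

const tenantHooks: OperationHooks = {
  create: ({ data, context, query }) =>
    query({ ...data, organization_id: context.state.organizationId }),
  get: ({ id, context, query }) =>
    query(id, { where: { organization_id: context.state.organizationId } }),
};

const store = new MemoryStore();
const orgA = withHooks(store, tenantHooks, { state: { organizationId: "org_a" } });
const orgB = withHooks(store, tenantHooks, { state: { organizationId: "org_b" } });
orgA.create({ id: "c1" });
```

In this sketch `orgA.get("c1")` finds the row while `orgB.get("c1")` finds nothing, because each hook adds its own request's `organization_id` before calling `query`.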

**Pros:**
- One hook per operation with typed args — `create` gets `data`, `delete` gets `id`, `list` gets `params`, `get` gets `id`
- Each hook wraps the full operation — modify inputs and transform outputs in one place
- No branching on operation type — each hook handles exactly one operation
- Familiar pattern (Prisma `$extends({ query })`)

**Cons:**
- Not auto-enforced — consumer can forget to handle a hook
- Not composable (single hook per slot, not a chain)
- Slightly more boilerplate than Approach 1 for simple cases

---

## Comparison

| | 1: Declarative | 2: Phase Hooks | 3: Middleware | 4: Operation Hooks |
|---|---|---|---|---|
| **Consumer effort** | None | Low-Medium | Medium | Low |
| **Safety (can't leak)** | Highest | Medium | Medium | Medium |
| **Flexibility** | Low-Medium | High | Highest | High |
| **Typed per-operation** | N/A | No (branches) | No (branches) | Yes |
| **Post-processing** | No | Yes (`onAfterRead`) | Yes | Yes (after `query()`) |
| **Before + after in one place** | N/A | No (separate hooks) | Yes | Yes |
| **Composability** | N/A | No | Yes (chain) | No |
| **Implementation effort** | Medium | Medium | High | Medium |
| **Closest analogy** | Prisma auto-inject | better-auth hooks | Express middleware | Prisma `$extends({ query })` |

---

## Shared Prerequisite

All four approaches require the same foundational change: **threading the per-request `state` from `gw.handler(request, state)` into the storage layer**. Today, storage is a context-free singleton. The `state` bag already flows through hooks — it needs to also reach storage operations.
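One possible shape for this prerequisite is sketched below: the singleton takes `state` as an extra argument, and the handler binds it into a thin per-request view. `StatefulStorage`, `RequestScopedStorage`, `bindState`, and `handle` are all hypothetical names; the proposal does not prescribe this design.

```typescript
// Hypothetical shapes for illustration; the real `ConversationStorage` interface is not shown here.
type State = Record<string, unknown>;
type Row = Record<string, unknown>;

interface StatefulStorage {
  // Every operation accepts the request's state as an extra argument.
  create(data: Row, state: State): Row;
}

interface RequestScopedStorage {
  create(data: Row): Row;
}

// The singleton stays shared; only this thin view is created per request.
function bindState(storage: StatefulStorage, state: State): RequestScopedStorage {
  return { create: (data) => storage.create(data, state) };
}

// Stand-in singleton that stamps each write with the tenant from `state`.
const singleton: StatefulStorage = {
  create: (data, state) => ({ ...data, organization_id: state.organizationId }),
};

// Stand-in for the body of `gw.handler(request, state)`.
function handle(state: State): Row {
  const requestStorage = bindState(singleton, state);
  return requestStorage.create({ id: "c1" });
}

console.log(handle({ organizationId: "org_a" }));
console.log(handle({ organizationId: "org_b" }));
```

Passing `state` explicitly keeps the data flow visible in the types, in contrast to the `AsyncLocalStorage` workaround described in the Problem section.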

## Open Questions

1. Should `additionalFields` also apply to `conversation_items`, or only `conversations`?


