sdk: typed prerun/postrun #551

Merged
dbentley-vsql merged 4 commits into main from dbentley/prerun_api_wrapping
May 22, 2026
@dbentley-vsql dbentley-vsql commented May 20, 2026

Member

sdk: typed prerun/postrun hooks

Introduce typed C++ wrappers for the per-statement lifecycle hooks
(prerun and postrun). Extension authors no longer touch raw ABI
structs to write these hooks; only the typed signatures are accepted.

Why

Extension authors writing prerun/postrun previously had to:

  void my_prerun(vef_context_t *, vef_prerun_args_t *args,
                 vef_prerun_result_t *result) {
    if (args->arg_count == 0) {
      result->type = VEF_RESULT_ERROR;
      snprintf(result->error_msg, VEF_MAX_ERROR_LEN, "...");
      return;
    }
    result->result_buffer_size = N;
    result->user_data = new MyState{};
  }

Two problems: raw ABI in the user-facing API (against the SDK's
direction; see #548) and bespoke error-reporting plumbing.

After this PR:

  void my_prerun(vsql::PrerunArgs args, vsql::PrerunResult out) {
    if (args.size() == 0) {
      out.error("at least one argument required");
      return;
    }
    out.request_buffer_size(N);
    out.set_user_data(new MyState{});
  }

  void my_postrun(vsql::PostrunArgs args) {
    args.delete_state<MyState>();
  }

  void my_vdf(MyState &state, IntResult out) { ... }

State lifetime stays explicit in this PR: prerun stashes the pointer,
postrun frees it. A follow-up will add auto-cleanup (see "Not in this
PR" below).
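To make the explicit lifetime concrete, here is a minimal, self-contained sketch of how thin typed wrappers over raw structs could look. Everything in the `sketch` namespace is a stand-in invented for illustration (the raw struct layouts, `kMaxErrorLen`, and the buffer size of 64 are assumptions); it is not the SDK's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

namespace sketch {

// Stand-ins for the raw ABI structs; the real layouts are assumptions here.
constexpr std::size_t kMaxErrorLen = 256;
enum RawResultType { RAW_RESULT_OK, RAW_RESULT_ERROR };

struct RawPrerunArgs { unsigned arg_count; };
struct RawPrerunResult {
  RawResultType type = RAW_RESULT_OK;
  char error_msg[kMaxErrorLen] = {0};
  std::size_t result_buffer_size = 0;
  void *user_data = nullptr;
};
struct RawPostrunArgs { void *user_data; };

// Thin, copyable views: each holds a pointer to a raw struct and nothing else.
class PrerunArgs {
 public:
  explicit PrerunArgs(const RawPrerunArgs *a) : a_(a) { assert(a != nullptr); }
  std::size_t size() const { return a_->arg_count; }
 private:
  const RawPrerunArgs *a_;
};

class PrerunResult {
 public:
  explicit PrerunResult(RawPrerunResult *r) : r_(r) { assert(r != nullptr); }
  // One call replaces the "set type + snprintf" pair from the raw version.
  void error(const char *msg) {
    r_->type = RAW_RESULT_ERROR;
    std::snprintf(r_->error_msg, kMaxErrorLen, "%s", msg);
  }
  void request_buffer_size(std::size_t n) { r_->result_buffer_size = n; }
  void set_user_data(void *p) { r_->user_data = p; }
 private:
  RawPrerunResult *r_;
};

class PostrunArgs {
 public:
  explicit PostrunArgs(RawPostrunArgs *a) : a_(a) { assert(a != nullptr); }
  // The caller names the type because user_data is an untyped pointer.
  template <typename T>
  void delete_state() {
    delete static_cast<T *>(a_->user_data);
    a_->user_data = nullptr;
  }
 private:
  RawPostrunArgs *a_;
};

struct MyState { int calls = 0; };

void my_prerun(PrerunArgs args, PrerunResult out) {
  if (args.size() == 0) {
    out.error("at least one argument required");
    return;
  }
  out.request_buffer_size(64);
  out.set_user_data(new MyState{});  // stashed here ...
}

void my_postrun(PostrunArgs args) {
  args.delete_state<MyState>();      // ... and freed here
}

}  // namespace sketch
```

In this sketch the wrappers are passed by value because they are only views; ownership of `MyState` still lives in the raw `user_data` slot.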

Where to start reviewing

Read the layers in this order:

  1. villagesql/sdk/include/villagesql/vsql/pre_post_run.h
    New public types: PrerunArgs, PrerunArgType, PrerunResult,
    PostrunArgs. This is what extension authors see.

  2. villagesql/examples/vsql-simple/src/extension.cc
    The ba_call_index demo: set_user_data in prerun, mutable State&
    in the VDF, delete_state in postrun.

  3. mysql-test/suite/villagesql/extension/t/extension_simple_type_usage.test
    What it looks like at the SQL surface.

  4. villagesql/sdk/include/villagesql/extension.h
    Refreshed user-facing docs for the typed hook shape.

The remaining file is SDK plumbing:

  5. villagesql/sdk/include/villagesql/vsql/func_builder.h
    • .prerun<>() / .postrun<>() are typed-only; static_assert
      rejects raw vef_prerun_func_t / vef_postrun_func_t.
    • WrapperTypedState / WrapperVoidState dispatch on the first
      param of the VDF: State& / const State& / void* selects how
      user_data is forwarded.
    • typed_prerun_thunk / typed_postrun_thunk adapt the user's
      typed signature to the raw ABI shape the server invokes.
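As a rough illustration of the adapter idea in the last bullet, the sketch below generates a raw-shaped function from a typed hook known at compile time, and uses a static_assert to reject hooks that do not have the typed signature. All names here (`wrap_sketch`, `RawContext`, `typed_prerun_wrapper`, `make_prerun`) are hypothetical stand-ins, and the real func_builder.h may be structured quite differently.

```cpp
#include <cassert>
#include <type_traits>

namespace wrap_sketch {

// Stand-ins for the raw ABI types (assumed shapes).
struct RawContext {};
struct RawPrerunArgs { unsigned arg_count; };
struct RawPrerunResult { bool is_error = false; };

// Typed views handed to extension code.
struct PrerunArgs {
  const RawPrerunArgs *raw;
  unsigned size() const { return raw->arg_count; }
};
struct PrerunResult {
  RawPrerunResult *raw;
  void error() { raw->is_error = true; }
};

// The function-pointer shape the server is assumed to call through.
using RawPrerunFn = void (*)(RawContext *, RawPrerunArgs *, RawPrerunResult *);

// Adapts a typed hook, passed as a template argument, to the raw shape.
template <auto Fn>
void typed_prerun_wrapper(RawContext *, RawPrerunArgs *args,
                          RawPrerunResult *result) {
  static_assert(
      std::is_invocable_r_v<void, decltype(Fn), PrerunArgs, PrerunResult>,
      "prerun hook must have the signature void(PrerunArgs, PrerunResult)");
  assert(args != nullptr && result != nullptr);
  Fn(PrerunArgs{args}, PrerunResult{result});
}

template <auto Fn>
constexpr RawPrerunFn make_prerun() {
  return &typed_prerun_wrapper<Fn>;
}

// A typed hook: accepted.
void needs_one_arg(PrerunArgs args, PrerunResult out) {
  if (args.size() == 0) out.error();
}

// A raw-shaped hook: make_prerun<&raw_hook>() would fail the static_assert.
void raw_hook(RawContext *, RawPrerunArgs *, RawPrerunResult *) {}

}  // namespace wrap_sketch
```

Passing the hook as a non-type template argument means each hook gets its own wrapper instantiation, so no extra runtime slot is needed to remember which typed function to call.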

Not in this PR

  • Auto-cleanup for SDK-allocated state. Requires an ABI side channel
    (e.g. a user_data_deleter slot on vef_prerun_result_t) so the SDK
    can register a destructor independent of the user_data pointer
    that extensions also use for raw/custom allocations. Marked
    TODO(villagesql-beta) in pre_post_run.h.
  • A .state<T>() builder that records State at the template level
    and routes typed signatures of the form void(T&, ...). Will land
    with the auto-cleanup mechanism above.
  • Typed VarArgs/AnyArg wrappers for varargs VDFs (the other raw-ABI
    place in extension code, only reached via PR #254, "sdk: prerun+varargs").
  • vef_postrun_result_t is empty in the ABI today, so there is no
    PostrunResult wrapper. Will add when the struct grows.
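For readers unfamiliar with the "deleter slot" idea in the first bullet, here is a small sketch of what such a side channel could look like. The `user_data_deleter` field name comes from the bullet above; the struct layout and the `delete_as`, `emplace_state`, and `release_state` helpers are assumptions made up for this example, not an existing API.

```cpp
#include <cassert>
#include <utility>

namespace deleter_sketch {

// Hypothetical raw result struct with a deleter stored next to user_data.
struct RawPrerunResult {
  void *user_data = nullptr;
  void (*user_data_deleter)(void *) = nullptr;
};

// Type-erased destructor: remembers T in the function, not in the pointer.
template <typename T>
void delete_as(void *p) {
  delete static_cast<T *>(p);
}

// SDK-side helper: allocate the state and record how to destroy it.
template <typename T, typename... Args>
T &emplace_state(RawPrerunResult &r, Args &&...args) {
  assert(r.user_data == nullptr);
  T *state = new T(std::forward<Args>(args)...);
  r.user_data = state;
  r.user_data_deleter = &delete_as<T>;
  return *state;
}

// What the end-of-statement cleanup could do. A raw/custom allocation that
// left the deleter null is not freed here; its owner stays responsible.
inline void release_state(RawPrerunResult &r) {
  if (r.user_data_deleter != nullptr) r.user_data_deleter(r.user_data);
  r.user_data = nullptr;
  r.user_data_deleter = nullptr;
}

// Test helper type that counts live instances.
struct Tracked {
  int *live;
  explicit Tracked(int *l) : live(l) { ++*live; }
  ~Tracked() { --*live; }
};

}  // namespace deleter_sketch
```

The point of keeping the deleter separate from `user_data` is that extensions using raw or custom allocations can leave it null and keep managing the pointer themselves.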

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@dbentley-vsql dbentley-vsql force-pushed the dbentley/prerun_api_wrapping branch 2 times, most recently from 42a3436 to 787481d Compare May 20, 2026 17:10
@dbentley-vsql dbentley-vsql changed the title pre/postrun improvements sdk: typed prerun/postrun + emplace May 20, 2026
@dbentley-vsql dbentley-vsql marked this pull request as ready for review May 20, 2026 17:12
@dbentley-vsql dbentley-vsql force-pushed the dbentley/prerun_api_wrapping branch from 787481d to 1363ce3 Compare May 20, 2026 17:19
@dbentley-vsql dbentley-vsql changed the title sdk: typed prerun/postrun + emplace sdk: typed prerun/postrun May 21, 2026
@dbentley-vsql dbentley-vsql mentioned this pull request May 21, 2026
@dbentley-vsql dbentley-vsql force-pushed the dbentley/prerun_api_wrapping branch from 330a160 to 711fc10 Compare May 21, 2026 13:04
vef_vdf_result_t *result,
std::index_sequence<Is...>) {
using Params = typename FuncParamTypes<decltype(Func)>::type;
State &state = *static_cast<State *>(args->user_data);

Member

A VDF whose first parameter is State& or void* can be registered without .prerun(), but this wrapper unconditionally dereferences args->user_data. That makes the new public SDK shape a runtime null-deref footgun. Can we reject state-parameter signatures unless a prerun/state setup is attached, or defer this shape until the managed .state<T>() API exists?

Member Author

Yes for State&; I believe not for void*.

For void*, it's plausible to imagine something like "I might not need user_data, so I won't write it in pre_run, but will in the per-row function".
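The distinction being drawn here can be sketched as a compile-time dispatch on the VDF's first parameter: the State& shape has to dereference user_data, while the void* shape only forwards it. This is a simplified, hypothetical model (the VDFs return int instead of writing to a result wrapper, and `FirstParam`, `call_with_state`, and `RawVdfArgs` are invented names), not the SDK's WrapperTypedState / WrapperVoidState code.

```cpp
#include <cassert>
#include <type_traits>

namespace dispatch_sketch {

struct RawVdfArgs { void *user_data; };

// Extracts the first parameter type of a free-function pointer.
template <typename F>
struct FirstParam;
template <typename R, typename First, typename... Rest>
struct FirstParam<R (*)(First, Rest...)> {
  using type = First;
};

template <auto Fn>
int call_with_state(RawVdfArgs *args) {
  using First = typename FirstParam<decltype(Fn)>::type;
  if constexpr (std::is_same_v<First, void *>) {
    // void* shape: forward as-is; a null pointer is a legal value here.
    return Fn(args->user_data);
  } else {
    static_assert(std::is_lvalue_reference_v<First>,
                  "first parameter must be State& or void*");
    using State = std::remove_reference_t<First>;
    // State& shape: a reference cannot be null, so something (a prerun)
    // must have populated user_data before this point.
    assert(args->user_data != nullptr);
    return Fn(*static_cast<State *>(args->user_data));
  }
}

struct Counter { int n = 0; };

int bump(Counter &c) { return ++c.n; }
int maybe_state(void *p) { return p == nullptr ? -1 : 1; }

}  // namespace dispatch_sketch
```

In this model the runtime assert marks exactly the spot the review comment worries about; a compile-time requirement that a prerun be attached would move that failure earlier.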

return PrerunArgType(&a_->arg_types[i]);
}

// Returns the serialized constant bytes if argument i is a literal,

Member

PrerunArgs::const_at() is exposed/documented as returning literal argument bytes, but the server currently initializes vef_prerun_args_t::const_values and const_lengths to nullptr in vdf_handler.cc, so this API always returns std::nullopt. Either populate those arrays in prerun setup, or hide/remove const_at() until it is implemented.

Member Author

Removed.


namespace vsql {

// =============================================================================

Member

Repo guidance says not to add section-separator comments like this. Can we remove these separator blocks from the new header?

Member Author

Removed; thanks.

tomas-villagesql commented May 21, 2026

Member

Dan, what do you think about something like this:

  make_func<&ba_call_index>("ba_call_index")
      .returns(INT)
      .state<CallCounter>()
      .prerun<&ba_call_index_prerun>()
      .postrun<&ba_call_index_postrun>()
      .build()

where SDK manages the new/delete?

user:

  void ba_prerun(PrerunArgs args, PrerunResult out, CallCounter &state) {
    state.n = 10;
  }

  void ba_postrun(PostrunArgs args, CallCounter &state) {
    // flush metrics, close non-owned handles, etc.
  }

SDK:

The SDK-generated raw prerun then does the following, in this order:

  auto state = std::make_unique<CallCounter>();

  PrerunResult out(raw_result);
  out.set_state_pointer(state.get()); // internal/private SDK write

  user_prerun(PrerunArgs(raw_args), out, *state); // optional shape

  raw_result->user_data = state.release();

And SDK-generated raw postrun does:

  std::unique_ptr<CallCounter> state(
      static_cast<CallCounter *>(raw_args->user_data));

  user_postrun(PostrunArgs(raw_args), *state); // optional shape

  // state auto-deletes here
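A compilable version of the ownership handoff proposed above might look like the following. It is only a sketch under simplifying assumptions: the user hooks take just the state (no PrerunArgs / PrerunResult / PostrunArgs), and `managed_prerun`, `managed_postrun`, the raw struct shapes, and `last_flushed` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <memory>

namespace managed_sketch {

// Stand-ins for the raw ABI structs (assumed shapes).
struct RawPrerunResult { void *user_data = nullptr; };
struct RawPostrunArgs { void *user_data = nullptr; };

struct CallCounter { int n = 0; };

// SDK-generated prerun: owns the state until the user hook has run, then
// hands ownership to the raw user_data slot.
template <typename State, auto UserPrerun>
void managed_prerun(RawPrerunResult *raw_result) {
  assert(raw_result != nullptr);
  auto state = std::make_unique<State>();
  UserPrerun(*state);  // if this throws, unique_ptr frees the state
  raw_result->user_data = state.release();
}

// SDK-generated postrun: re-adopts the pointer so it is deleted on scope exit.
template <typename State, auto UserPostrun>
void managed_postrun(RawPostrunArgs *raw_args) {
  assert(raw_args != nullptr);
  std::unique_ptr<State> state(static_cast<State *>(raw_args->user_data));
  raw_args->user_data = nullptr;
  if (state) UserPostrun(*state);
}  // state is deleted here

// Observable side effect so the example can be checked.
inline int last_flushed = -1;

void ba_prerun(CallCounter &state) { state.n = 10; }
void ba_postrun(CallCounter &state) { last_flushed = state.n; }

}  // namespace managed_sketch
```

With this shape the extension author never writes new or delete; the user postrun is left for things like flushing metrics.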

// or anything that doesn't fit emplace_state<T>. The wrapper forwards
// args->user_data straight through.
//
// Param tuple shape: <void*, TypedArg..., ResultWrapper>. Same indexing as

Member

How come State (the type above) can't just be void* in this case? Does something break in the casting?

Member Author

WrapperVoidState is importantly different because it does not deref args->user_data.

As I said in reply to Tomas's comment:

For void*, it's plausible to imagine something like "I might not need user_data, so I won't write it in pre_run, but will in the per-row function".

@dbentley-vsql dbentley-vsql force-pushed the dbentley/prerun_api_wrapping branch 2 times, most recently from a624b5c to 4c7d041 Compare May 21, 2026 18:50
@dbentley-vsql

Member Author

Re @tomas-villagesql's .state<CallCounter>() proposal above, where the SDK manages the new/delete:

Yes, I want to do this; I'm just trying to keep the change smaller.

In fact, I think we can do even better and just pass along a function to delete instead of a fully-general postrun function altogether.

}
};

// Thunks that convert a typed user prerun/postrun (void(PrerunArgs,

Member

I know "Thunk" is a term of art, but we have never used that term before; why introduce it now?

Member Author

Changed to Wrapper

Member

Please don't add anything to this file. This is the v1 API that we are trying to get rid of.

Member Author

Reverted.

@dbentley-vsql dbentley-vsql force-pushed the dbentley/prerun_api_wrapping branch from 4c7d041 to 02be86e Compare May 22, 2026 02:41
dbentley-vsql added a commit that referenced this pull request May 22, 2026
NB: this is analogous to #254, but rebased and simplified

sdk: typed varargs + required explicit arity

Introduce typed C++ wrappers for varargs VDFs (vsql::VarArgs, vsql::AnyArg)
and the matching .varargs() / zero-arity .param() builder methods, so an
extension author can write a variadic SQL function without touching raw ABI
types. Also tightens the v2 builder to require an explicit arity
declaration before .build().

What changes for users

Before, an extension author who wanted a varargs SQL function had to drop
out of the typed SDK and write a raw-ABI body — the typed wrappers covered
fixed-arity only:

// Old: raw vef_vdf_func_t signature, manual invalue extraction
void ba_concat_all(vef_context_t *ctx, vef_vdf_args_t *args,
                   vef_vdf_result_t *result) {
  for (unsigned int i = 0; i < args->value_count; i++) {
    vef_invalue_t v = vsql::func_builder::get_invalue(ctx, args, i);
    if (v.is_null) { result->type = VEF_RESULT_NULL; return; }
    memcpy(result->str_buf + i * N, v.bin_value, N);
  }
  result->type = VEF_RESULT_VALUE;
  result->actual_len = args->value_count * N;
}

After:

// New: typed VarArgs + AnyArg, no raw ABI in the body
void ba_concat_all(vsql::VarArgs args, vsql::StringResult out) {
  auto dst = out.buffer();
  size_t off = 0;
  for (auto a : args) {
    if (a.is_null()) { out.set_null(); return; }
    auto bytes = a.as_custom();
    memcpy(dst.data() + off, bytes.data(), bytes.size());
    off += bytes.size();
  }
  out.set_length(off);
}

Registration picks up .varargs() (and zero-arity .param()):

.func(make_func<&ba_len>("ba_len")
          .returns(INT)
          .param()                     // explicit zero-arity
          .build())
.func(make_func<&ba_concat_all>("ba_concat_all")
          .returns(STRING)
          .varargs()
          .prerun<&ba_concat_all_prerun>()
          .build())

The prerun hook is still raw ABI in this PR; the typed variant lands with
#551 (typed vsql::PrerunArgs / PrerunResult), and ba_concat_all_prerun
will convert to typed form when that merges.
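To show what "typed VarArgs + AnyArg" can mean mechanically, here is a small sketch of a range-style view over a raw argument array. The raw struct shapes, `as_bytes()` (a stand-in for the `as_custom()` call used above), and `concat_all` are assumptions for illustration; the real wrappers may expose a different surface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

namespace varargs_sketch {

// Stand-ins for the raw per-argument value and the raw argument block.
struct RawValue { bool is_null; const char *bytes; std::size_t len; };
struct RawVdfArgs { const RawValue *values; std::size_t value_count; };

// View of one argument.
class AnyArg {
 public:
  explicit AnyArg(const RawValue *v) : v_(v) {}
  bool is_null() const { return v_->is_null; }
  std::string_view as_bytes() const { return {v_->bytes, v_->len}; }
 private:
  const RawValue *v_;
};

// View of all arguments, iterable with a range-based for loop.
class VarArgs {
 public:
  class iterator {
   public:
    explicit iterator(const RawValue *p) : p_(p) {}
    AnyArg operator*() const { return AnyArg(p_); }
    iterator &operator++() { ++p_; return *this; }
    bool operator!=(const iterator &o) const { return p_ != o.p_; }
   private:
    const RawValue *p_;
  };
  explicit VarArgs(const RawVdfArgs *a) : a_(a) { assert(a != nullptr); }
  iterator begin() const { return iterator(a_->values); }
  iterator end() const { return iterator(a_->values + a_->value_count); }
 private:
  const RawVdfArgs *a_;
};

// Concatenates all arguments into dst. Returns the number of bytes written,
// or -1 if any argument is NULL or dst is too small.
inline long concat_all(VarArgs args, char *dst, std::size_t cap) {
  std::size_t off = 0;
  for (AnyArg a : args) {
    if (a.is_null()) return -1;
    std::string_view bytes = a.as_bytes();
    if (bytes.size() > cap - off) return -1;
    if (!bytes.empty()) std::memcpy(dst + off, bytes.data(), bytes.size());
    off += bytes.size();
  }
  return static_cast<long>(off);
}

}  // namespace varargs_sketch
```

Unlike the raw version, the loop tracks a running offset from each argument's actual length rather than assuming a fixed size per argument.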

Required-arity change
The v2 builder now rejects make_func<&fn>("name").returns(...).build()
without an arity call. You must pick exactly one of:

- .param(TYPE) (one or more, for fixed-arity typed args)
- .param() (zero-arity)
- .varargs()

Eight in-tree test-extensions had implicit zero-arity functions and pick
up an explicit .param() in this PR. The v1 builder
(villagesql::func_builder, raw-ABI) is unchanged — that cow has left
the barn.
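One way a builder can enforce "exactly one arity declaration before .build()" is to track the arity state in a template parameter, so that a missing or conflicting call fails to compile. The commit message does not say how the v2 builder implements the check, so the sketch below is an assumed design with hypothetical names (`Arity`, `FuncDesc`, an int type tag in place of real type constants).

```cpp
#include <cassert>
#include <cstddef>

namespace arity_sketch {

enum class Arity { Unset, Zero, Fixed, VarArgs };

struct FuncDesc {
  const char *name;
  std::size_t param_count;
  bool varargs;
};

template <Arity A = Arity::Unset>
class FuncBuilder {
 public:
  explicit FuncBuilder(const char *name, std::size_t params = 0)
      : name_(name), params_(params) {
    assert(name != nullptr);
  }

  // .param() with no argument declares zero arity explicitly.
  FuncBuilder<Arity::Zero> param() const {
    static_assert(A == Arity::Unset, ".param() cannot follow another arity call");
    return FuncBuilder<Arity::Zero>(name_);
  }

  // .param(TYPE) may repeat; each call adds one fixed parameter.
  FuncBuilder<Arity::Fixed> param(int /*type_tag*/) const {
    static_assert(A == Arity::Unset || A == Arity::Fixed,
                  ".param(TYPE) cannot be mixed with .param() or .varargs()");
    return FuncBuilder<Arity::Fixed>(name_, params_ + 1);
  }

  FuncBuilder<Arity::VarArgs> varargs() const {
    static_assert(A == Arity::Unset, ".varargs() cannot follow another arity call");
    return FuncBuilder<Arity::VarArgs>(name_);
  }

  // Only instantiated when called, so the check fires exactly for builders
  // that reach build() without an arity declaration.
  FuncDesc build() const {
    static_assert(A != Arity::Unset,
                  "declare arity with .param(...) or .varargs() before .build()");
    return FuncDesc{name_, params_, A == Arity::VarArgs};
  }

 private:
  const char *name_;
  std::size_t params_;
};

}  // namespace arity_sketch
```

In this sketch, `FuncBuilder<>("f").build()` is a compile error, while `FuncBuilder<>("f").param().build()` yields a zero-arity descriptor.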
Introduce typed C++ wrappers for the per-statement lifecycle hooks
(prerun and postrun). Extension authors no longer touch raw ABI
structs to write these hooks; only the typed signatures are accepted.

Why
---
Extension authors writing prerun/postrun previously had to:

  void my_prerun(vef_context_t *, vef_prerun_args_t *args,
                 vef_prerun_result_t *result) {
    if (args->arg_count == 0) {
      result->type = VEF_RESULT_ERROR;
      snprintf(result->error_msg, VEF_MAX_ERROR_LEN, "...");
      return;
    }
    result->result_buffer_size = N;
    result->user_data = new MyState{};
  }

Two problems: raw ABI in the user-facing API (against the SDK's
direction; see #548) and bespoke error-reporting plumbing.

After this PR:

  void my_prerun(vsql::PrerunArgs args, vsql::PrerunResult out) {
    if (args.size() == 0) {
      out.error("at least one argument required");
      return;
    }
    out.request_buffer_size(N);
    out.set_user_data(new MyState{});
  }

  void my_postrun(vsql::PostrunArgs args) {
    args.delete_state<MyState>();
  }

  void my_vdf(MyState &state, IntResult out) { ... }

State lifetime stays explicit in this PR: prerun stashes the pointer,
postrun frees it. A follow-up will add auto-cleanup via .state<T>()
(implementable without ABI changes; see TODO in pre_post_run.h).

Where to start reviewing
------------------------
Read the layers in this order:

1. villagesql/sdk/include/villagesql/vsql/pre_post_run.h
   New public types: PrerunArgs, PrerunArgType, PrerunResult,
   PostrunArgs. This is what extension authors see.

2. villagesql/examples/vsql-simple/src/extension.cc
   The ba_call_index demo: set_user_data in prerun, mutable State&
   in the VDF, delete_state in postrun.

3. mysql-test/suite/villagesql/extension/t/extension_simple_type_usage.test
   What it looks like at the SQL surface.

4. villagesql/sdk/include/villagesql/extension.h
   Refreshed user-facing docs for the typed hook shape.

The remaining file is SDK plumbing:

5. villagesql/sdk/include/villagesql/vsql/func_builder.h
   - .prerun<>() / .postrun<>() are typed-only; static_assert
     rejects raw vef_prerun_func_t / vef_postrun_func_t.
   - FuncBuilder gained a HasPrerun template flag; .prerun<>() flips
     it true, and build() static_asserts that void(State&,...) and
     void(void*,...) signatures require it. Registering a state-style
     VDF without a prerun is a compile error, not a runtime null deref.
   - WrapperTypedState / WrapperVoidState dispatch on the first
     param of the VDF: State& / const State& / void* selects how
     user_data is forwarded.
   - typed_prerun_thunk / typed_postrun_thunk adapt the user's
     typed signature to the raw ABI shape the server invokes.
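The HasPrerun bullet above can be illustrated with a stripped-down builder: a bool template flag that .prerun<>() flips, and a build() that static_asserts on it when the VDF's first parameter looks like state. The detection rule used here (void* or any lvalue reference counts as state) and all names in `hasprerun_sketch` are simplifying assumptions, not the SDK's actual traits.

```cpp
#include <type_traits>

namespace hasprerun_sketch {

// First parameter type of a free-function pointer; void for zero-arg functions.
template <typename F>
struct FirstParam { using type = void; };
template <typename R, typename First, typename... Rest>
struct FirstParam<R (*)(First, Rest...)> { using type = First; };

// Simplified rule: a void* or lvalue-reference first parameter means "state".
template <typename T>
constexpr bool is_state_param_v =
    std::is_same_v<T, void *> || std::is_lvalue_reference_v<T>;

template <auto Fn, bool HasPrerun = false>
struct FuncBuilder {
  // Registering a prerun hook returns a builder type with the flag set.
  template <auto Hook>
  FuncBuilder<Fn, true> prerun() const { return {}; }

  // Returns whether a prerun was attached; refuses to compile for a
  // state-style VDF that has none.
  bool build() const {
    using First = typename FirstParam<decltype(Fn)>::type;
    static_assert(!is_state_param_v<First> || HasPrerun,
                  "a VDF taking State& or void* must register .prerun<>()");
    return HasPrerun;
  }
};

struct IntResult { long long value = 0; };
struct MyState { int n = 0; };

void counted(MyState &state, IntResult out) { ++state.n; (void)out; }
void plain(IntResult out) { (void)out; }
void init_hook() {}

}  // namespace hasprerun_sketch
```

Here `FuncBuilder<&counted>{}.build()` would not compile, whereas adding `.prerun<&init_hook>()` before `.build()` does; stateless VDFs such as `plain` build either way.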

Not in this PR
--------------
- A `.state<T>()` builder that records State at the template level
  and routes typed signatures of the form `void(T&, ...)`. The SDK
  would install both prerun and postrun thunks to manage the typed
  state's lifetime via the existing user_data slot — no ABI side
  channel needed. Will land as a follow-up.
- Typed VarArgs/AnyArg wrappers for varargs VDFs (the other raw-ABI
  place in extension code, only reached via PR #254).
- vef_postrun_result_t is empty in the ABI today, so there is no
  PostrunResult wrapper. Will add when the struct grows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dbentley-vsql dbentley-vsql force-pushed the dbentley/prerun_api_wrapping branch from 267bb9e to 5dbadce Compare May 22, 2026 14:18
@dbentley-vsql dbentley-vsql force-pushed the dbentley/prerun_api_wrapping branch from 5dbadce to 13a8452 Compare May 22, 2026 14:25
@dbentley-vsql dbentley-vsql merged commit f64fdab into main May 22, 2026
3 checks passed
@dbentley-vsql dbentley-vsql deleted the dbentley/prerun_api_wrapping branch May 22, 2026 14:44
4 participants