This repository has been archived by the owner on Feb 6, 2025. It is now read-only.

Walkthrough of exception raising in OCaml 5 when the stack is an interleaving of C frames and OCaml frames

Olivier Nicole edited this page Aug 31, 2022 · 1 revision

To think about how TSan exception support could be extended to show C frames in backtraces, @fabbing and I had to understand how raising an exception works when OCaml and C frames are interspersed. Such interspersed frames arise when OCaml makes a C call, which itself uses an OCaml callback.

[Figure: signal-2022-08-31-105027_002]

System (C) stack on the left, main OCaml fiber on the right. The illegible green blocks are the c_stack_link data.

The process is the following:

  1. When the OCaml program starts, the process switches from the system stack to an OCaml fiber (i.e. a heap-allocated stack). The C stack pointer is saved in Caml_state->c_stack. Before the switch, a structure of type c_stack_link is pushed on the C stack. Therefore, Caml_state->c_stack happens to point to the most recent c_stack_link. This structure has a critical but somewhat intricate role. When executing C code, it is used to store the address of the latest OCaml stack, i.e. the fiber that called into C. This is useful for DWARF unwinding as well as for exception propagation across C frames. Indeed, when an exception goes through C frames, the runtime has to know how many C frames to pop and where the next OCaml frames to pop are. For the bookkeeping to work, each c_stack_link contains a pointer to its predecessor (or NULL when calling into OCaml for the first time). Note that, when executing OCaml code, the latest c_stack_link (which is at Caml_state->c_stack as mentioned above) is sort of an empty shell: it is filled with NULL except for its prev pointer. Only when OCaml makes a C call is that structure filled with OCaml stack information.
  2. After switching to the OCaml stack, two words are pushed on it: a pointer to Caml_state->gc_regs, which is used by callbacks during a GC and is not relevant for exceptions; and a pointer to the C stack. Why push the C stack pointer when it is already saved in Caml_state, you may ask? I assume this is necessary for DWARF unwinding, which cannot access data in globals but only on the stack being unwound.
  3. Before executing the OCaml code, a special exception handler is pushed on the OCaml stack. This handler performs some bit masking to inform the C runtime that the OCaml code raised an exception instead of returning normally. (Long story short: from the C point of view, the callback raised an exception if and only if the two least significant bits of its return value are 10. This works because the two least significant bits of a valid OCaml value are never 10.) It then jumps to the same code as the one executed after a normal callback return (see below).
  4. When the OCaml program makes an external call, the OCaml stack's location and stack pointer are saved into Caml_state->c_stack (i.e. the c_stack_link at the top of the C stack is filled) as well as into Caml_state->current_stack->sp (understand “current stack” as “most recently used OCaml stack”), and the stack pointer is switched to the C stack.
  5. Continuing our scenario, suppose the C code makes a callback into OCaml (as can happen with GC finalizers or explicitly using Callback.register). Then steps 1 to 3 above are repeated identically. Indeed, there is nearly no difference between caml_start_program and caml_callback (most of their code is a shared assembly chunk), except for the fact that a callback may take arguments. Thus a new, NULL-filled c_stack_link is pushed on the C stack and the C stack pointer is saved into Caml_state; gc_regs and the C stack pointer are pushed on the OCaml stack, as well as the exception handler turning exceptions into specially marked return values; and the stack and control switch to OCaml.
  6. Suppose now that an exception is raised from OCaml. First, if backtraces are enabled, the runtime executes caml_stash_backtrace which records an execution backtrace in the backtrace buffer. caml_stash_backtrace unwinds only the OCaml stack, following fibers' parent pointers if any. As a consequence, in the backtrace it will be as if the external call never happened. (OCaml backtraces are not as complete as DWARF backtraces.)
  7. Then, the last-pushed exception handler is executed, and sets bit 1 of the exception pointer (see step 3).
  8. The saved C sp and the saved gc_regs (see step 2) are popped from the OCaml stack. The OCaml stack pointer is saved to Caml_state->current_stack->sp and rsp is pointed to the C stack. The c_stack_link at the top of the C stack is popped and the previous one is restored in Caml_state->c_stack. Now from the point of view of Caml_state, all the C frames pushed by the last external call just disappeared.
  9. The return value of the callback is recognized as an exception value, and caml_raise is called (the function used to raise from C). If Caml_state->c_stack is NULL (implying that no OCaml code called the current C code), then the program terminates with an “Uncaught exception” error. Otherwise, the Caml_state->local_roots pointer is wound back if needed (I didn't dig further into what that means), the C stack pointer is discarded, and we jump to the caml_raise_exn assembly routine, which is the usual code for raising an exception from OCaml. In other words, we just skipped over the C frames, ignoring them completely except for finding our way back to OCaml.
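
The bookkeeping described in the steps above can be sketched as a toy model in C. This is not the runtime's code. Every name here (toy_stack_link, toy_c_stack, the toy_* functions and the three macros) is hypothetical, the struct is a simplified stand-in for c_stack_link, and the global stands in for Caml_state->c_stack. The sketch only mirrors the behaviour described on this page: links are chained through prev, a link stays an empty shell until OCaml calls into C, and an exception result is a value whose two low bits are 10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an OCaml value: one machine word. */
typedef intptr_t value;

/* Step 3: an exception result has its two low bits set to 10.
   Hypothetical macro names, defined locally for this sketch. */
#define MAKE_EXN_RESULT(v) ((v) | 2)
#define IS_EXN_RESULT(v)   (((v) & 3) == 2)
#define EXTRACT_EXN(v)     ((v) & ~(value)3)

/* Simplified link record pushed on the C stack before entering OCaml. */
struct toy_stack_link {
  void *ocaml_stack;            /* fiber that called into C; NULL while OCaml runs */
  void *ocaml_sp;               /* saved OCaml stack pointer; NULL while OCaml runs */
  struct toy_stack_link *prev;  /* link of the enclosing OCaml entry, or NULL */
};

/* Toy stand-in for Caml_state->c_stack. */
static struct toy_stack_link *toy_c_stack = NULL;

/* Steps 1 and 5: entering OCaml pushes an "empty shell" link. */
static void toy_enter_ocaml(struct toy_stack_link *link) {
  link->ocaml_stack = NULL;
  link->ocaml_sp = NULL;
  link->prev = toy_c_stack;
  toy_c_stack = link;
}

/* Step 4: an external call fills the most recent link. */
static void toy_call_c(void *stack, void *sp) {
  assert(toy_c_stack != NULL);
  toy_c_stack->ocaml_stack = stack;
  toy_c_stack->ocaml_sp = sp;
}

/* Step 8: leaving OCaml pops the most recent link and restores the previous one. */
static void toy_leave_ocaml(void) {
  assert(toy_c_stack != NULL);
  toy_c_stack = toy_c_stack->prev;
}

/* Step 9: after the pop, a NULL link means no OCaml code called the
   current C code, so the exception would be reported as uncaught. */
static int toy_has_ocaml_caller(void) {
  return toy_c_stack != NULL;
}
```

In the scenario of this page, toy_enter_ocaml would run once at program start and once for the callback, with toy_call_c in between. When the callback raises, toy_leave_ocaml drops the inner link, and the outer link still records which OCaml stack to resume raising on.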