Should ordering of implicit page table reads after the implicit read of branch entries be preserved in the global order? #2307

@Kritzefitz

Section 12.2.1. Supervisor Memory-Management Fence Instruction (commit 0e1a870) states:

Implementations must only perform implicit reads of the translation data structures pointed to by the current contents of the satp register or a subsequent valid (V=1) translation data structure entry [...].

This is important for the following example pseudo-code, which initializes an existing PTE as a branch pointing to a freshly allocated page table:

// pte initially points to a PTE with V=0
// page_table points to a 4096 byte RW memory region with unspecified content
page_table = allocate_page();
// Initializes the entire page table with zeroes or some other known desired state
init_page_table(page_table);
// Write the page as a branch into an existing pte, setting among other things V=1
write_branch_pte(pte, page_table);
// Make the new table visible to subsequent implicit accesses
sfence.vma x0,x0

Since the content of page_table is unpredictable before init_page_table is called, it is important that implicit reads do not access page_table before the initialization. Otherwise the unpredictable state of the table might expose privileged memory regions to unprivileged code with access to that address space. If we assume that the hart executing this code is the only hart that references this pte through satp (which is trivially true on single-hart systems), the requirement is met: V=1 is only set in the pte after the branch table has been initialized in program order.
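To make the single-hart sequence concrete, here is a minimal C sketch. It assumes the Sv39 PTE layout (V in bit 0, PPN field starting at bit 10) and that the caller knows the physical address of the new table. The names make_branch_pte, local_sfence_vma and publish_branch_single_hart are made up for this illustration, and the compiler barrier uses GCC/Clang inline-asm syntax. On non-RISC-V hosts the sfence.vma is replaced by a plain compiler barrier so the sketch still builds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE     4096u
#define PTE_V         (1ull << 0)  /* valid bit */
#define PTE_PPN_SHIFT 10           /* assumption: Sv39, PPN field starts at bit 10 */

/* Encode a non-leaf (branch) PTE: V=1, R=W=X=0, PPN = table_pa >> 12. */
static uint64_t make_branch_pte(uint64_t table_pa)
{
    assert((table_pa & (PAGE_SIZE - 1)) == 0);  /* tables are page aligned */
    return ((table_pa >> 12) << PTE_PPN_SHIFT) | PTE_V;
}

/* Keep the compiler from moving memory accesses across this point. */
static void compiler_barrier(void)
{
    __asm__ volatile ("" ::: "memory");
}

/* Local address-translation fence; a compiler barrier on non-RISC-V hosts. */
static void local_sfence_vma(void)
{
#if defined(__riscv)
    __asm__ volatile ("sfence.vma x0, x0" ::: "memory");
#else
    compiler_barrier();
#endif
}

/* Single-hart publish: initialize first, then set V=1, then fence. */
static void publish_branch_single_hart(volatile uint64_t *pte,
                                       void *page_table,
                                       uint64_t table_pa)
{
    memset(page_table, 0, PAGE_SIZE);   /* init_page_table */
    compiler_barrier();                 /* keep the compiler's order too */
    *pte = make_branch_pte(table_pa);   /* write_branch_pte, sets V=1 */
    local_sfence_vma();                 /* sfence.vma x0,x0 */
}
```

The explicit compiler barrier is my addition: the argument above is about hardware program order, but a C compiler would otherwise be free to sink the memset below the store to the pte.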

Things become more complicated if the pte might be concurrently referenced through another hart's satp register. A simple sfence.vma won't be enough, so we replace it with a handwavy sfence_vma_cross_hart that properly synchronizes with all other harts (or at least those that currently might refer to the pte through satp) and causes them to execute an appropriate sfence.vma. There is also the problem that the order of init_page_table and write_branch_pte is not necessarily preserved in the global memory order, but this can be fixed by separating them with a fence w,w. This leads to the following updated example:

page_table = allocate_page();
init_page_table(page_table);
// Ensure the initialization is ordered in the global order before setting V=1 in the pte 
fence w,w
write_branch_pte(pte, page_table);
// Execute an sfence.vma on all other harts
sfence_vma_cross_hart()

Unfortunately, given the current information in the specification, I'm not certain that this actually prevents other harts from implicitly accessing page_table before it is initialized. It seems clear to me that the passage I quoted initially is meant to imply that implicit accesses to page_table are ordered after the implicit access to pte that observes the address of page_table and V=1. It is not clear to me, however, whether that ordering only holds in the local program order or whether it is also preserved in the global memory order.

Let's call the implicit read of the pte (which observes V=1) read_pte and the subsequent implicit read of the page table read_page_table. If we assume that the ordering between these two is preserved in the global memory order, then the only valid global order of the operations involved would be:

  1. init_page_table
  2. write_branch_pte
  3. read_pte
  4. read_page_table

This would be fine and our example above would be safe. However, if we assume that the ordering is not preserved in the global order, an execution with this global order would be legal:

  1. read_page_table // Oh no! We observed the page table before initialization!
  2. init_page_table
  3. write_branch_pte
  4. read_pte
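The two lists above can be cross-checked with a small brute-force enumeration. The C sketch below is a toy model, not a rendering of RVWMO: each operation is treated as a single event, init_page_table is ordered before write_branch_pte (the fence w,w), write_branch_pte is ordered before read_pte (because read_pte observes V=1), and the ordering of read_pte before read_page_table is switched on or off by a flag. All names are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum event {
    INIT_PAGE_TABLE,
    WRITE_BRANCH_PTE,
    READ_PTE,
    READ_PAGE_TABLE,
    NUM_EVENTS
};

struct tally {
    int legal;   /* global orders that satisfy all constraints */
    int unsafe;  /* legal orders where the table is read before init */
};

/* pos[e] is the position of event e in a candidate global order. */
static bool is_legal(const int pos[NUM_EVENTS], bool walk_order_is_global)
{
    if (pos[INIT_PAGE_TABLE] >= pos[WRITE_BRANCH_PTE])
        return false;  /* fence w,w */
    if (pos[WRITE_BRANCH_PTE] >= pos[READ_PTE])
        return false;  /* read_pte observed V=1 */
    if (walk_order_is_global && pos[READ_PTE] >= pos[READ_PAGE_TABLE])
        return false;  /* the ordering this issue asks about */
    return true;
}

/* Enumerate all 24 permutations of the four events and classify them. */
static struct tally count_orders(bool walk_order_is_global)
{
    struct tally t = {0, 0};
    int pos[NUM_EVENTS];

    for (pos[0] = 0; pos[0] < NUM_EVENTS; pos[0]++)
    for (pos[1] = 0; pos[1] < NUM_EVENTS; pos[1]++)
    for (pos[2] = 0; pos[2] < NUM_EVENTS; pos[2]++)
    for (pos[3] = 0; pos[3] < NUM_EVENTS; pos[3]++) {
        /* Accept only assignments where every position is used once. */
        unsigned used = (1u << pos[0]) | (1u << pos[1]) |
                        (1u << pos[2]) | (1u << pos[3]);
        if (used != 0xFu || !is_legal(pos, walk_order_is_global))
            continue;
        t.legal++;
        if (pos[READ_PAGE_TABLE] < pos[INIT_PAGE_TABLE])
            t.unsafe++;
    }
    assert(t.unsafe <= t.legal);
    return t;
}
```

In this model, count_orders(true) finds exactly one legal order and no unsafe one, while count_orders(false) finds four legal orders, one of which places read_page_table before init_page_table.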

An execution like this would obviously be bad, so we would need to introduce additional ordering constraints to rule it out. If read_pte and read_page_table were regular explicit memory operations, we could perhaps rely on a syntactic dependency between them or, failing that, separate them with a fence r,r. But since these are implicit accesses whose behavior is controlled by the implementation, we have no way to control them directly. The only reasonable solution I could come up with is the following variant of the above example:

page_table = allocate_page();
init_page_table(page_table);
// Ensure that implicit accesses on other harts (which may later observe pte.V=1) are no longer allowed to observe the uninitialized page table
sfence_vma_cross_hart()
write_branch_pte(pte, page_table);
sfence_vma_cross_hart()

While this solution should be safe for both interpretations of the spec, it also seems vastly more expensive than the previous example, since sfence_vma_cross_hart will have to interrupt a potentially large number of harts and wait for them to complete their fences.
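The cost difference between the two multi-hart variants can be sketched in C as follows. The cross-hart fence is passed in as a callback because sfence_vma_cross_hart is only a placeholder in this issue; how it would be implemented (for example with IPIs or a firmware call) is left open. The names shootdown_fn, counting_shootdown, publish_branch_optimistic and publish_branch_conservative are hypothetical, and the non-RISC-V fallback uses a C11 release fence, which is stronger than fence w,w.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Hypothetical stand-in for sfence_vma_cross_hart. */
typedef void (*shootdown_fn)(void *ctx);

/* Stub for illustration: only counts how often a cross-hart fence is requested. */
static void counting_shootdown(void *ctx)
{
    ++*(unsigned *)ctx;
}

/* Order earlier stores before later stores in the global memory order. */
static void fence_w_w(void)
{
#if defined(__riscv)
    __asm__ volatile ("fence w,w" ::: "memory");
#else
    atomic_thread_fence(memory_order_release);
#endif
}

/* Safe only if the page-walk order is preserved globally: one shootdown. */
static void publish_branch_optimistic(volatile uint64_t *pte, uint64_t branch_pte,
                                      void *page_table,
                                      shootdown_fn shootdown, void *ctx)
{
    assert(shootdown != NULL);
    memset(page_table, 0, PAGE_SIZE);  /* init_page_table */
    fence_w_w();
    *pte = branch_pte;                 /* write_branch_pte */
    shootdown(ctx);
}

/* Safe under both readings of the spec: two shootdowns. */
static void publish_branch_conservative(volatile uint64_t *pte, uint64_t branch_pte,
                                        void *page_table,
                                        shootdown_fn shootdown, void *ctx)
{
    assert(shootdown != NULL);
    memset(page_table, 0, PAGE_SIZE);  /* init_page_table */
    shootdown(ctx);                    /* assumed to synchronize with all other harts */
    *pte = branch_pte;                 /* write_branch_pte */
    shootdown(ctx);
}
```

With counting_shootdown as the callback, the optimistic variant requests one cross-hart fence per published table and the conservative variant requests two, each of which would stall the caller until every targeted hart has responded.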

Could the spec clarify whether the ordering between these kinds of implicit accesses is guaranteed to be preserved in the global memory order? No matter what the answer is, it seems critical to me that the behavior is clearly documented and pointed out, because:

  • if preserving the order is not required, software that assumes that it is preserved will have severe vulnerabilities
  • if preserving the order is required, an implementation that assumes it is not required will introduce severe vulnerabilities into compliant software
