Skip to content

Imported document about undefined behavior and safe api in LLD #127

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
7f54804
Imported document about undefined behavior and safe api in LLD
pellico Dec 9, 2024
a2e9df0
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 11, 2024
5069e27
Fixed all issue found by PLeVasseur
pellico Dec 13, 2024
8429bda
Merge branch 'rust_safety_low_level_driver' of ssh://ssh.github.com:4…
pellico Dec 13, 2024
7382789
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 18, 2024
451b280
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 18, 2024
3c661b3
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 18, 2024
9e5d64c
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 19, 2024
bd4d172
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 19, 2024
327ce08
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 19, 2024
59df9ff
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Dec 19, 2024
9a61384
Merge branch 'rustfoundation:main' into rust_safety_low_level_driver
pellico Dec 19, 2024
12138c0
Improved documentation in response to comment from @vapdrs
pellico Dec 19, 2024
d25e4e8
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Jan 31, 2025
c0cf6e9
Merge remote-tracking branch 'origin/main' into rust_safety_low_level…
pellico Jan 31, 2025
af68cad
Improved description of contradiction raised by justification of safe…
pellico Jan 31, 2025
edf5d25
Merge remote-tracking branch 'origin/main' into rust_safety_low_level…
pellico Jan 31, 2025
7412ede
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Feb 14, 2025
7545958
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Feb 14, 2025
16383cb
Update subcommittee/coding-guidelines/initiatives/safe-use-of-unsafe-…
pellico Feb 14, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,290 @@
# Rust Embedded Low Level Drivers (LLD) and Safe Definition

- [Rust Embedded Low Level Drivers (LLD) and Safe Definition](#markdown-header-rust-embedded-low-level-drivers-lld-and-safe-definition)
- [Introduction](#markdown-header-introduction)
- [What does mean safe for Rust language.](#markdown-header-what-does-mean-safe-for-rust-language)
- [What is “undefined behavior” in Rust?](#markdown-header-what-is-undefined-behavior-in-rust)
- [Hardware undefined behavior](#markdown-header-hardware-undefined-behavior)
- [Rust safe/unsafe are not well defined when hardware is involved.](#markdown-header-rust-safeunsafe-are-not-well-defined-when-hardware-is-involved)
- [Svd2rust](#markdown-header-svd2rust)
- [Is safe and unsafe not applicable to external world?](#markdown-header-is-safe-and-unsafe-not-applicable-to-external-world)
- [Proposal](#markdown-header-proposal)
- [Example](#markdown-header-example)
- [Conclusions](#markdown-header-conclusions)

## Introduction

Any real system is not just composed by a CPUs, memories and software.
In a real system there are other hardware components that shall interact
with software. They are usually called peripherals.

Peripheral examples:

- UART
- GPU
- DMA
- Ethernet
- Timers
- Watchdog

Peripherals are typically configured and operated by reading memory
mapped registers or in some cases using special CPU instructions.

Peripheral configuration and operation could be complex and cumbersome
and for this reason an embedded software developer doesn't interact
directly with registers but through a low-level drivers. Low-level
drivers typically don`t support portability across different type of
microcontrollers but they implement a minimal logic to configure and
operate properly the device accordingly with microcontroller or
microprocessor specification.

Rust developers expect to [sound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) libraries.

When I tried to understand what need to be done to offer safe low level
drivers, I found that there is no clear definition of what does mean
`safe` or `unsafe` when hardware is involved.

In this document, I try to analyze the reason of this unclarity and
propose a solution.

Before to proceed to this analysis we need to understand what does mean
`safe` for Rust language:

## What does mean safe for Rust language.

Rust is considered a safe language and the terms `safe` and `unsafe` are
widely used in its documentation.

What does mean `safe` in the context of Rust? I was not able to find a
good definition of what is safe in Rust but there is quite clear
definition of what shall be considered `unsafe` in Rust.

In Rust Ferrocene specification is written the following [^1]:

*Unsafety is the presence of unsafe operations in program text.*

*An unsafe operation is an operation that may result in undefined
behavior that is not diagnosed as a static error. Unsafe operations are
referred to as unsafe Rust.*

This definition is aligned with what is written in Rustnomicon[^2]:

*No matter what, Safe Rust can't cause Undefined Behavior.*

Moreover, the keyword `unsafe` is used in multiple context[^3]:

- *unsafe fn*: calling this function means abiding by a contract the
compiler cannot enforce.

- *unsafe trait*: implementing the trait means abiding by a contract
the compiler cannot enforce.

- *unsafe {}*: the contract necessary to call the operations inside
the block has been checked by the programmer and is guaranteed to be
respected.

- *unsafe impl*: the contract necessary to implement the trait has
been checked by the programmer and is guaranteed to be respected.

This discussion only covers unsafe blocks and unsafe functions. Unsafe
functions can be called only in unsafe blocks and the user shall ensure
to use unsafe functions fulfilling all requirements described in the
documentation to avoid undefined behavior.

This simplifies code review and bug identification. The developers shall
inspect first unsafe blocks in case the application shows an undefined
behavior. However, this doesn`t exclude that a function developer forgot
to mark a function as “unsafe”.

### What is “undefined behavior” in Rust?

Ferrocene specification defines it as “unbounded error”[^4] or in other
words: everything could happen.

Rust Reference manual provide a non-exhaustive list of “undefined
behaviors”[^5] The following behavior are worth mentioning because it
involves hardware:

- *Executing code compiled with platform features that the current
platform does not support (see target\_feature), except if the
platform explicitly documents this to be safe.*

Before discussing safe/unsafe in Low Level Drivers let's see the meaning
of “undefined behavior” in the context of microcontroller/microprocessor
hardware.

## Hardware undefined behavior

In the data sheets or user manuals of many micro-controllers and
microprocessors the term *undefined behavior* is used mostly to describe
how to avoid the conditions that could generate an undefined behavior.

What does mean *undefined behavior* in the context of microcontrollers
or microprocessor?

- undefined behavior is not tested in simulation and in production.

- Any behavior could happen.

- undefined behavior can be different from device to device.

- undefined behavior can be different at different instance of time.

The last two characteristic it is very true for mixed signal peripherals
and for clock system of microcontroller due to sensitivity to physical
conditions of device and surrounding environments (e.g., variance of
process manufacturing, operating temperature…). Therefore, undefined
behavior is the worst-case scenario and there is no software test that
can detect it.

It shall be highlighted that an undefined behavior could be triggered
not only by wrong value of registers but also by wrong order of reading
and writing of registers.

## Rust safe/unsafe are not well defined when hardware is involved.

From previous discussion it could appear obvious that safe low level
drivers shall protect from any misuses that could trigger undefined
behavior even from hardware.

Unfortunately, this is not generally accepted and there are different
understandings. Let`s list them:

### Svd2rust

Svd2Rust[^6] generated peripheral access crates (PAC) from xml register
description. Peripheral access crates provides API to write and read
peripheral registers.

Safe functions provided by generated crates protect users from following
issue:

- Writing out of range value for bitfields.

- Concurrent access to same register from different interrupt/thread

- Read or write registers that cannot be read or written.

_These functions cannot protect from undefined
behavior triggered by improper order of calling functions_.

e.g. Clock initialization, peripheral initialization.

This approach seems conflicting with other approach:

- In Embedonomicon is described how to create safe API for DMA low
level driver[^7]. Why shall I take all these measures if all my
register access API are safe?

- All C library all functions are by considered by default unsafe. Why
a very low level access to external HW function is considered safer
than accessing a C library.

On svd2rust and safety there is an ongoing discussion in an issue of
GitHub project[^8] related to this topic.

### Is safe and unsafe not applicable to external world?

I have talked with some developers, and they don`t consider relevant HW
undefined behavior because are not part of Rust developed software. In
other words, they assume that unsafety shall be restricted to only rust
code memory issues, thread issue but not to HW component.

This position is similar to the one regarding the safety of *fs::open*
in Unix system[^9]. The *fs::open* is considered safe function even if
it can open special file */proc/self/mem* that allows unchecked full
access to process memory. Moreover, it is stated the following in the
documentation:

*Rust's safety guarantees only cover what the program itself can do, and
not what entities outside the program can do to it.*

Previous sentence is conflicting with the following:

- C API of external libraries are by default unsafe.

- HW undefined behavior is already included in Rust Reference:
“*Executing code compiled with platform features that the current
platform does not support*”.

- Measures required for safe DMA API in Embednomicon. DMA is external
to Rust program but measures are taken to have safe API.

## Proposal

I think it is quite clear that a clear unambiguous and widely accepted
definition of undefined behavior in Rust language especially when HW is
involved is missing.

While waiting for a vastly better definition in an upcoming Rust
specification, I would propose the following to solve this issue:

- HW undefined behavior shall be considered when developing low level
drivers.

- In the documentation of safe API shall be stated the scope of
safety: which part of the system is free of undefined behavior.

### Example

Let`s assume a system where a microcontroller is connected to a voltage
regulator that is the power supply of microcontroller and a software
running in microcontroller changes the voltage and clock based on
required system mode.

<p align="center">
<img src="./rust-embedded-lld-safe-definitions-images/safe_lld_fig1.png" alt="Figure 1 HW System" height="200">
</p>

Software is structured as in the below picture. All components provide
safe API.

<p align="center">
<img src="./rust-embedded-lld-safe-definitions-images/safe_lld_fig2.png" alt="Figure 2 SW architecture" height="200">
</p>

For each software component shall be documented which part of system is
considered regarding undefined behavior.

Microcontroller low level driver should provide safe API only
considering just the microcontroller but excluding the voltage regulator
because they should be generic and usable for any system.

<p align="center">
<img src="./rust-embedded-lld-safe-definitions-images/safe_lld_fig3.png" alt="Figure 3 Scope of microcontroller LLD" height="200">
</p>

Voltage regulator driver requires a SPI low level driver in order and
their combination shall protect from undefined behavior that could arise
from microcontroller + voltage regulator. Voltage regulator driver is
supposed to be generic and independent of the device connected to the
output of voltage regulator.
<p align="center">
<img src="./rust-embedded-lld-safe-definitions-images/safe_lld_fig4.png" alt="Figure 4 Scope of voltage regulator driver and microcontroller LLD" height="200">
</p>
Finally, system mode manager API shall guarantee the freedom from
undefined behavior of full system as described in Figure 1 HW system.

## Conclusions

In this document I tried to show why the measures required to have safe
LLD are not clear. The reason derives from the definition of “undefined
behavior”. This is a call for Rust community and Rust specification
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my point of view, we should be careful to delineate those things which the Rust Project and most of the Rust community would say lead to "undefined behavior" from a particular (and important!) segment which cares about safety-critical and writing low-level drivers.

I think that this involves some advocacy on our part as the Safety-Critical Rust Consortium Coding Guidelines Subcommittee.

As we have chatted about a few times, this seems to be a bit of a "scoping" or "system" issue where as you show here it depends on how you view the scope of your system and how it can do things which lead to "undefined behavior".

I'd recommend rewriting this section here to emphasize perhaps this "community" nature, since as you highlight the svd2rust crate here (which is maintained by the Tools team) is a first major collaborator in this mission you're embarking on.

Copy link
Author

@pellico pellico Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PLeVasseur This was an old document that I though could be useful as food for thought and I have no problem to change.

The main point that I tried to highlight is that there is an inconsistency in the community regarding the definition of undefined behaviors and when use or not use unsafe keyword.

This inconsistency in the scope of safety this inconsistency can lead to different interpretation and therefore safety risk.

I would propose to rewrite the document in following way:

  • Highlight the inconsistency/unclarity of unsafe usage and undefined behavior definition across Rust community
  • Remove any reference to svd2rust
  • Remove all reference to HW peripheral (I will mention only interface to C Low Level Driver) and consider a more generic external world (FFI)
  • Remove the section about solution proposal and call for a community solution, better definition in Rust upcoming specification or just a guidance in coding guidelines.
  • Highlight the risk of this ambiguity in the context of safety application.

Is this addressing your recommendation ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've got the start of a useful document here to drive discussion, for sure.

The aspect that I think you address pretty well and could be due more emphasis is what we often look at in a safety-critical context, which is of system scoping, i.e. where the safety issues could crop up.

While in a vacuum calling a certain API that's FFI or directly accessing memory or manipulating hardware may not cause a safety concern from the point of view, it's within the greater context of what a system is trying to accomplish that something may result in undefined behavior. Your diagrams and explanations I think go to that point.

The main point that I tried to highlight is that there is an inconsistency in the community regarding the definition of undefined behaviors and when use or not use unsafe keyword.

My main challenge with the document as-is is in this framing. I think that within the context of the Rust programming language the understanding of undefined behavior is ~reasonably well-defined.

However, when we take a function call that's doing something which in and of itself doesn't cause undefined behavior, but due to the ordering of the function called with respect to another function, that's where this system scoping framing can be useful.

e.g.

  • Calling a foo() function: defined
  • Calling foo(), then bar() functions: defined
  • Calling bar(), then foo() functions: undefined behavior

Therefore, when doing a low-level driver that should accomplish foo() + bar() we will consider the system scope as surrounding both of them to ensure they're called in the correct order.

I think it's admirable to say that low-level software that's still fairly close to manipulating hardware should not be unsafe! I'd like to make sure it's being motivated and communicated in this way.

writers to have an unambiguous and well-defined description of
“undefined behavior” that includes HW drivers.

I described a proposal inspired by ISO26262 out of context element that
is applicable to LLD and still preserve one of the main advantages of
Rust:

*No matter what, Safe Rust can't cause Undefined Behavior.*

[^1]: https://spec.ferrocene.dev/unsafety.html
[^2]: https://doc.rust-lang.org/nomicon/safe-unsafe-meaning.html
[^3]: https://doc.rust-lang.org/std/keyword.unsafe.html#the-different-meanings-of-unsafe
[^4]: https://spec.ferrocene.dev/glossary.html#term_undefined_behavior
[^5]: https://doc.rust-lang.org/reference/behavior-considered-undefined.html
[^6]: https://docs.rs/svd2rust/latest/svd2rust/
[^7]: https://docs.rust-embedded.org/embedonomicon/dma.html
[^8]: https://github.com/rust-embedded/svd2rust/issues/714
[^9]: https://doc.rust-lang.org/std/os/unix/io/index.html#procselfmem-and-similar-os-features
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.