|
| 1 | +Component architecture: Fixed execution order framework |
| 2 | +======================================================= |
| 3 | + |
| 4 | +See [Contribution Request Guideline](https://eclipse-score.github.io/score/process/guidance/contribution_request/index.html) and [Feature Request Template](https://eclipse-score.github.io/score/process/guidance/contribution_request/templates/feature_request_template.html). |
| 5 | + |
| 6 | +General |
| 7 | +------- |
| 8 | + |
| 9 | +* FEO shall be a framework for applications |
| 10 | +* For data-driven and time-driven applications (mainly in the ADAS domain) |
| 11 | +* Supporting fixed execution order |
| 12 | +* Supporting reprocessing |
| 13 | + |
| 14 | +Applications |
| 15 | +------------ |
| 16 | + |
| 17 | +* The framework is used to build applications |
| 18 | +* Multiple applications based on the framework can run in parallel on the same host machine |
| 19 | +* Applications based on the framework can run in parallel with other applications not based on the framework |
| 20 | +* The framework does not support communication between different applications (except via service activities, see below) |
| 21 | + |
| 22 | +Activities |
| 23 | +---------- |
| 24 | + |
| 25 | +* Applications consist of activities |
| 26 | +* Activities are a means to structure applications into building blocks |
| 27 | +* Activities have init(), step() and shutdown() entry points |
| 28 | +* The framework provides the following APIs to the activities running on it: |
| 29 | + - Read time (feo::time) |
| 30 | + - Communicate to other activities (feo::com) |
| 31 | + - Log (feo::log) |
| 32 | + - Configuration parameters (feo::param) |
| 33 | + - Persistency (feo::pers) |
| 34 | +* There are two types of activities: |
| 35 | + - Application activities |
| 36 | + - Service activities |
| 37 | +* Application activities must only use APIs provided by the framework as defined above |
| 38 | +* Application activities are single-threaded: they cannot run outside of their entry points
| 39 | + and must not spawn other threads or processes
| 40 | +* Activities can be implemented in C++ or Rust; mixed systems with both
| 41 | + C++ and Rust activities are possible
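The three entry points can be pictured as a trait that every activity implements. The following is a minimal sketch in Rust, not the framework's actual API: the names `Activity`, `Counter` and `run_lifecycle` are assumptions made for illustration, and `run_lifecycle` merely stands in for the framework calling the entry points.

```rust
/// Hypothetical trait mirroring the three entry points named above.
pub trait Activity {
    fn init(&mut self);
    fn step(&mut self);
    fn shutdown(&mut self);
}

/// Example application activity that only counts its step() invocations.
#[derive(Default)]
pub struct Counter {
    pub initialized: bool,
    pub steps: u32,
}

impl Activity for Counter {
    fn init(&mut self) {
        self.initialized = true;
        self.steps = 0;
    }

    fn step(&mut self) {
        self.steps += 1;
    }

    fn shutdown(&mut self) {
        self.initialized = false;
    }
}

/// Stand-in for the framework (assumption): init once, step `cycles` times,
/// then shut down. The activity never runs outside these calls.
pub fn run_lifecycle(activity: &mut dyn Activity, cycles: u32) {
    activity.init();
    for _ in 0..cycles {
        activity.step();
    }
    activity.shutdown();
}
```

Calling `run_lifecycle(&mut Counter::default(), 3)` would invoke `step()` three times between one `init()` and one `shutdown()` call.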
| 42 | + |
| 43 | + |
| 44 | +Service Activities |
| 45 | +------------------ |
| 46 | + |
| 47 | +* Service activities are a means to interact with the outside world, e.g. via |
| 48 | + network communication, direct sensor input or direct actuator output |
| 49 | +* Service activities may also use APIs external to the framework |
| 50 | + (e.g. networking APIs, reading from external sensor devices, writing HW I/O, etc.) |
| 51 | +* Service activities run at the beginning ("input service activity") and at the end
| 52 | + ("output service activity") of a task chain (see below)
| 53 | +* Input service activities provide the input values to the application activities |
| 54 | + within the task chain, by means of communication |
| 55 | +* All input service activities must finish execution before the first application activity
| 56 | + is run. This can be achieved by proper setup of the chain dependencies (see below)
| 57 | +* There must be at least one input service activity |
| 58 | +* Output service activities consume output values from the application activities
| 59 | + calculated within the task chain and provide them to the outside world
| 60 | +* All output service activities must run after all application activities have
| 61 | + finished execution. This is achieved by proper setup of the chain dependencies (see below)
| 62 | +* There must be at least one output service activity |
| 63 | + |
| 64 | + |
| 65 | +Communication |
| 66 | +------------- |
| 67 | + |
| 68 | +* Application type activities can only communicate to other activities within |
| 69 | + the same application and using the provided communication API |
| 70 | +* Communication consists of sending and receiving messages on named topics |
| 71 | +* The receiver of a message on a topic does not know the sender, instead it only |
| 72 | + relies on the message itself independent of the source of the message |
| 73 | +* There can only be one sender per topic but multiple receivers |
| 74 | +* Optional: there can be multiple senders per topic |
| 75 | +* There is no publish/subscribe mechanism accessible to activities. Instead,
| 76 | + the set of known communication topics and the assignment of which activity
| 77 | + sends and receives to/from which topic is "runtime static"
| 78 | +* "Runtime static" means "static after the startup phase", i.e. during startup, the
| 79 | + framework can configure or build up communication connections, but as soon as the
| 80 | + run phase starts (where the activities' step() functions are called), the connections
| 81 | + are fixed and will not change any more
| 82 | +* Communication relations are typically configured in configuration files |
| 83 | +* Messages/topics are statically typed |
| 84 | +* Only messages of the matching type can be sent/received on a specific topic |
| 85 | +* The binary representation of messages is defined by the framework in order |
| 86 | + to support communication between activities implemented in different |
| 87 | + languages (C++/Rust) |
| 88 | +* Message types may be primitive types or complex (nested) types |
| 89 | +* Complex types can be built by using structs and arrays of types |
| 90 | +* Sending a message by an activity involves the following steps: |
| 91 | + - Call API to acquire a handle to a message buffer for a certain topic |
| 92 | + - Fill data into the provided memory buffer |
| 93 | + - Call API to send the message |
| 94 | +* Reception of a message by an activity involves the following steps: |
| 95 | + - Use API to receive message from a certain topic, this returns a handle to a data buffer |
| 96 | + - Read message data from data buffer |
| 97 | +* The receiver cannot modify the message. The framework will enforce this,
| 98 | + for example by using read-only types or by configuring memory protection of the OS
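The send and receive steps listed above can be sketched as follows. This is an illustrative in-process model only, assuming a single-slot topic without queuing; `Topic`, `Loan`, `loan`, `send`, `receive` and the `Speed` message type are hypothetical names, not the feo::com API. Read-only access for the receiver is modeled here by returning a shared reference.

```rust
/// Hypothetical single-slot topic; `T` is the static message type of the topic.
pub struct Topic<T> {
    latest: Option<T>,
}

/// Handle to a message buffer acquired by the sender (assumed name).
pub struct Loan<T> {
    pub buffer: T,
}

impl<T: Default> Topic<T> {
    pub fn new() -> Self {
        Topic { latest: None }
    }

    /// Sending, step 1: acquire a handle to a message buffer for this topic.
    pub fn loan(&self) -> Loan<T> {
        Loan { buffer: T::default() }
    }

    /// Sending, step 3: publish the buffer after the sender has filled it.
    pub fn send(&mut self, loan: Loan<T>) {
        self.latest = Some(loan.buffer);
    }

    /// Receiving: returns a shared reference, so the receiver can read
    /// the message data but cannot modify it.
    pub fn receive(&self) -> Option<&T> {
        self.latest.as_ref()
    }
}

/// Example message type: a struct built from a primitive type.
#[derive(Default, Debug, PartialEq)]
pub struct Speed {
    pub meters_per_second: f32,
}
```

A sender would call `loan()`, write into `loan.buffer` (step 2), and pass the handle to `send()`. Because the topic is generic over one message type, sending a message of a different type on it is rejected at compile time in this sketch.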
| 99 | + |
| 100 | +Queuing: |
| 101 | +* Queuing can be enabled per topic, a queue of length N means that the last |
| 102 | + N messages are kept for a specific topic |
| 103 | +* Receivers have access to the last N elements. Reading an element from the
| 104 | + queue does not change the queue, i.e. does not remove the element from the queue.
| 105 | + Instead, all receivers will always see the last N elements
| 106 | +* Optional: a queue pointer to the element last read is maintained per receiver.
| 107 | + However, the queue with its buffers still only exists once per topic. If one receiver
| 108 | + receives an element from the queue, its queue pointer is incremented so that next
| 109 | + time it reads the next element. This does not affect the queue pointers of other receivers
| 110 | +* Queue enable and queue length are "runtime static" configuration settings |
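The queuing behavior, including the optional per-receiver queue pointer, can be sketched as below. This is one possible model under stated assumptions, not the framework's implementation: `TopicQueue`, `last_n` and `read_next` are hypothetical names, and the cursor is held by the receiver as a plain counter of messages already read.

```rust
use std::collections::VecDeque;

/// Hypothetical per-topic queue that keeps only the last `capacity` messages.
pub struct TopicQueue<T> {
    capacity: usize,
    messages: VecDeque<T>,
    /// Total number of messages ever sent; used to position receiver cursors.
    sent: u64,
}

impl<T> TopicQueue<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "queue length must be at least 1");
        TopicQueue { capacity, messages: VecDeque::with_capacity(capacity), sent: 0 }
    }

    /// Appends a message, dropping the oldest one once the queue is full.
    pub fn send(&mut self, message: T) {
        if self.messages.len() == self.capacity {
            self.messages.pop_front();
        }
        self.messages.push_back(message);
        self.sent += 1;
    }

    /// Non-consuming view of the last N messages, oldest first.
    pub fn last_n(&self) -> Vec<&T> {
        self.messages.iter().collect()
    }

    /// Optional variant: read the next unread message for one receiver.
    /// `cursor` belongs to that receiver; other receivers are unaffected.
    pub fn read_next(&self, cursor: &mut u64) -> Option<&T> {
        let oldest = self.sent - self.messages.len() as u64;
        if *cursor < oldest {
            // Skip messages that have already dropped out of the queue.
            *cursor = oldest;
        }
        if *cursor >= self.sent {
            return None;
        }
        let index = (*cursor - oldest) as usize;
        *cursor += 1;
        self.messages.get(index)
    }
}
```

Both read paths take `&self`, which reflects that reading never removes an element: the queue exists once per topic, and only the caller's own cursor advances.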
| 111 | + |
| 112 | + |
| 113 | +Process/Thread Mapping |
| 114 | +---------------------- |
| 115 | + |
| 116 | +* An application consists of one or more processes |
| 117 | +* One of the processes is the primary process |
| 118 | +* If there is more than one process, the other processes are secondary processes |
| 119 | +* There can be one or more threads per process |
| 120 | +* The number of processes and threads is statically defined and |
| 121 | + does not change once the application has been started (runtime static) |
| 122 | +* Activities are statically mapped to threads within processes within the application |
| 123 | +* There can be multiple activities mapped to the same thread |
| 124 | + |
| 125 | +* There is one executable per process, so an application may consist of multiple executables |
| 126 | +* Each executable contains part of this framework as well as the activities mapped to the |
| 127 | + corresponding process |
| 128 | +* It is assumed that an external entity starts all the executables belonging to the
| 129 | + same application. This is because, for security reasons, only very
| 130 | + specific entities should have the ability to create processes
| 131 | +* The executables belonging to an application are grouped (e.g. in the filesystem) so that |
| 132 | + it's clear that they belong together |
| 133 | +* One reason for having multiple processes per application is to |
| 134 | + achieve Freedom From Interference for safety relevant applications |
| 135 | + |
| 136 | + |
| 137 | +Lifecycle |
| 138 | +--------- |
| 139 | + |
| 140 | +* The lifecycle of an application consists of 3 phases: |
| 141 | + - startup phase |
| 142 | + - run phase |
| 143 | + - shutdown phase |
| 144 | +* During the startup phase, the primary process connects with the secondary processes
| 145 | + (if present), in order to: |
| 146 | + - Build up connections for communication (e.g. find shared memory segments |
| 147 | + provided/consumed) |
| 148 | + - Connect to the parameter service |
| 149 | + - Coordinate the init and later the shutdown process |
| 150 | + - Coordinate the execution of the task chain (see below) |
| 151 | +* During the shutdown phase, the primary process coordinates the shutdown of |
| 152 | + all secondary processes |
| 153 | +* The connection between primary and secondary processes is kept up as long as the |
| 154 | + application is running |
| 155 | +* If the connection breaks down unexpectedly while the application is running, |
| 156 | + the involved processes terminate (either by a command from the primary process |
| 157 | + or by detecting connection loss to the primary process) |
| 158 | + |
| 159 | +Activity Init: |
| 160 | +* At the end of the startup phase, the framework will invoke the init() entry point |
| 161 | + of each activity |
| 162 | +* The init() entry point will be invoked in the thread the activity is mapped to |
| 163 | +* The order of invoking the init() entry points across activities is not defined, |
| 164 | + invocation may happen in parallel or sequentially |
| 165 | + |
| 166 | +Activity Shutdown: |
| 167 | +* At the beginning of the shutdown phase, the framework will invoke the shutdown()
| 168 | + entry point of each activity
| 169 | +* The shutdown() entry point will be invoked in the thread the activity is mapped to |
| 170 | +* The order of invoking the shutdown() entry points across activities is not defined, |
| 171 | + invocation may happen in parallel or sequentially |
| 172 | + |
| 173 | + |
| 174 | +Scheduling |
| 175 | +---------- |
| 176 | + |
| 177 | +* Activities are arranged in a task chain |
| 178 | +* There is exactly one task chain per application |
| 179 | +* The task chain describes the execution order of the activities in the run phase |
| 180 | +* Task chains run cyclically, e.g. every 30ms |
| 181 | +* Optional: task chains can be triggered by an event
| 182 | +* All activities are executed once per task chain run |
| 183 | +* All activities finish within a single task chain run |
| 184 | +* Running an activity means that the framework is calling its step() function |
| 185 | + within the process/thread it has been mapped to |
| 186 | +* The execution order is defined by a dependency model: |
| 187 | + - Each activity can depend on N other activities in the same task chain |
| 188 | + - An activity's step() function gets called only after the step()
| 189 | + functions of all activities it depends on have finished
| 190 | +* The framework takes care to run the activities in this order, |
| 191 | + independent of the thread/process the activity is mapped to |
| 192 | +* While the order is guaranteed, there is no guarantee that an activity is
| 193 | + run immediately after all its dependencies have finished.
| 194 | + For example, if two activities mapped to the same thread are ready to run
| 195 | + at the same time, they can still only run one after the other
| 196 | +* Note however, that for a particular (static) setup of threads, processes |
| 197 | + and activity mapping, the invocation delay is deterministic |
| 198 | + (apart from differences in the activity execution times) |
| 199 | +* The execution order and the exact point in time when an activity is run |
| 200 | + is independent of any communication an activity might do |
| 201 | +* The dependencies should be defined by the application developer in a way so that |
| 202 | + processing results passed via communication are available when they are needed |
| 203 | + (if an activity needs an output of another activity it sets that other |
| 204 | + activity as its dependency and therefore will only run once the other one |
| 205 | + is finished and therefore has produced the results the first one needs) |
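The dependency model can be turned into a concrete execution order with a topological sort. The sketch below uses Kahn's algorithm as one possible approach; the document does not state which algorithm the framework uses. The function name `execution_order` and the representation of activities as indices are assumptions for illustration, and the sketch yields a single sequential order, ignoring the thread/process mapping.

```rust
/// Computes one valid execution order for a task chain.
/// `deps[i]` lists the indices of the activities that activity `i` depends on.
/// Returns `None` if the dependencies contain a cycle or an unknown index.
pub fn execution_order(deps: &[Vec<usize>]) -> Option<Vec<usize>> {
    let n = deps.len();
    // Number of dependencies of each activity that have not finished yet.
    let mut remaining: Vec<usize> = deps.iter().map(|d| d.len()).collect();
    // Reverse edges: which activities wait for a given activity.
    let mut dependents: Vec<Vec<usize>> = vec![Vec::new(); n];
    for (activity, list) in deps.iter().enumerate() {
        for &dep in list {
            if dep >= n {
                return None;
            }
            dependents[dep].push(activity);
        }
    }
    // Activities without dependencies are ready immediately.
    let mut ready: Vec<usize> = (0..n).filter(|&a| remaining[a] == 0).collect();
    let mut order = Vec::with_capacity(n);
    while let Some(activity) = ready.pop() {
        order.push(activity);
        for &next in &dependents[activity] {
            remaining[next] -= 1;
            if remaining[next] == 0 {
                ready.push(next);
            }
        }
    }
    // If some activity never became ready, the dependencies form a cycle.
    if order.len() == n {
        Some(order)
    } else {
        None
    }
}
```

For a chain with an input service activity 0, two application activities 1 and 2 that each depend on 0, and an output service activity 3 that depends on both, `execution_order(&[vec![], vec![0], vec![0], vec![1, 2]])` would return an order that starts with 0 and ends with 3.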
| 206 | + |
| 207 | + |
| 208 | +Executor and Agents |
| 209 | +-------------------- |
| 210 | + |
| 211 | +* The coordinating entity in the primary process is the "executor" |
| 212 | +* The executor coordinates the invocation of the activities in the |
| 213 | + order as described above |
| 214 | +* As a central entity the executor is able to trace, record or monitor the
| 215 | + system behavior as a sequence of activity invocations (see below)
| 216 | +* The actual activity invocation is done by an "agent" |
| 217 | +* The agent exists in each process belonging to an application |
| 218 | +* The agent connects to the executor during the startup phase |
| 219 | +* The agent takes invocation commands sent by the executor and
| 220 | + executes them in its local process on behalf of the executor
| 221 | + |
| 222 | + |
| 223 | +External state |
| 224 | +-------------- |
| 225 | + |
| 226 | +* Depending on the reprocessing scenario (see below) it might be necessary
| 227 | + to put the activities into a well-defined state. This can be done either
| 228 | + by providing all the input to the activities which they need to get
| 229 | + into that state (which could involve many task chain invocations)
| 230 | + or by letting the framework record activity state just as it
| 231 | + records communication messages
| 232 | +* External state is a means to make activity state recordable |
| 233 | +* Using external state, activities don't hold their state in activity-local
| 234 | + variables (like C++ member variables) but in a state storage provided
| 235 | + by the framework. This way, they "do not remember anything" from the
| 236 | + last task chain invocation. Instead, on every new task chain invocation,
| 237 | + they first read in the external state from the framework-provided storage,
| 238 | + then potentially manipulate the state based on their inputs and then
| 239 | + store it back for the next task chain invocation
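The read, update and write-back pattern for external state can be sketched as below. `StateStore` and `RunningSum` are hypothetical names, and a single `u64` stands in for arbitrary activity state; the real state storage API is not defined in this document.

```rust
/// Hypothetical framework-provided storage for one activity's external state.
#[derive(Default)]
pub struct StateStore {
    slot: Option<u64>,
}

impl StateStore {
    /// Returns the stored state, or 0 if nothing has been stored yet.
    pub fn load(&self) -> u64 {
        self.slot.unwrap_or(0)
    }

    pub fn store(&mut self, value: u64) {
        self.slot = Some(value);
    }
}

/// Activity without member variables: all of its state lives in the store.
pub struct RunningSum;

impl RunningSum {
    pub fn step(&self, store: &mut StateStore, input: u64) -> u64 {
        let state = store.load(); // 1. read the external state
        let updated = state + input; // 2. update it based on the inputs
        store.store(updated); // 3. write it back for the next invocation
        updated
    }
}
```

Because `RunningSum` holds no data of its own, a replay can start from any recorded state: restoring the store to a recorded value and calling `step()` continues exactly where the recording left off.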
| 240 | + |
| 241 | + |
| 242 | +Tracing |
| 243 | +------- |
| 244 | + |
| 245 | +* The framework can record all messages going over its communication topics |
| 246 | +* For each message the recording includes: |
| 247 | + - topic |
| 248 | + - data |
| 249 | + - timestamp |
| 250 | + - sender [optional] |
| 251 | +* The framework can record certain execution events: |
| 252 | + - task chain start/end |
| 253 | + - init/step/shutdown() entry point enter per activity |
| 254 | + - init/step/shutdown() entry point leave per activity |
| 255 | +* For each event the recording includes: |
| 256 | + - type (e.g. step_enter) |
| 257 | + - context (e.g. activity name of step() entered) |
| 258 | + - timestamp |
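An execution event record with the three fields listed above could look like the following sketch. The type and field names (`EventType`, `TraceEvent`, `Recorder`, `timestamp_ns`) and the nanosecond unit are assumptions, not part of the specification. The helper `step_duration_ns` illustrates one use of the enter/leave pairs: deriving an activity's execution time from a recording.

```rust
/// Hypothetical event types matching the execution events listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EventType {
    ChainStart,
    ChainEnd,
    InitEnter,
    InitLeave,
    StepEnter,
    StepLeave,
    ShutdownEnter,
    ShutdownLeave,
}

/// One recorded event: type, context and timestamp.
#[derive(Debug, Clone, PartialEq)]
pub struct TraceEvent {
    pub event_type: EventType,
    pub context: String,
    pub timestamp_ns: u64,
}

/// In-memory recorder; events are kept in the order they were recorded.
#[derive(Default)]
pub struct Recorder {
    pub events: Vec<TraceEvent>,
}

impl Recorder {
    pub fn record(&mut self, event_type: EventType, context: &str, timestamp_ns: u64) {
        self.events.push(TraceEvent {
            event_type,
            context: context.to_string(),
            timestamp_ns,
        });
    }

    /// Time between the first step_enter of `activity` and the next
    /// step_leave of the same activity, if both were recorded.
    pub fn step_duration_ns(&self, activity: &str) -> Option<u64> {
        let enter = self
            .events
            .iter()
            .position(|e| e.event_type == EventType::StepEnter && e.context == activity)?;
        let leave = self.events[enter..]
            .iter()
            .find(|e| e.event_type == EventType::StepLeave && e.context == activity)?;
        Some(leave.timestamp_ns.saturating_sub(self.events[enter].timestamp_ns))
    }
}
```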
| 259 | + |
| 260 | + |
| 261 | +Reprocessing |
| 262 | +------------ |
| 263 | + |
| 264 | +* There are multiple possible reprocessing scenarios, for example: |
| 265 | + - replay of one or many executions of a task chain |
| 266 | + - replay of one or many executions of a single activity |
| 267 | +* In a replay scenario, the framework is used to reproduce the communication messages |
| 268 | + and other API behavior (e.g. time, parameters, persistency) as was |
| 269 | + recorded in a previous run |
| 270 | +* In case a whole task chain is reprocessed, the outputs of the input service activities
| 271 | + will be reproduced
| 272 | +* In case only a single activity is reprocessed, the outputs of the predecessors |
| 273 | + in the task chain will be reproduced |
| 274 | +* Outputs of application activities are typically not replayed but |
| 275 | + freshly calculated by the activities running during the replay |
| 276 | +* The framework supports reprocessing by:
| 277 | + - Starting a task chain at the same point in time as recorded |
| 278 | + - Replaying communication data as recorded |
| 279 | + - Providing time via its time API as recorded |
| 280 | + |
| 281 | + |