SDO Thread refactor

        I spent some time looking and thinking about this function and, although this works, I think we are overburdening the SDO thread with error handling. I think it would be cleaner to have a separate thread for the latter. It seems to me that handling SDO read/write requests is relatively independent of handling errors such as the EMCYs from the Elmos.

A couple of points that make me think this way:
- Ideally, `sdo_thread_cond` should be an indication that the predicate (`queue_is_empty(&self->jsd_sdo_req_cirq)`, i.e. state of queue) might have changed. This is the case when we signal from `jsd_sdo_push_async_request`, but it is not the case when we signal from the EGD driver in the fault state (indirectly through `jsd_read`).
- When the signal comes from `jsd_sdo_push_request` we go over the error handling code before we can get to the point of servicing the SDO request queue, even if there are no errors present.

I am imagining something like this:

```
void* sdo_thread_loop(void* void_data)
{
  while (true)
  {
    pthread_mutex_lock(&self->jsd_sdo_req_cirq.mutex);

    while (queue_is_empty(&self->jsd_sdo_req_cirq))
      pthread_cond_wait(&self->sdo_thread_cond, &self->jsd_sdo_req_cirq.mutex);

    jsd_sdo_req_t req = queue_pop(&self->jsd_sdo_req_cirq);

    pthread_mutex_unlock(&self->jsd_sdo_req_cirq.mutex);

    if (self->sdo_join_flag) {
      return NULL;
    }

    // Submit the SDO request with ecx_SDOwrite/ecx_SDOread

    // Queue the response.
    jsd_sdo_req_cirq_push(&self->jsd_sdo_res_cirq, req);
  }
}
```
We can prescind from  `pthread_cond_timedwait` above because even if we miss a signal while servicing the one received, the queue will not be empty after that second signal, so when we get to the while loop on the predicate, we will skip the wait. 

Similarly for error handling:
```
void* error_thread_loop(void* void_data)
{
  while (true)
  {
    pthread_mutex_lock(&self->error_handling_mutex);

    while (!self->errors_available)
      // Set ts based on how frequent we want to finish servicing the errors beyond MAX_ERRORS_PER_LOOP
      pthread_cond_timedwait(&self->error_handling_cond, &self->error_handling_mutex, &ts);

    pthread_mutex_unlock(&self->error_handling_mutex);

    if (self->error_handling_join_flag) {
      return NULL;
    }

    // Parse through errors with ecx_mbxreceive, ecx_iserror, and ecx_poperror, and queue them with jsd_error_cirq_push(&self->slave_errors[err.Slave], err)

    if (handled_errors < MAX_ERRORS_PER_LOOP) {
      // We are sure we handled all errors present.
      pthread_mutex_lock(&self->error_handling_mutex);
      self->errors_available =  false;
      pthread_mutex_unlock(&self->error_handling_mutex);
    }
  }
}
```
Above, if `ecx_mbxreceive` is real-time safe, we could have a single `pthread_mutex_unlock(&self->error_handling_mutex)` at the end of the outer while loop (and one before the `return NULL`).

And in `egd.c`:
...
```
if(state->pub.actual_state_machine_state == JSD_EGD_STATE_MACHINE_STATE_FAULT){
  pthread_mutex_lock(&self->error_handling_mutex);
  self->errors_available =  true;
  pthread_cond_signal(&self->error_handling_cond);
  pthread_mutex_unlock(&self->error_handling_mutex);
```
...

In my opinion having the functions separate is simpler to read/understand and probably maintain.

_Originally posted by @d-loret in https://github.com/nasa-jpl/jsd/pull/46#discussion_r1005117180_
      

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

SDO Thread refactor #56

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

SDO Thread refactor #56

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions