[Bug] Task Cleanup Race Condition and Reconnection Issues #1103

@Vladislavert

Describe the bug

Crash on Session Cleanup (ESP-IDF Backend)

Summary

When a zenoh-pico client on ESP32 loses its connection to the zenohd router (e.g., the router is stopped), the ESP32 crashes with a spinlock assertion failure during task cleanup.

Crash Log

assert failed: spinlock_acquire spinlock.h:142 (lock->count == 0)

Backtrace: 0x40376d92:0x3fcbba80 0x4037e3b1:0x3fcbbaa0 ...

0x4037ae73: xEventGroupSetBits at /idf/components/freertos/FreeRTOS-Kernel/event_groups.c:245
0x42015fda: z_task_wrapper at zenoh-pico/src/system/espidf/system.c:64

Root Cause

In src/system/espidf/system.c, z_task_wrapper calls xEventGroupSetBits on the task's event group and then immediately calls vTaskDelete(NULL).
Once the bit is set, nothing stops _z_task_free from running right away (e.g. on another core) and freeing the event group while xEventGroupSetBits is still using it.
xEventGroupSetBits then accesses freed memory, which triggers the spinlock assertion shown above.

Workaround

I patched src/system/espidf/system.c to use vTaskSuspend(NULL) instead of vTaskDelete(NULL), and modified _z_task_free() to properly wait for task completion before deleting resources:

// In z_task_wrapper, after xEventGroupSetBits:
vTaskSuspend(NULL);  // Instead of vTaskDelete(NULL); the task parks until _z_task_free deletes it

// In _z_task_free:
// Wait up to 500 ms for the task to signal completion
xEventGroupWaitBits(ptr->join_event, 1, pdFALSE, pdFALSE, pdMS_TO_TICKS(500));
if (ptr->handle != NULL) {
    vTaskDelete(ptr->handle);
    ptr->handle = NULL;
}
vEventGroupDelete(ptr->join_event);  // Instead of z_free(ptr->join_event)
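
The ordering this workaround relies on (signal completion, stop touching the sync object, and only then let the owner free it) can be modelled on a host machine with POSIX threads. This is a rough sketch and not ESP-IDF or zenoh-pico code: a pthread mutex and condition variable stand in for join_event, pthread_join stands in for deleting the suspended task, and every model_* name is hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the task object; NOT zenoh-pico's real type. */
typedef struct {
    pthread_t handle;
    pthread_mutex_t lock;    /* lock + done_cv play the role of join_event */
    pthread_cond_t done_cv;
    bool done;
    void *(*fun)(void *);
    void *arg;
} model_task_t;

/* Counterpart of z_task_wrapper: run the body, then signal completion.
 * The thread still touches lock/done_cv while signalling, so those
 * objects must stay alive until the thread has left this function. */
static void *model_task_wrapper(void *p) {
    model_task_t *t = (model_task_t *)p;
    t->fun(t->arg);
    pthread_mutex_lock(&t->lock);
    t->done = true;
    pthread_cond_signal(&t->done_cv);
    pthread_mutex_unlock(&t->lock);
    return NULL; /* the FreeRTOS workaround parks here via vTaskSuspend(NULL) */
}

model_task_t *model_task_start(void *(*fun)(void *), void *arg) {
    model_task_t *t = calloc(1, sizeof(*t));
    if (t == NULL) return NULL;
    t->fun = fun;
    t->arg = arg;
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->done_cv, NULL);
    if (pthread_create(&t->handle, NULL, model_task_wrapper, t) != 0) {
        pthread_cond_destroy(&t->done_cv);
        pthread_mutex_destroy(&t->lock);
        free(t);
        return NULL;
    }
    return t;
}

/* Counterpart of the patched _z_task_free. */
int model_task_free(model_task_t *t) {
    /* 1. Wait for the completion signal. */
    pthread_mutex_lock(&t->lock);
    while (!t->done) pthread_cond_wait(&t->done_cv, &t->lock);
    pthread_mutex_unlock(&t->lock);
    /* 2. Make sure the thread is no longer inside the signalling code. */
    pthread_join(t->handle, NULL);
    /* 3. Only now release the synchronisation objects and the task. */
    pthread_cond_destroy(&t->done_cv);
    pthread_mutex_destroy(&t->lock);
    free(t);
    return 0;
}

/* Tiny example body: increments the int that arg points to. */
void *model_increment(void *arg) {
    ++*(int *)arg;
    return NULL;
}
```

Starting a task with model_task_start(model_increment, &counter) and then calling model_task_free on the result tears everything down in the safe order. Swapping steps 2 and 3 in model_task_free would reintroduce the kind of use-after-free described under Root Cause.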

Related Issue

This appears related to #1033.

To reproduce

  1. Flash and run the standard z_pub example on ESP32
  2. Start an instance of zenohd (docker run --rm --init --net host eclipse/zenoh:latest -l udp/0.0.0.0:7447)
  3. Stop zenohd
  4. Start zenohd again
  5. zenoh-pico crashes when FreeRTOS attempts to acquire a spinlock in freed memory

System info

  • zenoh-pico version: 1.6.2
  • zenohd version: 1.6.2
  • Platform: ESP32-S3, ESP-IDF v5.x
  • Transport: UDP unicast
