esp_http_server: Handling stale websocket connections (IDFGH-17036) #18080

@nebkat

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

The HTTP server does not currently handle "stale" WebSocket connections well, i.e. connections where the peer has gone offline without closing the TCP socket. We have run into this in our application, which sends messages from the ESP32 to a browser over WebSockets.

We queue 10 messages per second to the HTTP task, to be sent with httpd_ws_send_frame_async(). Once the peer disappears, each send operation blocks until it times out (by default after 5 seconds). When a send fails, we ask the server to close the session:

// Runs on the HTTP worker task; on a failed send, request that the session be closed.
if (httpd_ws_send_frame_async(http_handle, fd, frame) != ESP_OK) {
    httpd_sess_trigger_close(http_handle, fd);
}

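Internally, httpd_sess_trigger_close() calls httpd_queue_work(handle, httpd_sess_close, session), so the close is placed at the end of the work queue, behind the 50(!) send operations that have accumulated by then (10 messages per second over the 5-second timeout). Each of those sends has to time out in turn before the close runs, so we get a seemingly endless stream of log messages: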

W (1764610736790) httpd_ws: httpd_ws_send_frame_async: Failed to send WS header
W (1764610741789) httpd_txrx: httpd_sock_err: error in send : 11
W (1764610741790) httpd_ws: httpd_ws_send_frame_async: Failed to send WS header
W (1764610746789) httpd_txrx: httpd_sock_err: error in send : 11
....
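
To make the ordering problem concrete, here is a toy model of a FIFO work queue in plain C. It is not the esp_http_server implementation: `work_queue`, `queue_push`, `simulate_stale_peer` and `struct stall` are hypothetical names, and the model assumes that every queued send blocks for the full send timeout before failing.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model of a FIFO work queue; NOT the esp_http_server implementation. */
enum work_kind { WORK_SEND, WORK_CLOSE };

#define QUEUE_CAP 128

struct work_queue {
    enum work_kind items[QUEUE_CAP];
    int head; /* index of the next item the worker will run */
    int tail; /* index one past the last queued item */
};

struct stall {
    int sends_ahead_of_close; /* sends the worker must run before the close */
    int seconds_until_close;  /* time spent in timeouts before the close runs */
};

static void queue_push(struct work_queue *q, enum work_kind kind)
{
    assert(q->tail < QUEUE_CAP); /* the toy queue has a fixed capacity */
    q->items[q->tail++] = kind;
}

struct stall simulate_stale_peer(int msgs_per_sec, int send_timeout_sec)
{
    struct work_queue q = {0};

    /* While the first failing send blocks, the producer keeps queueing. */
    for (int i = 0; i < msgs_per_sec * send_timeout_sec; i++) {
        queue_push(&q, WORK_SEND);
    }

    /* The first send has failed; the queued close lands at the tail. */
    queue_push(&q, WORK_CLOSE);

    /* The worker drains the queue in order until it reaches the close. */
    struct stall result = {0, 0};
    while (q.head < q.tail && q.items[q.head] != WORK_CLOSE) {
        q.head++;
        result.sends_ahead_of_close++;
        result.seconds_until_close += send_timeout_sec; /* assumed full timeout */
    }
    return result;
}
```

With the numbers from this issue, `simulate_stale_peer(10, 5)` reports 50 sends ahead of the close. Under the model's assumption that each of them blocks for the full 5 seconds, the worker spends 250 seconds timing out before the close item is reached.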

Our workaround for now is to add an httpd_sess_trigger_close_async() function that calls httpd_sess_close() directly instead of queueing it.

But this has me wondering: is there any reason why the server shouldn't automatically close a session when it gets a socket send (and possibly receive) error?
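
As a rough illustration of what that behaviour could look like, here is a second toy model in plain C. Everything in it is hypothetical (`toy_session`, `toy_send`, `count_send_timeouts`), and it assumes that a send to an already-closed session fails immediately instead of blocking; it is a sketch of the policy, not a proposal for the actual server code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy session; a stand-in for per-socket state, not the server's real struct. */
struct toy_session {
    bool peer_alive; /* false once the peer has silently gone away */
    bool open;       /* false once the server has closed the session */
};

/* Hypothetical send. Returns true on success; *timed_out is set only when
 * the call would have blocked until the send timeout expired. */
static bool toy_send(const struct toy_session *s, bool *timed_out)
{
    *timed_out = false;
    if (!s->open) {
        return false;  /* closed session: assumed to fail fast */
    }
    if (s->peer_alive) {
        return true;
    }
    *timed_out = true; /* stale peer: block until the timeout, then fail */
    return false;
}

/* Run `queued_sends` sends against a stale peer and count how many of them
 * block until the timeout (one pair of warning lines each in the log above). */
int count_send_timeouts(int queued_sends, bool close_on_error)
{
    assert(queued_sends >= 0);
    struct toy_session s = { .peer_alive = false, .open = true };
    int timeouts = 0;

    for (int i = 0; i < queued_sends; i++) {
        bool timed_out;
        if (!toy_send(&s, &timed_out)) {
            if (timed_out) {
                timeouts++;
            }
            if (close_on_error) {
                s.open = false; /* close right away, not via the work queue */
            }
        }
    }
    return timeouts;
}
```

In this model, 50 queued sends produce 50 timeouts when errors are ignored, but only one when the first send error closes the session, because the remaining sends then fail fast.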


Side-note: the terminology used in the httpd functions is rather confusing:

  • httpd_ws_send_frame() is just a combination of httpd_req_to_sockfd() and httpd_ws_send_frame_async()
  • httpd_ws_send_frame_async() performs a synchronous send and must be called on the worker task (it is only "async" in the sense that it doesn't require an HTTP request to obtain the FD)
  • httpd_ws_send_data() can be performed outside of the worker task, and blocks until the send is complete
  • httpd_ws_send_data_async() can be performed outside of the worker task but does not block
  • httpd_sess_trigger_close() can be performed outside of the worker task

IMO these should become (current name in parentheses):

  • httpd_ws_resp_frame() (httpd_ws_send_frame())
  • httpd_ws_send_frame() (httpd_ws_send_frame_async())
  • httpd_queue_ws_send_frame_blocking() (httpd_ws_send_data())
  • httpd_queue_ws_send_frame_cb() (httpd_ws_send_data_async())
  • httpd_queue_sess_close() (httpd_sess_trigger_close())
  • httpd_sess_close() (NEW httpd_sess_trigger_close_async())
