Skip to content

Conversation

@jimmyw
Copy link
Contributor

@jimmyw jimmyw commented Oct 30, 2025

Description

Its easy to think that an write to an uart will always succeed eventually, but that not always the case. If you enable hardware flow control, and have someone on the receiving end, that is stubborn never to release the ReadyToReceive line.

In some cases this has caused a deadlock, when both devices are both sitting and waiting for the other to give up.


Note

Introduces uart_write_bytes_with_timeout() and refactors TX handling to support timeouts and partial-byte reporting across FIFO and ring buffer paths.

  • Driver (UART):
    • New API: uart_write_bytes_with_timeout(uart_port_t, const void*, size_t, TickType_t) declared in include/driver/uart.h and implemented in src/uart.c.
    • TX Path Refactor:
      • Internal uart_tx_all() now takes TickType_t ticks_to_wait, enforces timeouts on tx_mux, ring buffer acquire, FIFO semaphores, and break wait; returns bytes sent or -1 on timeout.
      • Ring buffer flow switched to xRingbufferSendAcquire/xRingbufferSendComplete with explicit description item, uses current free size to chunk writes, tracks bytes_transmitted.
      • FIFO (no TX buffer) path honors ticks_to_wait and supports partial sends.
    • API Wiring: uart_write_bytes() and uart_write_bytes_with_break() updated to call uart_tx_all(..., portMAX_DELAY); new uart_write_bytes_with_timeout() forwards caller-specified timeout.
    • Docs: Header comments updated to describe timeout behavior and return values.

Written by Cursor Bugbot for commit 6dd58b8. This will update automatically on new commits. Configure here.

@github-actions
Copy link

github-actions bot commented Oct 30, 2025

Messages
📖 🎉 Good Job! All checks are passing!

👋 Hello jimmyw, we appreciate your contribution to this project!


📘 Please review the project's Contributions Guide for key guidelines on code, documentation, testing, and more.

🖊️ Please also make sure you have read and signed the Contributor License Agreement for this project.

Click to see more instructions ...


This automated output is generated by the PR linter DangerJS, which checks if your Pull Request meets the project's requirements and helps you fix potential issues.

DangerJS is triggered with each push event to a Pull Request and modify the contents of this comment.

Please consider the following:
- Danger mainly focuses on the PR structure and formatting and can't understand the meaning behind your code or changes.
- Danger is not a substitute for human code reviews; it's still important to request a code review from your colleagues.
- To manually retry these Danger checks, please navigate to the Actions tab and re-run last Danger workflow.

Review and merge process you can expect ...


We do welcome contributions in the form of bug reports, feature requests and pull requests via this public GitHub repository.

This GitHub project is public mirror of our internal git repository

1. An internal issue has been created for the PR, we assign it to the relevant engineer.
2. They review the PR and either approve it or ask you for changes or clarifications.
3. Once the GitHub PR is approved, we synchronize it into our internal git repository.
4. In the internal git repository we do the final review, collect approvals from core owners and make sure all the automated tests are passing.
- At this point we may do some adjustments to the proposed change, or extend it by adding tests or documentation.
5. If the change is approved and passes the tests it is merged into the default branch.
5. On next sync from the internal git repository merged change will appear in this public GitHub repository.

Generated by 🚫 dangerJS against 7af818b

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the final PR Bugbot will review for you during this billing cycle

Your free Bugbot reviews will reset on November 20

Details

Your team is on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle for each member of your team.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

@github-actions github-actions bot changed the title feat(esp_driver_uart): Add uart_write_bytes_with_timeout() feat(esp_driver_uart): Add uart_write_bytes_with_timeout() (IDFGH-16708) Oct 30, 2025
@espressif-bot espressif-bot added the Status: Opened Issue is new label Oct 30, 2025
@jimmyw jimmyw force-pushed the jimmyw/uart_write_with_timeout branch from 9914847 to bcb8e15 Compare November 21, 2025 12:21
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the final PR Bugbot will review for you during this billing cycle

Your free Bugbot reviews will reset on December 20

Details

Your team is on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle for each member of your team.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

@jimmyw jimmyw force-pushed the jimmyw/uart_write_with_timeout branch from bcb8e15 to 80046b1 Compare November 22, 2025 21:45
xRingbufferSendComplete(p_uart_obj[uart_num]->tx_ring_buf, tx_data);

// We checked the available space before, so the acquire should always succeed here
xRingbufferSend(p_uart_obj[uart_num]->tx_ring_buf, (void *)(src + offset), send_size, portMAX_DELAY);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Timeout not respected for data chunk transmission

The buffered TX path uses portMAX_DELAY when sending data chunks to the ring buffer, ignoring the user-specified ticks_to_wait. If the ring buffer unexpectedly blocks (due to a race condition or implementation bug), the function hangs forever instead of timing out as the API promises. This makes uart_write_bytes_with_timeout() unable to guarantee timeout behavior when using a TX buffer.

Fix in Cursor Fix in Web

@jimmyw jimmyw marked this pull request as draft November 24, 2025 11:11
@jimmyw jimmyw force-pushed the jimmyw/uart_write_with_timeout branch 2 times, most recently from a971748 to 4eb65e6 Compare November 24, 2025 11:23
This makes sure we wont break between header and data, fixes a proposed
spinlock, and simplifies locking.
@jimmyw jimmyw force-pushed the jimmyw/uart_write_with_timeout branch from 4eb65e6 to 2d7d921 Compare November 24, 2025 11:43
Avoids locking if the uart is blocked due to flow control.
@jimmyw jimmyw force-pushed the jimmyw/uart_write_with_timeout branch from 2d7d921 to 7af818b Compare November 24, 2025 12:21
@songruo
Copy link
Collaborator

songruo commented Nov 27, 2025

Hi @jimmyw,

Thanks for raising the PR!

Is there a specific reason why you want to change the way of data sending to atomic (header+payload) style? I am afraid the logic change is too big for the current UART driver. Also, the atomic (header+payload) way would take more space in the ring buffer, since every data chunk needs a header now.

It is reasonable to introduce a new API to have the timeout parameter to allow the bytes writing to return when getting blocked. However, this can be achieved in a relatively simple way: by just changing the portMAX_DELAY in uart_tx_all function to the timeout parameter, right?

@jimmyw
Copy link
Contributor Author

jimmyw commented Nov 27, 2025

@songruo

Hi, thanks for your reply.

Yes, there is a very good reason for doing header + data in one atomic operation, and that is if you manage to fit a header in the buffer, and and not data (Due to timeout) the uart will get of and think that the next header you add is the data, as it expects a data after header.

I also think the code is cleaner, and more understandable this way, and with less overhead as we do less locking.

It wont take much more buffer space, every write is currently an header + data today, even if you only write 1 byte to the uart. Its only when you write very big chunks in the buffer, it will be split up in 4 parts just to improve buffer stale lock handling. you dont want to fill up the buffer with a single piece of data, as that will dma block the full messagebuffer, and a less prioritized process wont be able to feed it in time, causing a buffer under-run.

I added now so if the buffer have less space free, then whats we like to add to the buffer, it will split that into two, having the first write non blocking, and the second part will obviously block until timeout. This is an optimization, that we can remove.

I am all for alternative suggestions on how to solve this.

Just to make clear what i am trying to do with this PR, i give you an example (not completely sure this will happen in practice, but we hit a similar scenario):

lets say we run PPP over a modem with flow control, LWIP stack wants to write 1500bytes to the uart, but the modem is busy with other stuff, not having a timeout on uart write will make the complete LWIP stack hang until the modem says so. And when it says, the tcp timeout on what we tried to send to the modem has passed already, so that would eventally be discarded anyway.

Adding a timeout, makes LWIP notice that package was not send, and it can go on with other stuff, retrying later or dropping the package. Today with a blocking operation, it will sitt there wait 4 ever.

uart_hal_ena_intr_mask(&(uart_context[uart_num].hal), UART_INTR_TX_BRK_DONE);
UART_EXIT_CRITICAL_ISR(&(uart_context[uart_num].spinlock));
p_uart->tx_waiting_brk = 1;
p_uart->tx_brk_flg = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set p_uart->tx_brk_flg back to 0 at here, then UART_INTR_TXFIFO_EMPTY interrupt won't be able to re-enable after finishing send the TX break?

// Limit an item to a quarter of the ring buffer size, if we allow a full item size the ring buffer
// needs to completely empty before we can send another large item, which can cause stalls.
// also makes sure that we dont try to add an item larger than the ring buffer itself
acquire_size = MIN(acquire_size, xRingbufferGetMaxItemSize(p_uart_obj[uart_num]->tx_ring_buf) / 4);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
acquire_size = MIN(acquire_size, xRingbufferGetMaxItemSize(p_uart_obj[uart_num]->tx_ring_buf) / 4);
size_t max_one_time_send_size = xRingbufferGetMaxItemSize(p_uart_obj[uart_num]->tx_ring_buf) / 4;
acquire_size = MIN(acquire_size, max_one_time_send_size);

@songruo
Copy link
Collaborator

songruo commented Nov 28, 2025

@jimmyw You are right, header + data in one atomic operation is necessary if we want to add the timeout option. The PR generally LGTM, I could sync it to our internal repo, but I don't guarantee it will be accepted due to the large UART TX logic change involved. We need to do some internal discussion and heavy testings to ensure no breaking change will be introduced.

@jimmyw
Copy link
Contributor Author

jimmyw commented Nov 28, 2025

Thank you @songruo, i also did not expect adding a timeout would need this complex change. But the flip coin is that.

  1. Somewhat simplified code.
  2. Less locking and task switching overhead just using one entry per write should improve.
  3. Improved buffer utilization spitting writes that does not fit.
  4. Less stalls on big writes, as we dont need to wait for the full write to fit buffer.
  5. The tx_lock can also be removed now, if you only send from a single task. This brings the locks down from 3 to 1 writing a small buf.
  6. A timeout on send.

Think these are good reasons you could bring with you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Opened Issue is new

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants