Skip to content

[DashboardClientX] Apply httplib timeouts; wire setReceiveTimeout()#518

Open
Vinicius-T-Robotics wants to merge 10 commits into
UniversalRobots:masterfrom
Vinicius-T-Robotics:polyscope-x-apply-httplib-timeouts
Open

[DashboardClientX] Apply httplib timeouts; wire setReceiveTimeout()#518
Vinicius-T-Robotics wants to merge 10 commits into
UniversalRobots:masterfrom
Vinicius-T-Robotics:polyscope-x-apply-httplib-timeouts

Conversation

@Vinicius-T-Robotics

@Vinicius-T-Robotics Vinicius-T-Robotics commented Jun 11, 2026

Copy link
Copy Markdown

cpp-httplib's default read_timeout is 300 seconds. Because the PolyScope X DashboardClientImplX constructor never overrides any timeout, every dashboard call (commandPowerOff, commandStop, commandPlay, etc.) blocks for the full 300 s when the controller becomes unresponsive — e.g. a network partition or paused container in tests. Symptoms include thread-pool exhaustion in callers that dispatch dashboard commands from a bounded executor pool.

Two changes:

  1. Apply explicit set_connection_timeout(5s), set_read_timeout(1s), and set_write_timeout(5s) in the constructor. The 1 s read default matches the contract already documented on this impl: getConfiguredReceiveTimeout() returns {tv_sec = 1, tv_usec = 0}.

  2. Wire setReceiveTimeout(const timeval&) into the underlying httplib::Client instead of being a [[maybe_unused]] no-op. This makes the base-class API work as documented on the X path; callers can now tune the read timeout per their needs.

Both changes are backward-compatible: there is no public API change, and the new defaults align with what the impl already advertised via getConfiguredReceiveTimeout().


Note

Medium Risk
Changes default blocking behavior for all PolyScope X dashboard HTTP calls and power-on timing; mis-tuned timeouts could cause new failures on slow robot operations.

Overview
PolyScope X dashboard HTTP calls no longer inherit cpp-httplib’s 300 s read/write defaults. DashboardClientImplX now applies 5 s connection and 10 s read/write timeouts at construction, keeps them in recv_timeout_ / send_timeout_, and setReceiveTimeout / setSendTimeout update the underlying httplib::Client.

The base DashboardClientImpl API gains getConfiguredSendTimeout / setSendTimeout (default getter still 1 s on non-overriding impls). getConfiguredReceiveTimeout on the X path returns the stored value (10 s default) instead of a fixed 1 s.

commandPowerOn temporarily raises the read timeout to the caller’s timeout argument for the power-on PUT, then restores the previous setting (including on exceptions).

Reviewed by Cursor Bugbot for commit 8b9e0f4. Bugbot is set up for automated code reviews on this repo. Configure here.

cpp-httplib's default read_timeout is 300 seconds. Because the PolyScope X
DashboardClientImplX constructor never overrides any timeout, every dashboard
call (commandPowerOff, commandStop, commandPlay, etc.) blocks for the full
300 s when the controller becomes unresponsive — e.g. a network partition
or paused container in tests. Symptoms include thread-pool exhaustion in
callers that dispatch dashboard commands from a bounded executor pool.

Two changes:

1. Apply explicit set_connection_timeout(5s), set_read_timeout(1s), and
   set_write_timeout(5s) in the constructor. The 1 s read default matches
   the contract already documented on this impl: getConfiguredReceiveTimeout()
   returns {tv_sec = 1, tv_usec = 0}.

2. Wire setReceiveTimeout(const timeval&) into the underlying httplib::Client
   instead of being a [[maybe_unused]] no-op. This makes the base-class API
   work as documented on the X path; callers can now tune the read timeout
   per their needs.

Both changes are backward-compatible: there is no public API change, and the
new defaults align with what the impl already advertised via
getConfiguredReceiveTimeout().
@Vinicius-T-Robotics

Copy link
Copy Markdown
Author

Hi @urfeex — would you be willing to take a look at this when you have a moment? Small fix for the dashboard 300 s hang on PolyScope X. Thanks!

Comment thread src/ur/dashboard_client_implementation_x.cpp Outdated
Address review feedback: the new 1 s default applied in the constructor was
in effect during the initial openapi.json GET in connect(), unlike the prior
300 s default which gave that first read plenty of headroom.

Mirror the G5 pattern (dashboard_client_implementation_g5.cpp connect()):
save the configured timeout, temporarily bump the read timeout to 10 s for
the setup GET, then restore the configured value before returning. Also
consolidate the exit paths so the restore runs on every outcome.
Comment thread src/ur/dashboard_client_implementation_x.cpp Outdated
Comment thread src/ur/dashboard_client_implementation_x.cpp Outdated
Address review feedback: json::parse and VersionInformation::fromString can
throw on malformed or unexpected openapi.json payloads. Without exception
handling, the restore call at the end of connect() would be skipped and the
client would be left on the 10 s setup read timeout.

Wrap the body in try / catch(...) and restore the configured timeout before
rethrowing, so the configured value is honored on every exit path.
Address review feedback: setReceiveTimeout previously only updated the
httplib client, while getConfiguredReceiveTimeout always returned the
hardcoded 1 s default. connect() saves and restores via the latter, so any
custom read timeout the caller set before (or between) connects was
silently overwritten with 1 s.

Mirror the G5 pattern: store the caller-provided value in a member
recv_timeout_ (null until explicitly set), and have getConfiguredReceiveTimeout
return that value when present, falling back to the 1 s documented default
otherwise. Now connect()'s save/restore round-trips the caller's value
correctly.

@urfeex urfeex left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this. However, I think to get this to a mergeable state, we would have to properly implement the concept of timeouts. The current getConfiguredReceiveTimeout is only a placeholder. We should fill that with life and probably also add a mechanism to alter the write timeout.

power_on would have to be updated to actually use the timeout parameter and reset to the default timeout afterwards, as done in G5.

@Vinicius-T-Robotics would you be up for extending this PR accordingly?


Edit: I see that while I was writing this, you have been addressing some of the points already :-)

// partition, paused container in tests). Align with the contract documented on the base
// class: getConfiguredReceiveTimeout() returns {tv_sec = 1, tv_usec = 0} on this impl.
cli_->set_connection_timeout(std::chrono::seconds(5));
cli_->set_read_timeout(std::chrono::seconds(1));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point we should probably make the read timeout available as in the G5 implementation. the GetConfiguredReceiveTimeout method was so far only a dummy implementation. With these changes, I think it makes sense to keep a timeout member around. As far as I can see, cpp-httplib doesn't allow accessing the timeout, so we would have to keep track here.

We would also have to be sensitive about blocking calls such as power_on and brake_release that may take quite some time on real hardware. For that reason power_on has a timeout parameter already, but brake_release doesn't.

Other calls that might hit a timeout are

  • loading programs (read timeout)
  • importing (uploading) programs (write timeout)
  • Updating programs (write timeout)
  • Downloading programs (read timeout)

Using a higher default timeout (e.g. 10 seconds) might cover most of those cases already.

Addresses upstream review on the httplib timeout PR:

  - Add setSendTimeout / getConfiguredSendTimeout to the base class
    (DashboardClientImpl), mirroring the existing read-timeout API.
    getConfiguredSendTimeout has a non-pure default that returns the
    documented 1 s value, so DashboardClientImplG5 needs no change.
  - Override both in DashboardClientImplX with a send_timeout_ member
    parallel to recv_timeout_. The setter updates the underlying
    httplib::Client; the getter falls back to the 10 s constructor
    default until a caller configures it explicitly.
  - Bump the constructor's read/write timeouts from 1 s / 5 s to
    10 s / 10 s. The 1 s read default was too tight for legitimate
    blocking calls (brake_release, commandLoadProgram, program
    upload/update/download). The 10 s figure covers all non-power-on
    commands without per-method timeouts; callers can override via
    setReceiveTimeout / setSendTimeout.
  - Honor the timeout parameter on commandPowerOn using the same
    save-bump-restore pattern as connect(): cache the configured
    read timeout, set it to the caller-supplied value for the PUT,
    restore on both success and exception (catch(...) rethrow).
Comment thread src/ur/dashboard_client_implementation_x.cpp Outdated
Comment thread src/ur/dashboard_client_implementation_x.cpp
Address review feedback: connect() unconditionally set the setup-phase
read timeout to 10 s, even when getConfiguredReceiveTimeout() already
returned a larger value. That temporarily shrank the in-flight read
timeout during the openapi.json GET.

The worst case is the lazy-connect path inside commandPowerOn: power on
calls setReceiveTimeout(timeout) (typically 300 s by default), then
issues a PUT. If robot_api_version_ is still empty, put() lazy-calls
connect(), which would previously drop the 300 s back to 10 s for the
setup GET — exactly when the controller is booting and the GET most
needs the larger budget.

Treat the 10 s as a minimum-headroom for the setup GET, not a cap on
the caller's preferences: start from the configured value and only bump
up to 10 s if it is smaller. The save/restore around the GET still
honors whatever the caller had configured before connect() was entered.
Address review feedback: commandPowerOn previously mapped its
std::chrono::duration<double> argument to timeval via
duration_cast<seconds>, which truncates toward zero, and hardcoded
tv_usec = 0. A caller passing e.g. 2.5 s would silently get a 2 s
read timeout.

Cast to microseconds (the smallest unit timeval can represent) and
split into tv_sec + tv_usec so the fractional portion survives.
@Vinicius-T-Robotics

Copy link
Copy Markdown
Author

Thanks a lot for this. However, I think to get this to a mergeable state, we would have to properly implement the concept of timeouts. The current getConfiguredReceiveTimeout is only a placeholder. We should fill that with life and probably also add a mechanism to alter the write timeout.

power_on would have to be updated to actually use the timeout parameter and reset to the default timeout afterwards, as done in G5.

@Vinicius-T-Robotics would you be up for extending this PR accordingly?

Edit: I see that while I was writing this, you have been addressing some of the points already :-)

Hello, have added the write timeout as well in 241c1a2
Let me know what you think!

Comment thread src/ur/dashboard_client_implementation_x.cpp

@shafaq-trener shafaq-trener left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved with a comment

Comment thread include/ur_client_library/ur/dashboard_client_implementation_x.h Outdated
Comment thread src/ur/dashboard_client_implementation_x.cpp Outdated
The std::unique_ptr<timeval> indirection was carried over from
DashboardClientImplG5, where it composes with TCPSocket's existing
unique_ptr<timeval> member. In DashboardClientImplX it bought nothing:
no nullable semantics are needed because the constructor now installs
a 10 s default that every getter falls back to anyway.

Replace recv_timeout_ and send_timeout_ with value-typed timeval
members defaulted to {10, 0}. setReceiveTimeout / setSendTimeout
assign directly; the getters return the member. Behavior is
preserved (verified end-to-end against a stalling-server test that
times connect() at the configured value).
Reviewer feedback: the save-bump-restore in connect() mimics G5's
pattern but for the wrong reason. G5 bumps because it does an
unsolicited blocking read for the dashboard server's boot-up
message after TCPSocket::setup() — explicitly called out in the G5
source ("The first read after connection can take more time").
PolyScope X's connect() just issues an HTTP GET for /openapi.json
(~200 ms over VPN per maintainer measurement); no analogous
unsolicited read exists.

The original cursor-bot concern that motivated b1f900d was that the
1 s constructor read_timeout left openapi.json under-resourced
while in-tree callers raised the timeout only post-connect. That
concern was resolved by bumping the constructor default from 1 s to
10 s, which gives openapi.json the same headroom the example wrapper
applies manually. The save-bump-restore became dead weight: a no-op
for default callers, and it overrides explicit caller values shorter
than 10 s — paternalistic behavior the reviewer is rightly pushing
back on.

Drop the bump, the try/catch(...) rethrow guard, and the restore.
connect() returns to its pre-b1f900d shape (early returns from the
GET-and-parse pipeline). The recv_timeout_ storage stays because
commandPowerOn still uses it for its own per-call save-bump-restore,
which is a different and legitimate case (caller-supplied longer
deadline for an operation known to take longer).
// time on real hardware — brake_release, commandLoadProgram (read), commandUploadProgram
// (write), commandUpdateProgram (write), commandDownloadProgram (read) — without forcing
// each one to plumb its own timeout. commandPowerOn already accepts its own (longer)
// timeout parameter. Callers needing different limits can override via setReceiveTimeout.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Receive timeout ignores write uploads

Medium Severity

The new constructor comment tells callers to raise limits via setReceiveTimeout, but commandUploadProgram and commandUpdateProgram are bounded by httplib’s write timeout. This change adds setSendTimeout on the X implementation and sets a 10s write default, while DashboardClient still only forwards receive timeout, so tuning read timeout will not help upload/write failures.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit e84c769. Configure here.

Comment thread src/ur/dashboard_client_implementation_x.cpp Outdated
Comment thread src/ur/dashboard_client_implementation_x.cpp Outdated
Co-authored-by: Felix Exner <feex@universal-robots.com>
@codecov

codecov Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 17.14286% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.36%. Comparing base (ca25b13) to head (8b9e0f4).

Files with missing lines Patch % Lines
src/ur/dashboard_client_implementation_x.cpp 20.68% 22 Missing and 1 partial ⚠️
...lient_library/ur/dashboard_client_implementation.h 0.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #518      +/-   ##
==========================================
- Coverage   78.85%   78.36%   -0.50%     
==========================================
  Files         116      115       -1     
  Lines        6612     6637      +25     
  Branches     2920     2927       +7     
==========================================
- Hits         5214     5201      -13     
- Misses       1031     1068      +37     
- Partials      367      368       +1     
Flag Coverage Δ
check_version_ur10-3.15.8 11.33% <0.00%> (-1.58%) ⬇️
check_version_ur10e-10.11.0 11.33% <0.00%> (-0.15%) ⬇️
check_version_ur10e-5.15.2 11.33% <0.00%> (-0.47%) ⬇️
check_version_ur12e-10.12.1 11.33% <0.00%> (-0.10%) ⬇️
check_version_ur12e-5.25.1 11.33% <0.00%> (-0.47%) ⬇️
check_version_ur15-10.12.1 11.38% <0.00%> (-0.01%) ⬇️
check_version_ur15-5.25.1 11.33% <0.00%> (-0.10%) ⬇️
check_version_ur16e-10.12.1 11.33% <0.00%> (-0.10%) ⬇️
check_version_ur16e-5.25.1 11.33% <0.00%> (-0.25%) ⬇️
check_version_ur18-10.12.1 11.33% <0.00%> (-0.10%) ⬇️
check_version_ur18-5.25.1 12.11% <0.00%> (+0.48%) ⬆️
check_version_ur20-10.12.1 11.33% <0.00%> (-0.10%) ⬇️
check_version_ur20-5.25.1 11.33% <0.00%> (-0.06%) ⬇️
check_version_ur3-3.14.3 11.75% <0.00%> (-0.11%) ⬇️
check_version_ur30-10.12.1 11.33% <0.00%> (-0.15%) ⬇️
check_version_ur30-5.25.1 12.16% <0.00%> (+0.53%) ⬆️
check_version_ur3e-10.11.0 11.33% <0.00%> (-0.06%) ⬇️
check_version_ur3e-5.9.4 11.38% <0.00%> (-0.01%) ⬇️
check_version_ur5-3.15.8 10.71% <0.00%> (-1.51%) ⬇️
check_version_ur5e-10.11.0 11.38% <0.00%> (-0.01%) ⬇️
check_version_ur5e-5.12.8 11.52% <0.00%> (+0.04%) ⬆️
check_version_ur7e-10.11.0 11.33% <0.00%> (-0.06%) ⬇️
check_version_ur7e-5.22.2 11.33% <0.00%> (-0.06%) ⬇️
check_version_ur8long-10.12.1 11.33% <0.00%> (-0.10%) ⬇️
check_version_ur8long-5.25.1 11.38% <0.00%> (-0.20%) ⬇️
python_scripts 75.90% <ø> (ø)
start_ursim 82.11% <ø> (-3.09%) ⬇️
ur5-3.14.3 74.18% <0.00%> (-0.45%) ⬇️
ur5e-10.11.0 68.75% <17.14%> (-0.25%) ⬇️
ur5e-10.12.0 69.68% <17.14%> (-0.41%) ⬇️
ur5e-10.7.0 68.02% <0.00%> (-0.66%) ⬇️
ur5e-5.9.4 74.88% <0.00%> (-0.34%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

Reviewed by Cursor Bugbot for commit 8b9e0f4. Configure here.

Comment thread src/ur/dashboard_client_implementation_x.cpp

@urfeex urfeex left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a test setting timeouts to a couple of microseconds and validating that the calls then fail would be good. Apart from that, I think we're good to go.

Comment on lines +126 to +129
// If the caller has explicitly configured a receive timeout via setReceiveTimeout,
// return that. Otherwise fall back to the constructor default. See the constructor
// comment for the rationale on the 10 s default (covers brake_release / program
// load/upload/download without per-method timeouts).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If the caller has explicitly configured a receive timeout via setReceiveTimeout,
// return that. Otherwise fall back to the constructor default. See the constructor
// comment for the rationale on the 10 s default (covers brake_release / program
// load/upload/download without per-method timeouts).


timeval DashboardClientImplX::getConfiguredSendTimeout() const
{
// Mirrors getConfiguredReceiveTimeout. Default of 10 s matches the constructor.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Mirrors getConfiguredReceiveTimeout. Default of 10 s matches the constructor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants