Skip to content

Fix gateway bind error handling#137

Merged
rexlunae merged 14 commits intomainfrom
fix/gateway-bind-error-handling
Apr 5, 2026
Merged

Fix gateway bind error handling#137
rexlunae merged 14 commits intomainfrom
fix/gateway-bind-error-handling

Conversation

@rexlunae
Copy link
Copy Markdown
Owner

@rexlunae rexlunae commented Mar 23, 2026

  • Merge origin/main to get latest gateway transport refactoring
  • Replace transport.rs with main's version (PeerInfo, Transport traits)
  • Use origin/main's ssh.rs (StdioTransport, SshConfig, SshTransport)
  • Fix gateway/mod.rs: remove redundant test-bind block; keep success log after TcpListener::bind
  • Export StdioTransport from gateway module for CLI use
  • Update transport re-exports to use new PeerInfo/TransportWriter types
  • Fix desktop: upgrade dioxus 0.6→0.7 to match dioxus-bulma 0.7.2
  • Fix desktop: use BulmaColor:: instead of Color::
  • Fix desktop: use BulmaSize::Small instead of Size::Small
  • Fix desktop: fix is_collapsed borrow issue in tool_call.rs
  • Fix desktop/client.rs: use correct ClientFrame/ClientPayload variants
  • Fix ssh.rs: add missing debug macro import (--features full)
  • Fix ssh.rs: add missing Path type import (--features full)
  • Fix ssh.rs: add missing TransportAcceptor trait import (--features full)
  • Fix desktop/client.rs: connect_async(&url) → connect_async(url.as_str())
  • Fix state.rs: add PartialEq to ChatMessage, ToolCallInfo, ThreadInfo
  • Fix pairing.rs: Field { label: "..." } → FieldLabel { "..." }
  • Fix pairing.rs: PngEncoder::encode() → write_image() for image 0.25
  • Fix hatching.rs: pre-compute step/disabled values to fix borrow conflicts
  • Fix chat.rs: remove invalid Field_addons, use Field { addons: true } + Control { expanded: true }
  • Fix desktop/client.rs: ThreadCreate label Option → String conversion
  • Fix desktop/client.rs: ThreadInfoDto label/status type conversions
  • Fix components/mod.rs: export HatchingResult for type inference in app.rs
  • Fix E0603: re-export GatewayCommand from gateway/mod.rs (was private)
  • Fix E0603: re-export generate_qr_code from components/mod.rs (was private)
  • Fix E0282: add explicit HatchingResult type annotation in app.rs closure
  • Fix E0382: clone props.public_key before handle_copy closure in pairing.rs
  • Fix future-not-Send: use dioxus spawn() instead of tokio::spawn() in pairing.rs
  • CI: add libgtk-3-dev/libwebkit2gtk-4.1-dev install step to integration-tests and e2e-tests jobs
  • CI: add libxdo-dev to integration-tests and e2e-tests jobs (needed for linking rustyclaw-desktop)

@rexlunae rexlunae self-assigned this Mar 23, 2026
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 6 additional findings in Devin Review.

Open in Devin Review

Comment thread crates/rustyclaw-core/src/gateway/mod.rs Outdated
Comment on lines +44 to +45
let client_guard = client.lock().await;
if let Some(event) = client_guard.recv().await {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Outer mutex held during blocking recv() causes deadlock when sending commands

The event loop in app.rs:44 acquires the outer Arc<Mutex<GatewayClient>> lock and then calls client_guard.recv().await, which blocks indefinitely waiting for the next server event while holding the outer lock. When the user submits a message (app.rs:66), on_submit tries to acquire the same outer lock to call chat(), but it is blocked because the event loop holds it. Since the server won't send events until a client message arrives, this creates a deadlock: the event loop waits for a server event, the submit handler waits for the lock, and the server waits for a client message. The send() and recv() methods on GatewayClient use independent internal channels (cmd_tx vs event_rx) and don't need the outer mutex — the fix would be to either split the client into separate sender/receiver handles or remove the outer Mutex.

Prompt for agents
The root cause is that GatewayClient is wrapped in Arc<Mutex<...>> (app.rs:19), but the event loop holds this mutex lock during the blocking recv() call (app.rs:44-45), preventing on_submit (app.rs:66) from acquiring it to send commands.

Fix approach: Refactor GatewayClient so that send() and recv() don't require &mut self or exclusive access. Since internally they already use separate channels (cmd_tx is an mpsc::Sender which is Clone, and event_rx is already behind its own Arc<Mutex>), you can:

1. In crates/rustyclaw-desktop/src/gateway/client.rs, make cmd_tx a tokio::sync::mpsc::Sender (already is) and remove the outer Mutex<GatewayClient> requirement.
2. In crates/rustyclaw-desktop/src/app.rs line 19, change the type from Signal<Option<Arc<Mutex<GatewayClient>>>> to Signal<Option<Arc<GatewayClient>>>.
3. In the event loop (app.rs:42-51), call client.recv().await directly without locking.
4. In on_submit (app.rs:64-71) and other handlers, call client.chat(message).await directly without locking.
5. This works because send() uses cmd_tx (cloneable Sender) and recv() uses event_rx (already behind its own internal Mutex).
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

rexlunae and others added 2 commits March 23, 2026 05:28
- Add rustyclaw-desktop to workspace
- Dioxus + dioxus-bulma UI framework
- Gateway WebSocket client with bincode/JSON support
- Components: Chat, Sidebar, MessageBubble, ToolCallPanel
- Hatching dialog for first-run identity setup
- Pairing dialog with QR code generation
- App state management (messages, threads, connection status)

Build requires GTK dev libraries:
  Debian/Ubuntu: libgtk-3-dev libwebkit2gtk-4.0-dev
  Fedora: gtk3-devel webkit2gtk3-devel

Part of v0.4.0 roadmap: desktop client initiative.
- Test bind early before logging success message
- Return proper error instead of continuing with broken state
- Prevents misleading 'Gateway listening' logs when bind fails

Fixes #136
@rexlunae rexlunae force-pushed the fix/gateway-bind-error-handling branch from be8c8f9 to 09566ed Compare March 23, 2026 05:29
devin-ai-integration[bot]

This comment was marked as resolved.

- Exclude rustyclaw-desktop from cross-compilation targets (ARM/ARMv7)
- Install GTK system dependencies for native Linux builds
- Cross-compilation can't easily handle GTK/webkit system libraries
- Desktop client still builds fine on native Linux, macOS, Windows

Fixes CI failures with glib-2.0 missing on ARM cross-compilation.
--exclude only works with --workspace flag, but cross doesn't use workspace by default.
Use explicit --package flags instead to build only core packages on cross-compilation targets.

This should resolve the 'error: --exclude can only be used together with --workspace' issue.
Use 'matrix.use_cross == false' instead of '!matrix.use_cross'
which is invalid YAML syntax in GitHub Actions.
Use libwebkit2gtk-4.1-dev instead of libwebkit2gtk-4.0-dev
which doesn't exist in Ubuntu 24.04 (Noble).
Found another instance of libwebkit2gtk-4.0-dev that also needed
to be updated to libwebkit2gtk-4.1-dev for Ubuntu 24.04.
The CLI crate is named 'rustyclaw' not 'rustyclaw-cli' in Cargo.toml.
Fixed both test commands and cross-compilation check.
- Fix imports in gateway/mod.rs to match actual transport exports
- Remove non-existent SSH type exports
- Add missing uuid dependency to workspace and core
- Fix error type from RustyclawError to anyhow::Error
- These errors were introduced during merge conflict resolution

The gateway bind error fix logic is preserved and should work correctly.
devin-ai-integration[bot]

This comment was marked as resolved.

@rexlunae
Copy link
Copy Markdown
Owner Author

rexlunae commented Apr 2, 2026

@copilot Can you fix the CI errors and the merge conflicts?

Co-authored-by: rexlunae <6726134+rexlunae@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 2, 2026

@copilot Can you fix the CI errors and the merge conflicts?

Fixed in commit f8fdd00.

Merged origin/main and resolved the conflict in ssh.rs (used main's version). The CI errors were caused by:

  1. Missing StdioTransport export — added to gateway module re-exports
  2. Wrong Transport trait version — replaced transport.rs with main's version (PeerInfo/TransportWriter/WebSocketTransport design)
  3. Desktop dioxus version mismatchdioxus-bulma 0.7.2 requires dioxus 0.7; updated Cargo.toml from 0.6 to 0.7
  4. Wrong API names in desktop client — fixed Color::BulmaColor::, Size::SmallBulmaSize::Small, and corrected ClientFrameType/ClientPayload variant names (AuthAuthResponse, VaultUnlockUnlockVault, etc.)
  5. Borrow conflict in tool_call.rs — fixed is_collapsed mutable+immutable borrow
  6. Redundant test-bind block — removed the duplicate TcpListener::bind test since origin/main already handles bind failures at the initial bind site with proper error context

devin-ai-integration[bot]

This comment was marked as resolved.

@rexlunae
Copy link
Copy Markdown
Owner Author

rexlunae commented Apr 4, 2026

@copilot CI is failing please fix.

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 4, 2026

@copilot CI is failing please fix.

Fixed in commit feb315b. The CI failures were all in two areas:

rustyclaw-desktop crate (all platforms):

  • E0277: connect_async(&url)connect_async(url.as_str()) to avoid tungstenite version diamond conflict
  • E0369: Added PartialEq to ChatMessage, ToolCallInfo, and ThreadInfo in state.rs (required by dioxus Props derive)
  • E0599: Fixed Field { label: "..." }FieldLabel { "..." } (dioxus-bulma 0.7 API change)
  • E0599: Fixed PngEncoder::encode()write_image() (image 0.25 API change)
  • E0308: Fixed ThreadCreate { label } type mismatch (Option<String>String) and ThreadInfoDto field conversions
  • E0502: Fixed step borrow conflict in hatching.rs by pre-computing current_step/is_next_disabled before rsx!
  • E0282: Exported HatchingResult from components/mod.rs to fix type inference, replaced invalid Field_addons with Field { addons: true } + Control { expanded: true }

rustyclaw-core with --features full (ssh feature):

  • Added debug to tracing imports, use std::path::Path for #[cfg(feature = "ssh")] functions, and TransportAcceptor to the transport trait imports

devin-ai-integration[bot]

This comment was marked as resolved.

@rexlunae
Copy link
Copy Markdown
Owner Author

rexlunae commented Apr 4, 2026

@copilot Still failing, please fix.

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 4, 2026

@copilot Still failing, please fix.

Fixed in commit 73e60c7. The remaining failures were all in rustyclaw-desktop:

  • E0603 protocol private — re-exported GatewayCommand from gateway/mod.rs
  • E0603 pairing private — re-exported generate_qr_code from components/mod.rs; updated app.rs to use imported items instead of crate-path references
  • E0282 type annotation — added explicit HatchingResult type on the on_complete closure in app.rs
  • E0382 moved value — cloned props.public_key before the handle_copy closure in pairing.rs
  • future not Send — replaced tokio::spawn with dioxus spawn in pairing.rs (dioxus Signal is not Send)

@rexlunae
Copy link
Copy Markdown
Owner Author

rexlunae commented Apr 4, 2026

@copilot The integration tests are failing becase of this:
The system library glib-2.0 required by crate glib-sys was not found.
The file glib-2.0.pc needs to be installed and the PKG_CONFIG_PATH environment variable must contain its parent directory.
The PKG_CONFIG_PATH environment variable is not set.

HINT: if you have installed the library, try setting PKG_CONFIG_PATH to the directory containing glib-2.0.pc.

We may need to modify the test for this to work right. We'll need to be able to run the tests with the dependent libraries installed.

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 4, 2026

@copilot The integration tests are failing becase of this:
The system library glib-2.0 required by crate glib-sys was not found.
The file glib-2.0.pc needs to be installed and the PKG_CONFI...

Fixed in commit a0395ba. The integration-tests and e2e-tests jobs were missing the sudo apt-get install -y libgtk-3-dev libwebkit2gtk-4.1-dev step that the check and test jobs already had. Added it to both jobs so cargo build can resolve the glib-2.0/gio-2.0/gobject-2.0 system libraries required by the desktop crate.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 new potential issues.

View 13 additional findings in Devin Review.

Open in Devin Review

Comment thread .github/workflows/ci.yml
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 build-release CI job missing GTK/WebKit system dependencies for desktop crate

The build-release job for native Linux (x86_64-unknown-linux-gnu, use_cross: false) runs cargo build --release on the entire workspace. Since rustyclaw-desktop is now a workspace member requiring dioxus with desktop feature (which depends on GTK/WebKit), this build will fail because the job doesn't install libgtk-3-dev and libwebkit2gtk-4.1-dev. Other jobs (check, test, integration-tests, e2e-tests) correctly install these dependencies, but build-release and lint do not. The build-release job does not have continue-on-error, so it would block release artifact creation on tag pushes.

(Refers to lines 250-255)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +148 to +153
GatewayCommand::Chat { message } => ClientFrame {
frame_type: ClientFrameType::Chat,
payload: ClientPayload::Chat {
messages: vec![ChatMessage::text("user", &message)],
},
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Chat sends only the latest message instead of full conversation history

The command_to_frame function sends only vec![ChatMessage::text("user", &message)] — a single-element vector with just the latest user message. The gateway forwards these messages directly to the model provider (crates/rustyclaw-core/src/gateway/mod.rs:1304-1311). Without prior conversation history, the model has no context of the ongoing conversation, so every message is effectively a new standalone request.

The TUI client correctly maintains a conversation: Vec<ChatMessage> that accumulates all user and assistant messages, sending conversation.clone() with each request (crates/rustyclaw-tui/src/app/app.rs:444-451). The desktop client needs similar behavior — either accumulate messages locally or restructure the GatewayCommand::Chat to accept the full history.

Prompt for agents
The desktop client's command_to_frame function at crates/rustyclaw-desktop/src/gateway/client.rs:148-153 creates a Chat frame with only the latest user message. The gateway passes whatever messages it receives directly to the model provider, so the model loses all conversation context.

The TUI client (crates/rustyclaw-tui/src/app/app.rs:435-529) maintains a local conversation Vec<ChatMessage> that accumulates messages across the session, and clones the full history into each Chat frame.

To fix this:
1. Change GatewayCommand::Chat to carry a Vec<ChatMessage> instead of a single String
2. In app.rs on_submit handler, build the full message history from state.messages (converting the desktop ChatMessage structs to the core ChatMessage type) and pass it to the command
3. Also handle assistant responses: when GatewayEvent::ResponseDone arrives, the accumulated assistant text should be included in subsequent chat requests
4. The messages should include a system prompt as the first message, matching the TUI client behavior

Relevant files:
- crates/rustyclaw-desktop/src/gateway/protocol.rs (GatewayCommand::Chat definition)
- crates/rustyclaw-desktop/src/gateway/client.rs (command_to_frame function)
- crates/rustyclaw-desktop/src/app.rs (on_submit handler)
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@rexlunae
Copy link
Copy Markdown
Owner Author

rexlunae commented Apr 4, 2026

@copilot Two CI pipelines are still failing. If you need to modify the pipelines, you can, if you need to add feature gates to avoid causing errors, that's fine too.

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 4, 2026

@copilot Two CI pipelines are still failing. If you need to modify the pipelines, you can, if you need to add feature gates to avoid causing errors, that's fine too.

Fixed in commit 6c30aa7. Both failures were linker errors: rust-lld: error: unable to find library -lxdo. Added libxdo-dev to the apt-get install step in both integration-tests and e2e-tests jobs — this library is required when linking rustyclaw-desktop but wasn't needed for cargo check (which skips linking).

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 14 additional findings in Devin Review.

Open in Devin Review

Comment thread .github/workflows/ci.yml
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 CI lint job missing system dependencies for desktop crate, breaking clippy

The lint job runs cargo clippy --all-targets --all-features which now includes the rustyclaw-desktop crate (added to the workspace in this PR). However, unlike the test, integration-tests, and e2e-tests jobs, the lint job doesn't install libgtk-3-dev and libwebkit2gtk-4.1-dev. The desktop crate's Dioxus dependency with desktop feature requires these libraries to compile, so clippy will always fail on this workspace member. While continue-on-error: true masks the failure, it means clippy results for all crates (not just desktop) are silently lost.

(Refers to lines 188-204)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@rexlunae rexlunae merged commit 2880bc5 into main Apr 5, 2026
17 checks passed
@rexlunae rexlunae deleted the fix/gateway-bind-error-handling branch April 5, 2026 05:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants