Member

@dljsjr dljsjr commented Apr 15, 2025

Type of Change

  • New feature

Checklist

  • I have performed a self-review of my code
  • I have commented my code in hard-to-understand areas
  • I have verified my changes by testing with a device or have communicated a plan for testing (see note)
  • I am adding new behavior, such as adding a sub-driver, and have added and run new unit tests to cover the new behavior (see note)

Description of Change

This change adds preliminary support for the Sonos OAuth requirements that will take effect in August. It integrates with new Hub functionality in the upcoming release that lets the Sonos driver get an OAuth token from the mobile app.

Note

There are a few small TODOs here that depend on the shape of the errors. We didn't have the ability to readily trigger the failure case for this, and we need to be able to do that to finish those TODOs.

We have a way forward on this now, though, so we'll be able to fill those in shortly.

github-actions bot commented Apr 15, 2025

Channel deleted.

github-actions bot commented Apr 15, 2025

Test Results

   67 files    440 suites   0s ⏱️
2 246 tests 2 246 ✅ 0 💤 0 ❌
3 838 runs  3 838 ✅ 0 💤 0 ❌

Results for commit 346c4e5.

♻️ This comment has been updated with latest results.

github-actions bot commented Apr 15, 2025

Minimum allowed coverage is 90%

Generated by 🐒 cobertura-action against 346c4e5

@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from 90f2310 to 3e52f1a Compare April 15, 2025 23:01
end

if maybe_token then
  wss_msg_header.authorization = string.format("Bearer %s", maybe_token.access_token)
Member Author

For the WSS connection, the Bearer token is part of the JSON payload's "header", not the HTTP headers. And the authorization key must be lowercase, according to the docs.
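
A minimal sketch of the point above, assuming a hypothetical `build_wss_header` helper and placeholder `namespace`/`command` values that are not taken from this PR. It only illustrates that the token travels in the JSON payload's header object under a lowercase `authorization` key, rather than in the HTTP headers.

```lua
-- Hypothetical helper (not from the PR): builds the "header" object that is
-- sent as part of the WSS JSON payload.
local function build_wss_header(namespace, command, maybe_token)
  local wss_msg_header = { namespace = namespace, command = command }
  if maybe_token then
    -- The key is lowercase and lives in the payload, not in the HTTP headers.
    wss_msg_header.authorization = string.format("Bearer %s", maybe_token.access_token)
  end
  return wss_msg_header
end

-- Placeholder values for illustration only.
local header = build_wss_header("groups", "subscribe", { access_token = "example-token" })
```

Without a token the header is built as before and simply omits the `authorization` key.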

local start_success = sonos_conn:start()
if start_success then return end
if sonos_conn.driver.waiting_for_token and token_receive_handle then
  local token, channel_error = token_receive_handle:receive()
Member Author

@dljsjr dljsjr Apr 15, 2025

I don't know that we actually want to handle asking for the token in a loop; that said, I believe that this won't actually loop.

get_oauth_token() calls through to security.get_sonos_oauth_token(), which we have designed to be completely out-of-band async. So we fire-and-forget, with the return value either being nothing or an error due to perms/API compat.

The token_receive_handle is half of a cosock channel. We haven't put a timeout on it, and it's on a non-main thread, so it should just yield forever until it gets something. And that "something" should only come through if the send half of the channel gets dropped, or if the token arrives on the environment info update in the augmented store.

Contributor

I'll have to think about this, but I don't know if no timeout is truly what we want. I could see a very large timeout, but at the same time I understand the potential load caused by thousands of hubs repeatedly checking for a token when the user doesn't link their account.

Contributor

IMO we should have a timeout on this receive to avoid locking up entirely.

Contributor

Coming back to this, I agree with Carter: I think timeout -> retry is a better pattern than blocking forever.
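
A rough sketch of the timeout -> retry pattern discussed in this thread. It uses a stand-in receiver rather than a real cosock channel. The `nil, "timeout"` return convention, the `wait_for_token` helper, and the attempt limit are assumptions for illustration, not the driver's actual behavior.

```lua
-- Stand-in for the receive half of a channel (assumed semantics): `receive`
-- returns `nil, "timeout"` until a value becomes available.
local function make_fake_receiver(ready_after_attempts, value)
  local attempts = 0
  return {
    receive = function(self)
      attempts = attempts + 1
      if attempts < ready_after_attempts then
        return nil, "timeout"
      end
      return value
    end,
  }
end

-- Hypothetical helper: waits for a token, retrying after each timeout, and
-- gives up after `max_attempts` instead of blocking forever.
local function wait_for_token(receiver, max_attempts)
  for attempt = 1, max_attempts do
    local token, err = receiver:receive()
    if token then
      return token, nil, attempt
    end
    if err ~= "timeout" then
      -- Any other error (for example, the send half was dropped) is final.
      return nil, err, attempt
    end
    -- A real driver could back off or re-request the token here.
  end
  return nil, "gave up waiting for token", max_attempts
end
```

Bounding the number of attempts, or spacing them out with a long timeout, is one way to address the load concern raised above while still avoiding an indefinite block.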


self:refresh_subscriptions(reply_tx)

local reply = reply_rx:receive()
Member Author

From the docs, the WSS connection itself won't actually be refused if we're missing credentials. Instead, the first command we try to send (which is a subscribe command in our case) will let us know if the auth fails.

So we've added this reply channel API to be able to make the initial subscription establishment in the :start() method wait until we know that the subscription is allowed.

end)
end

local startup_state_received = false
Member Author

We want all device activity to be more or less frozen until we get the start up state/initial contents of the augmented data store, so we have to queue the devices up.

if not startup_state_received then
  log.warn(string.format("startup state not yet received, delaying init for %s", (device and device.label or "<unknown device>")))
  if device and device.id then
    devices_waiting_for_startup_state[device.id] = device
Member Author

Here's said queue.
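
The queueing idea can be sketched in isolation as below. This is a simplified illustration, not the driver's code: `initialize_device`, `handle_startup_state_received`, and the `initialized` table are hypothetical stand-ins, and only the two variable names `startup_state_received` and `devices_waiting_for_startup_state` come from the diff.

```lua
local startup_state_received = false
local devices_waiting_for_startup_state = {}
local initialized = {} -- illustration only: records which devices were initialized

-- Defers initialization until the startup state has arrived.
local function initialize_device(device)
  if not startup_state_received then
    if device and device.id then
      devices_waiting_for_startup_state[device.id] = device
    end
    return false
  end
  initialized[device.id] = true
  return true
end

-- Called once the startup state arrives: drain the queue and initialize
-- every device that was deferred.
local function handle_startup_state_received()
  startup_state_received = true
  for id, device in pairs(devices_waiting_for_startup_state) do
    devices_waiting_for_startup_state[id] = nil
    initialize_device(device)
  end
end
```

Keying the queue by `device.id` means a device that asks to initialize twice before startup state arrives is only queued once.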

get_oauth_token_receive_handle = function(self)
  oauth_token_tx:subscribe()
end,
get_oauth_token = function(self)
Member Author

get_oauth_token will only return a token if we actually have one. Otherwise, it'll trigger the OAuth activity in the hub firmware, and immediately return with an error indicating why.

@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from 3e52f1a to 49cb2c5 Compare April 15, 2025 23:17
if decode_success then
  self.oauth.token = decoded
  self.waiting_for_token = false
  self.oauth_token_tx:send(decoded)
Member Author

This is where token receipt actually happens. Notify anyone waiting for one.

-- a device is fully initialized whether we come from fresh join or restart.
-- See: https://smartthings.atlassian.net/browse/CHAD-9683
local function _initialize_device(driver, device)
  if not startup_state_received then
Contributor

I think we also need to do an API version check. Otherwise this will be broken on old hubs. Those old integrations will fail come August (unless the person enabled unsecure integrations from Sonos), but I think we should be able to handle the driver running on an older version of the Lua libs.

@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from 9a22827 to 5fe0c4f Compare April 30, 2025 21:54
Contributor

@cjswedes cjswedes left a comment

I did my best reviewing this, but it was a difficult review given that the refactor and the new functionality are in the same change set. It would probably help me to get some high-level descriptions of the cosock tasks that are running and their lifecycles.

-- token has not expired yet
if now < expiration then
  -- token is expiring soon, so we pre-emptively refresh
  if math.abs(expiration - now) < ONE_HOUR_IN_SECONDS then
Contributor

For how long are the tokens valid?

Member Author

24 hours.
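
To make the timing concrete, here is a small sketch of the refresh decision, assuming the expiration is stored as a Unix timestamp in seconds. The `token_status` helper and its return values are hypothetical. With a 24-hour lifetime and a one-hour window, a token is used as-is for roughly the first 23 hours and refreshed pre-emptively during the last hour.

```lua
local ONE_HOUR_IN_SECONDS = 60 * 60
local TOKEN_LIFETIME_IN_SECONDS = 24 * ONE_HOUR_IN_SECONDS

-- Hypothetical helper: classifies a token by how close it is to expiring.
-- Returns "valid", "refresh_soon", or "expired".
local function token_status(now, expiration)
  if now >= expiration then
    return "expired"
  end
  if expiration - now < ONE_HOUR_IN_SECONDS then
    -- Not expired yet, but close enough that we refresh pre-emptively.
    return "refresh_soon"
  end
  return "valid"
end

-- Example: a token issued at t = 0 expires at t = 86400.
local expiration = 0 + TOKEN_LIFETIME_IN_SECONDS
```

Because the check only runs when `now < expiration`, the difference is already positive, so this sketch omits the `math.abs` seen in the diff.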

Contributor

Why not just continue with the normal error patterns that we use and are pretty universal to Lua? Although the Result type makes sense for rust devs, it will confuse most of our community, thus it seems like extra memory for no clear functionality gain.

Contributor

It does look like it's only used in the SSDP task, which in itself is difficult for anyone but a cosock maintainer to understand. I'm still not sure what extra functionality it provides, though.

Member Author

The goal was to be able to transmit errors to other locations more easily, since nil handling can be ambiguous.
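
One way to picture the ambiguity is a lookup where "found nothing" and "failed" both surface as `nil` under the usual `value, err` convention. The tiny `Result` type below is a hypothetical illustration of the idea, not the `result` package added in this PR. Its constructor and field names are assumptions.

```lua
-- Hypothetical, minimal result type (not the PR's `result` package).
local Result = {}
Result.__index = Result

function Result.ok(value)
  return setmetatable({ is_ok = true, value = value }, Result)
end

function Result.err(reason)
  return setmetatable({ is_ok = false, reason = reason }, Result)
end

-- A lookup that can succeed with a value, succeed with nothing, or fail.
local function lookup(cache, key)
  if type(cache) ~= "table" then
    return Result.err("cache unavailable")
  end
  return Result.ok(cache[key]) -- the value may legitimately be nil
end

local hit = lookup({ a = 1 }, "a")  -- ok, value 1
local miss = lookup({}, "a")        -- ok, value nil
local failed = lookup(nil, "a")     -- not ok, carries a reason
```

The trade-off raised in the review still applies: each result allocates a table, and the shape is less familiar to Lua developers than `nil, err`.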

Comment on lines -25 to -35
-- SSDP Messages use the HTTP/1.1 Header Field rules described in RFC 2616, 4.2: https://datatracker.ietf.org/doc/html/rfc2616#section-4.2
-- This pattern extracts the Key/Value pairs in to a Lua table via the two capture groups.
-- The key capture group is composed entirely of a negating matcher to exclude illegal characters, ending at the `:`.
-- The RFC states that after the colon there may be any arbitrary amount of leading space between the colon
-- and the value, and that the value shouldn't have any trailing whitespace, so we exclude those as well.
-- The original Luncheon implementation of this Lua Pattern used iteration and detected the `;` separator
-- that indicates key/value parameters, however, we don't make that distinction here and instead leave parsing
-- values with parameters to the consumers of the output of this function.
local k, v = string.match(l, '([^%c()<>@,;:\\"/%[%]?={} \t]+):%s*(.-)%s*$')
if k == nil or k == "" then
return nil, string.format("Couldn't parse header/value pair for line %q", l)
Contributor

Where is this handled now?

Member Author

Contributor

FWIW this pattern is slightly different than what luncheon is doing for header kv extraction with a regex:
https://github.ecodesamsung.com/iot-hub/scripting-engine/blob/9cbe65c375a2dda2b1b7d4c1ab1beb0a1d21e061/lua_libs/luncheon/headers.lua#L79-L91

Member Author

The one in Luncheon is "more correct". I wrote that one after I wrote this one. Figured it was about time to cut over.
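
For readers who want to see what the removed pattern does, the sketch below wraps it in a hypothetical `parse_header_line` function and runs it on made-up sample lines. It shows the pattern quoted in this thread, not Luncheon's implementation, which may behave differently.

```lua
-- The Lua pattern quoted in the removed lines above: the key is everything up
-- to the colon (excluding illegal characters), and the value is trimmed of
-- leading and trailing whitespace.
local HEADER_PATTERN = '([^%c()<>@,;:\\"/%[%]?={} \t]+):%s*(.-)%s*$'

-- Hypothetical wrapper for illustration.
local function parse_header_line(l)
  local k, v = string.match(l, HEADER_PATTERN)
  if k == nil or k == "" then
    return nil, string.format("Couldn't parse header/value pair for line %q", l)
  end
  return k, v
end

-- Made-up sample lines: a header with padding around the value, and a status
-- line with no colon at all.
local key, value = parse_header_line("LOCATION:   http://192.168.1.2:1400/xml/device_description.xml  ")
local bad_key, parse_err = parse_header_line("HTTP/1.1 200 OK")
```

Only the first colon splits the line, so colons inside the value (as in the URL) are preserved, and values with `;` parameters are returned whole for the caller to parse.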

string.format("setwaker: SSDP search can only wake on receive readiness, got unsupported wake kind: %s.", kind))

assert(self.waker_ref == nil or waker == nil,
"Waker already set, cannot await SSDP serach instance from multiple locations.")
Contributor

nit:

Suggested change
"Waker already set, cannot await SSDP serach instance from multiple locations.")
"Waker already set, cannot await SSDP search instance from multiple locations.")

if body.errorCode == "ERROR_NOT_AUTHORIZED" then
  local household_id, player_id = driver.sonos:get_player_for_device(device)
  device.log.warn(string.format("WebSocket connection no longer authorized, disconnecting"))
  local security_result, security_err = driver:request_oauth_token()
Contributor

I think the reconnect loop will do this too, so it seems like it isn't needed here.

@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch 2 times, most recently from c5d950c to f2db230 Compare May 27, 2025 13:45
@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch 2 times, most recently from 10e4b21 to a92e9f3 Compare May 27, 2025 22:15
dljsjr added 5 commits May 27, 2025 17:22
- Move `SonosState` to its own package
- Move `SonosDriver` to its own package
- Cleanup type hints/doc comments
- Vendor incubating cosock `bus` and `stream` impls to deal with
  possible bugs
- Update Lunchbox to latest vendored version with fixes from Hue
- Add new `result` package for handling result objects to avoid nil
  ambiguity during propagation of errors
- Move a bunch of global state in to the REST API Modules
- Allow for more than one API Key to be injected
- Factor out HTTP Header creation to a function
- Refactor player "addressing" in the Websocket Router and Connection
  abstraction
The previous Sonos state handling was massively spaghettified.

This isn't much better... but it *is* better.
@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from a92e9f3 to 9da491c Compare May 27, 2025 22:22
if success then
return
local auth_success, api_key_or_err = driver:check_auth(info)
if not auth_success then
Contributor

It seems weird to have these two checks one after another, one a "not" and the other an == false.

Member Author

The reason it's structured that way is because either the auth check can fail, which is a false, or the check_auth operation can error, which returns nil, error. In both cases, we want to call device:offline(), but in the failed-but-not-error case, we want to do the additional work in the == false branch.
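
The three-way outcome described above can be sketched as follows. The `check_auth` stub, the `handle_auth` wrapper, and the action names are hypothetical and only illustrate why `not auth_success` and `auth_success == false` are distinct tests in Lua.

```lua
-- Hypothetical stand-in: returns true on success, false when the auth check
-- fails, or nil plus an error when the check itself could not be performed.
local function check_auth(info)
  if info == nil then
    return nil, "no discovery info"
  end
  if info.token == nil then
    return false, "not authorized"
  end
  return true, "api-key"
end

-- Records which steps would run for each outcome.
local function handle_auth(info)
  local actions = {}
  local auth_success, api_key_or_err = check_auth(info)
  if not auth_success then
    -- Runs for both `false` (check failed) and `nil` (check errored).
    actions[#actions + 1] = "offline"
    if auth_success == false then
      -- Runs only for the failed-but-not-error case.
      actions[#actions + 1] = "additional_work"
    end
  end
  return actions, api_key_or_err
end
```

Since `nil == false` evaluates to `false` in Lua, the inner branch is skipped when the check errors.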


function SonosDriver:handle_startup_state_received()
  self:start_ssdp_event_task()
  self:notify_augmented_data_changed "snapshot"
Contributor

I don't particularly like this syntax, and we don't use it elsewhere, so I think we should change it to the more standard function call syntax.

Member Author

The neovim version of luals uses a different formatter than the VS Code one, and it seems to prefer this for some reason. I tried to clean up all the places it does this.

@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from 9da491c to 346c4e5 Compare May 28, 2025 02:10
Member Author

dljsjr commented May 28, 2025

With approval from @varzac:

There are QA routes that require the driver to be on a SmartThings-owned channel for testing to move forward. Alpha suffices for this, so we're going to merge this now, but I'll continue monitoring for further review and make changes/fixes before we go to Beta.

@dljsjr dljsjr merged commit f33f6dd into main May 28, 2025
11 checks passed
@dljsjr dljsjr deleted the feat/sonos-oauth-support branch May 28, 2025 02:36

4 participants