Skip to content

Conversation

@denisonbarbosa
Copy link
Member

We've been needing those for a while now. Although they're not fully stable and automated yet, we should add the tests here to follow up with the other tasks and to discuss implementation details.

More details are explained in the individual commits.

UDENG-7424

@codecov-commenter
Copy link

codecov-commenter commented Sep 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.33%. Comparing base (868dd8a) to head (1e6b111).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #659   +/-   ##
=======================================
  Coverage   78.33%   78.33%           
=======================================
  Files          22       22           
  Lines        1274     1274           
=======================================
  Hits          998      998           
  Misses        276      276           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

browser.send_key_taps([Gdk.KEY_Return])
browser.wait_for_stable_page()

sleep(5)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we wait here or in the caller?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are sleeping at both... The sleep here is to make sure we wait for the request to fully finish before exiting it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's acceptable... But... We should never sleep this way in UI applications, but rely instead on the main loop, so can do instead something like:

def wait_for_timeout(timeout):
    loop = GLib.MainLoop()

    def on_timeout():
        loop.quit()
        return False

    GLib.timeout_add(timeout, on_timeout)
    loop.run()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like that is what the wait_for_stable_page function was supposed to be doing. Maybe we have a logic flaw somewhere?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait_for_stable_page is only waiting if there are no page updates, we can maybe tune the timeout though.

${text} = PlatformVideoInput.Read Text
${user_code} = StringUtils.FirstMatch (https://)?microsoft.com/devicelogin\n((Login code: )?([A-Z0-9]+)) ${text}

Start Process ${ENTRAID_COMMON}/BrowserWindow.py ${username}@${domain} %{E2E_PASSWORD} ${user_code} alias=RemoteAuth
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to run it with RUNNING_OFFSCREEN or we prefer to see the UI to record it, just in case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes it bugs out and types the password in the incorrect field (I couldn't understand why exactly), but it's something that we definitely do not want to record

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh, I see... One thing I wanted to add is the ability to save the page content on failure and we can do quite easily, but it was just a quick hack at the time :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would definitely be nice to have. I used the script pretty much as you wrote it, so we can definitely improve some stuff there (although I don't really have experience with the GTK flow)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this open, and see if I can handle it :)

Copy link
Collaborator

@3v1n0 3v1n0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff, I hope it's quite stable but we have a great base now

browser.send_key_taps([Gdk.KEY_Return])
browser.wait_for_stable_page()

sys.exit(0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed

Comment on lines 77 to 101
# Run Command In GNOME Terminal
# [Arguments] ${command}
# ${ampersand} Create List Shift_L 7
# Hid.Type String clear
# Hid.Keys Combo space
# Hid.Keys Combo ${ampersand}
# Hid.Keys Combo ${ampersand}
# Hid.Keys Combo space
# Hid.Type String ${command}
# Hid.Keys Combo space
# Hid.Keys Combo ${ampersand}
# Hid.Keys Combo ${ampersand}
# Hid.Keys Combo space
# Hid.Type String clear
# Hid.Keys Combo space
# Hid.Keys Combo ${ampersand}
# Hid.Keys Combo ${ampersand}
# Hid.Keys Combo space
# Hid.Type String echo cmd-finished
# # Sleep a bit to ensure everything gets typed properly
# BuiltIn.Sleep 2 # hacky
# Hid.Keys Combo Return
# # Sleep a bit to avoid matching the text on the command itself
# BuiltIn.Sleep 2 # hacky
# Match Text cmd-finished 600
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to keep this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, forgot to remove it

Comment on lines 107 to 108
BuiltIn.Sleep 2
Match Text cmd-finished 600
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't sleep already part of matching (600)? Why sleeping more?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really... We need the sleep to make sure we don't match the "cmd-finished" of the command itself

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But didn't we clear it? I'd say we should do something command & clear && wait && echo cmd-finished, so extra sleep should not be needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do clear it, but there's a sync issue between the commands. If the VM command does not run fast enough and YARF progresses to the next instruction, it can match the text before the screen gets updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, BTW please check with the new syntax and let's see if we can get such race, in case we can add the sleep back (even less than a a second though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, there isn't any other way to avoid this issue. It's the same thing that's done in the installer tests, if I'm not mistaken.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere?

@denisonbarbosa denisonbarbosa requested a review from 3v1n0 October 15, 2025 12:26
Comment on lines 22 to 34
if os.getenv("RUN_ONSCREEN") is None and "RUNNING_OFFSCREEN" not in os.environ:
os.execv(
"/usr/bin/env",
[
"/usr/bin/env",
"RUNNING_OFFSCREEN=1",
"GDK_BACKEND=x11",
"xvfb-run",
"-a",
sys.executable,
]
+ sys.argv,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved to an utility function too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could, but the code spawns a subprocess on itself, so I think it's good to have a "in your face" approach with it, wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was mostly for avoiding to copy this for each test case we'll have (vanilla broker)

Comment on lines 36 to 42
if len(sys.argv) < 4:
print("Usage: BrowserWindow.py <username> <password> <code>")
sys.exit(1)

username = sys.argv[1]
password = sys.argv[2]
device_code = sys.argv[3]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, but I feel you can have this with 4 lines of argparse

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it, but it's not worth adding the dependency for something as simple as this, especially for a script that is not intended to be run by people.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI argparse is part of the standard library

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it definitely shows that it's been a while since I've written Python 😂

# Disable automatic updates and remove some unnecessary packages
- apt-get remove -y update-manager gnome-initial-setup
# We don't need Firefox for the tests, so we can remove it to save some space
- snap remove firefox --purge
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel you can save even more by dropping the whole gnome-* runtimes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, IMO. The cached image is already small enough. If we need more space in the future, we could that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check it out, since these are quite big eh:

-rw------- 2 root root 517M set 27 02:27 /var/lib/snapd/snaps/gnome-42-2204_226.snap
-rw------- 2 root root 619M set  9 10:23 /var/lib/snapd/snaps/gnome-46-2404_125.snap

# We don't need Firefox for the tests, so we can remove it to save some space
- snap remove firefox --purge
# Configure SSH to allow interactive authentication
- sed -i 's/KbdInteractiveAuthentication no/KbdInteractiveAuthentication yes/' /etc/ssh/sshd_config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I'll update it (didn't even know we had a guide for cloud-init deployment)

Hid.Keys Combo Control_L Alt_L t
Match Text @ubuntu:~$ 15
Open Terminal
Run Command /usr/bin/x-terminal-emulator
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can drop /usr/bin and maybe call xdg-terminal, since we are not tergetting noble anyways here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though we are not targeting Noble (yet, at least), I think it's better to keep the tests as release-independent as possible. If we change that, we would need to have multiple versions of the resource files (and although this might eventually become the case in the future, we should avoid it for now)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you prefer, the /us/bin/ drop still stands :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@adombeck adombeck force-pushed the msentraid-e2e-tests branch from 5064490 to 42d9652 Compare October 24, 2025 11:59
Allows to debug the browser login via the webview. Replaces the
functionality of the RUN_ONSCREEN environment variable.
We were frequently seeing errors like the password being pasted into the
username field. Lets try to make this more robust by not only waiting
for the page to stop loading but also for specific text to be visible.
@adombeck adombeck force-pushed the msentraid-e2e-tests branch from 42d9652 to e6d0d4b Compare October 24, 2025 12:02


Update System Time
Run Command In Terminal sudo service chronyd restart && sudo chronyc waitsync
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is chrony even installed? It's not installed by default on Ubuntu AFAICT, and I don't see it being installed explicitly, so if it installed I wonder which package pulls it in as a dependency.

Can't we just use systemd-timesyncd?

Suggested change
Run Command In Terminal sudo service chronyd restart && sudo chronyc waitsync
Run Command In Terminal sudo systemctl restart systemd-timesyncd.service

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an Ubuntu server image, so it comes with chrony rather than systemd-timesyncd and they are mutually exclusive

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I didn't know that. That seems like the right approach then. Let's just use the systemd executable instead of the sys V init one to restart the service:

Suggested change
Run Command In Terminal sudo service chronyd restart && sudo chronyc waitsync
Run Command In Terminal sudo systemctl restart chronyd && sudo chronyc waitsync

def screenshot_window(window: Gtk.Window, filename: str):
# Get widget allocation (size)
alloc = window.get_allocation()
width, height = alloc.width, alloc.height
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can avoid these repetitions

Comment on lines 78 to 79
# Get widget allocation (size)
alloc = window.get_allocation()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment here is unneeded, it's more than clear.

However, maybe we can assert that window.is_drawable() and window.get_mapped()

screenshot_window(browser, os.path.join(screenshot_dir, "05-device-login-success.png"))


def screenshot_window(window: Gtk.Window, filename: str):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call it: save_window_snapshot

Comment on lines 25 to 39
if len(sys.argv) < 4:
print(f"Usage: {sys.argv[0]} <username> <password> <code> [<output-dir>]")
sys.exit(1)

username = sys.argv[1]
password = sys.argv[2]
device_code = sys.argv[3]
output_dir = sys.argv[4] if len(sys.argv) > 4 else "."
screenshot_dir = os.path.join(output_dir, "webview")
os.makedirs(screenshot_dir, exist_ok=True)

try:
login(username, password, device_code, screenshot_dir)
finally:
write_video(screenshot_dir, os.path.join(output_dir, "webview_recording.webm"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if len(sys.argv) < 4:
print(f"Usage: {sys.argv[0]} <username> <password> <code> [<output-dir>]")
sys.exit(1)
username = sys.argv[1]
password = sys.argv[2]
device_code = sys.argv[3]
output_dir = sys.argv[4] if len(sys.argv) > 4 else "."
screenshot_dir = os.path.join(output_dir, "webview")
os.makedirs(screenshot_dir, exist_ok=True)
try:
login(username, password, device_code, screenshot_dir)
finally:
write_video(screenshot_dir, os.path.join(output_dir, "webview_recording.webm"))
parser = argparse.ArgumentParser()
parser.add_argument("username")
parser.add_argument("password")
parser.add_argument("device_code")
parser.add_argument("--output-dir", required=False, default=os.path.realpath(os.curdir))
args = parser.parse_args()
screenshot_dir = os.path.join(args.output_dir, "webview")
os.makedirs(screenshot_dir, exist_ok=True)
try:
login(args.username, args.password, args.device_code, screenshot_dir)
finally:
write_video(screenshot_dir, os.path.join(args.output_dir, "webview_recording.webm"))

Comment on lines 93 to 101
def write_video(screenshot_dir: str, video_path: str):
subprocess.check_call([
"ffmpeg",
"-y",
"-framerate", "1",
"-pattern_type", "glob",
"-i", f"{screenshot_dir}/*.png",
video_path,
])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is taking the snapshots and could be fine for some basic debugging, it doesn't catch potential problems IMHO as it relies on us taking snapshots.

So, you can instead just call save_window_snapshot on window draw signal, so that you can write all the frames (or a subset of them, maybe one once second is fine), and you'll get a proper video

def on_timeout():
nonlocal poll_id
if poll_id:
GLib.source_remove(poll_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GLib.source_remove(poll_id)
GLib.source_remove(poll_id)
poll_id = 0

Comment on lines 114 to 115
poll_id = None
timeout_id = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're int values

Suggested change
poll_id = None
timeout_id = None
poll_id = 0
timeout_id = 0

"document.body.innerText.indexOf(%s) !== -1)"
) % json.dumps(text)
self.web_view.run_javascript(js, None, on_js_finished, None)
return True # keep polling until callback quits the loop
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing this here, I'd return False here and instead restart the polling (calling poll_id = GLib.timeout_add(poll_interval_ms, poll_fn) after the finish function has returned, so that there's no possibility that we're polling twice while the same async function has not returned yet.

GLib.source_remove(timeout_id)
if poll_id:
GLib.source_remove(poll_id)
loop.quit()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
loop.quit()
def on_js_finished(web_view, result, user_data):
nonlocal poll_id, timeout_id, found
try:
res = web_view.run_javascript_finish(result)
js_value = res.get_js_value()
found = bool(js_value.to_boolean())
except Exception as e:
print(f"Failed checking page JS: {e}", file=sys.stderr)
pass
if found:
if timeout_id:
GLib.source_remove(timeout_id)
timeout_id = 0
GLib.source_remove(poll_id)
poll_id = 0
loop.quit()


browser.web_view.load_uri("https://microsoft.com/devicelogin")

browser.wait_for_text_visible("Enter code to allow access")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should force the browser to use english language, otherwise all these are hard to test locally, so we should also set locale.setlocale(locale.LC_ALL, "C")

3v1n0 added 15 commits October 25, 2025 00:35
So running the browser in any setup works with the the website defaults
without making them translate
Avoid having to track the browser screenshots indexes manually
While here we should ideally just look for visible items, this is fine
for now.

However, we can make the check oneline and meanwhile we can ensure that
the value we're looking is properly parsed, so dump it as JS and parse
it back as such
We were triggering a warning there for this reason, so fix that
The main goal of this commit was to ensure that the poll function only
runs *after* the previous request has actually been completed, and this
was not the case with the current code that was polling every few
milliseconds, but without being sure that the previous JS request was
actually completed.

So now, inject the JS code after a small timeout, if that fails for an
error we just throw (exiting the loop), otherwise:
 - if the JS returns a true value we can quit the loop
 - if the JS was evaluated to false, then we can retry, again after
   that the poll timeout is over.

Use a cancellable to control the async function cancellation, as we need
to ensure that the underlying thread is stopped when cancelling, but
also we need to stop the loop and other cleanup actions
This is a generic feature that does not depend on the actual test, so
move it the main window implementation
In case of failures it's when the snapshots are important, so let's
record one during that phase
Gdk provides an optimized way to create image surfaces, so use it when
possible, although that's not always the case as pygobject bindings are
broken in older versions, so add a fallback in case.

In any case, respect the surface scaling
Use the GTask thread pools to perform the IO events on threads rather
than saving in the main thread
Draw events in gtk3 are special, since it's where widgets are also
actually drawing into the surfaces, so we should use them with care and
avoid to slow them down.

We've now also a problem with it since we're explicitly drawing the
window to take snapshots, but this triggers the "draw" signal too, and
it creates problem when waiting the page stabilization

Since we rely on it to check on UI changes, and we may need to use it in
future, it's better to abstract the signal connection so that:
 - We connect only once to it
 - Multiple handlers can be added at runtime
 - Handlers are called in idle functions instead
 - No handler is called when explicitly drawing the window for
   creating a snapshot
@3v1n0 3v1n0 force-pushed the msentraid-e2e-tests branch 2 times, most recently from fce6f2f to 8ac800c Compare October 25, 2025 05:05
3v1n0 added 3 commits October 25, 2025 07:13
Support recording the window fully instead on request only so that the
video we produce includes the whole session
While show it only if a specific option is passed
@3v1n0 3v1n0 force-pushed the msentraid-e2e-tests branch from 8ac800c to 1e6b111 Compare October 25, 2025 05:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants