-
Notifications
You must be signed in to change notification settings - Fork 199
shelldriver: fixes and improvements #945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
e57b1d8
1209a65
e228dce
a652443
65d0269
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -110,6 +110,35 @@ def _run(self, cmd, *, timeout=30.0, codec="utf-8", decodeerrors="strict"): | |
def run(self, cmd, timeout=30.0, codec="utf-8", decodeerrors="strict"): | ||
return self._run(cmd, timeout=timeout, codec=codec, decodeerrors=decodeerrors) | ||
|
||
def _raw_run(self, cmd, timeout=2, codec="utf-8", decodeerrors="strict"): | ||
"""Runs command without run() wrapper as it's not yet available""" | ||
marker = gen_marker() | ||
self.console.sendline(f"echo {marker[:4]}; {cmd} ; echo {marker[4:]}") | ||
_, _, match, _ = self.console.expect( | ||
rf"{marker[:4]}(.*){marker[4:]}\s+{self.prompt}", | ||
timeout=timeout | ||
) | ||
data = self.re_vt100.sub('', match.group(1).decode(codec, decodeerrors)).split('\r\n') | ||
if data and not data[-1]: | ||
del data[-1] | ||
return data | ||
|
||
def _silence_kernel(self, timeout, kernel_log_level=1, codec="utf-8", decodeerrors="strict"): | ||
"""Tries to set kernel log level to KERN_ALERT log level to make it less chatty.""" | ||
|
||
while True: | ||
self.console.sendline(f"dmesg -n {kernel_log_level}") | ||
try: | ||
data = self._raw_run("cat /proc/sys/kernel/printk") | ||
if data[2] and data[2].startswith(f'{kernel_log_level}\t'): | ||
return | ||
|
||
except TIMEOUT: | ||
pass | ||
|
||
if timeout.expired: | ||
raise TIMEOUT(f"Timeout of {self.login_timeout} seconds exceeded during waiting for kernel silencer") # pylint: disable=line-too-long | ||
|
||
Comment on lines
+113
to
+141
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure we want more retry code for certain devices behaving strangely. Shouldn't these devices wait until input can be handled reliably before starting login/shell? From the commit message:
Could There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ok, fair enough.
Sure, if you have complete control over the build/settings of bootloader, kernel, drivers and operating system, then this is certainly doable. For example, in OpenWrt project I'm testing 3 releases, this means 3-4 different kernel versions, 5.10, 5.15, 6.1 and 6.6. I've already learned during the years, that each of this kernel versions changed the behavior of the device, driver probing/loading and thus influence the serial console behavior. I don't want to handle this differences manually, thus I'm silencing the kernel ASAP, then continue bringing the device into SSH over LAN network and profit. On some slow devices there might be 10s silence on the console and then ath11k is being probed and if you're lucky enough, exactly at the point when you're trying to detect LAN network availability of the device. In some builds, there are strange 3rd party applications which are not afraid to output debug logging into
Its quite some time, but I guess, that I've tried all available features/config options, indeed. I'll rebase, retest and see how it goes today. |
||
@step() | ||
def _await_login(self): | ||
"""Awaits the login prompt and logs the user in""" | ||
|
@@ -136,14 +165,14 @@ def _await_login(self): | |
) | ||
|
||
if index == 0: | ||
if did_login and not did_silence_kernel: | ||
if not did_silence_kernel: | ||
ynezz marked this conversation as resolved.
Show resolved
Hide resolved
|
||
# Silence the kernel and wait for another prompt | ||
self.console.sendline("dmesg -n 1") | ||
self._silence_kernel(timeout) | ||
did_silence_kernel = True | ||
else: | ||
# we got a prompt. no need for any further action to | ||
# activate this driver. | ||
self.status = 1 | ||
self._status = 1 | ||
Comment on lines
-146
to
+175
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's already fixed with 8bc7858. |
||
break | ||
|
||
elif index == 1: | ||
|
@@ -177,9 +206,10 @@ def _await_login(self): | |
if timeout.expired: | ||
raise TIMEOUT(f"Timeout of {self.login_timeout} seconds exceeded during waiting for login") # pylint: disable=line-too-long | ||
|
||
if self.post_login_settle_time > 0: | ||
self.console.settle(self.post_login_settle_time, timeout=timeout.remaining) | ||
ynezz marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
if did_login: | ||
if self.post_login_settle_time > 0: | ||
self.console.settle(self.post_login_settle_time, timeout=timeout.remaining) | ||
self._check_prompt() | ||
|
||
@step() | ||
|
@@ -206,11 +236,24 @@ def _check_prompt(self): | |
self._status = 0 | ||
raise | ||
|
||
def _inject_run(self): | ||
self.console.sendline( | ||
'''run() { echo -n "$MARKER"; sh -c "$@"; echo "$MARKER $?"; }''' | ||
) | ||
self.console.expect(self.prompt) | ||
def _inject_run(self, inject_timeout=10.0): | ||
"""Tries to inject run() command wrapper into shell on the console.""" | ||
timeout = Timeout(inject_timeout) | ||
|
||
while True: | ||
self.console.sendline( | ||
'''run() { echo -n "$MARKER"; sh -c "$@"; echo "$MARKER $?"; }''' | ||
) | ||
self.console.expect(self.prompt) | ||
try: | ||
data = self._raw_run("type run") | ||
if data[2] and "function" in data[2]: | ||
return | ||
except TIMEOUT: | ||
pass | ||
|
||
if timeout.expired: | ||
raise TIMEOUT(f"Timeout of {inject_timeout} seconds exceeded during waiting for run injection") # pylint: disable=line-too-long | ||
Comment on lines
-209
to
+256
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If your serial console does not work reliably (even after specifying a certain settle time), I don't think retrying is the correct approach. Even if this works after the nth try, too many other things will fail in unexpected ways later. Better make your serial console work reliably or prefer the SSHDriver in these situations. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ok, fair enough, but my goal is to make it generic and reusable across different use cases, instead of fiddling with variable settle time, fighting crazy applications logging into
Well, this approach works reliably for over than 2 years, with a mix of dozen different devices, bootloaders, kernels/drivers and operating systems. So the idea here is to get as fast (and reliably) as possible into the SSH and profit.
Indeed, its the only reliable solution if there is no dedicated serial console available. |
||
|
||
@step(args=['keyfile_path']) | ||
def _put_ssh_key(self, keyfile_path): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have changed in 6743c69