Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async interpreter #1698

Merged
merged 97 commits into from
Sep 5, 2024
Merged

Async interpreter #1698

merged 97 commits into from
Sep 5, 2024

Conversation

Tehforsch
Copy link
Contributor

@Tehforsch Tehforsch commented Aug 12, 2024

What:
This PR implements three major new features:

  • Adds a scanner type for the rust NASL interpreter to openvasd
  • Adds storage functionality for openvasd
  • Makes the NASL interpreter async.

Jira: SC-1071

Structure of Scanner Code

This PR adds the Scanner type in the nasl_interpreter crate. (It would be nicer to have this in a different crate but I don't want to add a new one. Maybe in the future if we ever merge some of the crates, this can be moved/renamed. ) The Scanner is the single instance managing all scans during a run with Openvasd scanner type. To do so, it starts a number of RunningScans, each of which is responsible for a single Scan.
The RunningScan is responsible for computing the execution plan for this particular Scan and running it to completion. It also takes care of controlling the status of the scan and stopping it if necessary. Internally, it runs code via the ScanRunner (sorry, I ran out of names) which is responsible for performing all VTs in all stages on each target host. It is also responsible for sticking to the scheduling requirements. I suspect that this part of the code is where we should be able to easily improve performance by making the scripts run concurrently.
Finally, for a given VT and a given Host, the VT is then run to completion using the VTRunner.

Tests

Since all of our NASL test employed the Iterator impl on CodeInterpreter which cannot exist anymore (since the CodeInterpreter is now async), they all needed to be changed. I took this as an opportunity to move the tests to a more unified structure, so that it will be easier to 1. add new tests and 2. change how the tests are done internally without having to make changes all over the code base.

Here is an example how the tests work now:

let mut t = TestBuilder::default();
t.run("l = [1,2,3,4,5];");
t.ok("max_index(l);", 5);

The TestBuilder struct can be used to set up the proper Context, Storage and function Loader and has a number of utility functions to perform common tests.
Lines of code can be added to the TestBuilder one-by-one, along with certain requirements on their results (For example, the t.ok("max_index(l)", 5) above will check that the result of the max_index(l) call is NaslValue::Number(5).). At the end of the test, the TestBuilder will automatically check that all of the requirements are fulfilled. Even though lines are added individually, the TestBuilder keeps track of the state of the Context during execution, so tests that require multiple lines to set up (like the one above) will still work as intended.

To test that the correct Err variant is returned in certain lines, the check_err_matches! macro can be used as before, by passing it the TestBuilder as an argument:

check_err_matches!(
    t,
    r#"typeof(23,76);"#,
    FunctionErrorKind::TrailingPositionalArguments { .. }
);

To quickly check individual lines, check_ok still exists.

Function executor

Previously, every set of NASL functions was added to the standard library by implementing the NaslFunctionExecutor trait for a struct (that might be either a marker ZST or a proper stateful struct). This approach was difficult to incorporate with the new async interpreter where NASL functions might be both sync or async, since async functions can't really be referred to by a simple type like fn(&Context, &Register) -> NaslResult like sync functions can.
Instead, the executor part of the Context is now an actual struct: Executor. This internally tracks all the registered functions and keeps all the necessary state (such as open ssh connections) around. Moreover, it provides the same functionality that NaslFunctionExecutor provided previously (namely executing functions by name).

The Executor can store multiple FunctionSets which are collections as functions. A new FunctionSet can be created as such:

pub struct Array;
function_set! {
    Array,
    sync_stateless,
    (
        make_array,
        make_list,
        (nasl_sort, "sort"),
        keys,
        max_index,
    )
}

which automatically implements the IntoFunctionSet trait on Array and adds the given 5 functions. The functions will automatically be entered with their own name unless specified otherwise (as for nasl_sort here). This replaces the lookup functions we had to write before.

At this point, NASL functions come in four flavors:

  1. async_stateful (for async fn(&S, &Register, &Context))
  2. sync_stateful (for fn(&S, &Register, &Context))
  3. async_stateless (for async fn(&Register, &Context))
  4. sync_stateless (for fn(&Register, &Context))

At the moment, the user (i.e. the person using the function_set! macro) unfortunately needs to specify what flavour of function the set contains. (This is because the rust compiler does not know that something implementing Fn(A, B) -> Z will never also implement Fn(A, B, C) -> Z, so that implementing a common trait for all four flavours is impossible since the compiler will think that we implemented the trait multiple times for a single object. Unfortunately, specialization is not yet a thing in Rust.).

In the future, I think it should be possible to do something very cool with the Executor: Previously, we only got read access to the state (i.e. to the SSH handles), so we had to employ interior mutability in order to work with this state if we ever needed mutable access (in order to add a new SSH handle to our list of handles, for example). Since all the states are now managed by a StatefulFunctionSet, and we can tell whether a given function needs mutable access or not by its signature, I think we should be able to use an external RwLock and improve this situation with some careful logic. Concretely, this could mean having Vec<Arc<Mutex<SSHSession>>> instead of Arc<Mutex<Vec<SSHSession>>> which we previously needed, which would enable tons of concurrency that wasn't previously possible. This is pretty futuristic at this point though.

Regressions

  • Removed ability to have a generic reader in HashSumLoader. This was done for simplicity and since we currently don't use this ability, however it can be reverted if necessary by adding Send bounds on the trait object.
  • Removed the ability to have a generic NaslFunctionExecutor and replaced it with a Executor which is a concrete struct. We still have the ability to only register any subset of functions that we want, so I don't think this is a true regression, but I'm writing it down for completeness' sake.
  • Previously, the NaslFunctionExecutor trait had a nasl_fn_cache_clear method. This wasn't really called properly anywhere other than in scannerctl and wasn't handled properly in the std library either. I don't think there are currently any instances that needed this functionality, since RAII should take care of it. However, if we ever need this, it is very easy to add back in again.
  • I had to revamp how HostInfo works to clean up the logic and make it possible to deal with asynchronous scans. First of all, I noticed that HostInfo internally stored something which I believe was supposed to be some kind of percentage progress. However, this progress was never actually updated or treated in any way. I removed this percentage value for now. This has an impact on how the update of a HostInfo works when a new result comes in and scanner.do_addition() returns true. I don't know what the expected behavior is there and since all tests still pass, I will just leave this as it is for now. Secondly, the new HostInfo stores the number of leftover VTs for each target host. In this way, it can properly track which hosts still have running VTs and which ones are finished. This was necessary because we now start multiple VTs at the same time, whereas previously they were done one by one.

TODO:

  • Fill in long list of todos
  • Add documentation for public methods
  • Address every new TODO
  • Reactivate commented tests
  • Fix conventional commits
  • Check if RunningScan needs to exist. Can it be replaced by ScanRunner altogether?
  • Rebase / merge
  • Split executor.rs in nasl_builtin_utils
  • Think about stateful/stateless sets again. Can we really not get rid of this? Stateless ones really only existed to be able to merge them without running into issues and because the functions don't have a self parameter.
  • Fix HostInfo mess
  • Improve run on ScanRunner.
  • Fix unwrap call in ScanRunner::new.

Before merging/closing:

  • Add Jira ticket for the possibility of allowing Executor to take &mut State.
  • Add Jira ticket for proper concurrent execution of scripts
  • Add Jira ticket for builtin libraries that can now simply be async
  • Add Jira ticket for TODO in Vt runner

Tehforsch and others added 13 commits July 26, 2024 04:09
To have the possibility to very nasl-interpreter implementation not just
from scannerctl but also openvasd it is integrated into openvasd.

To enable nasl-interpreter configure openvasd to use the scanner type
`openvasd`.
Since we need to share within openvasd storages by Arc and usually the
storage implementations are not written as mut we should not need a
lock.

Therefore Arc implementations for storage::Retrieve, store::Dispatcher,
openvasd#storage#Storage are implemented. This allows to create and
store and Arc storage.

Additionally todo implementation of storage::Storage are done for:
- file#Storage
- redis#Storage
within openvasd.

InMemory storage of openvasd has an underlying nasl-storage that is used
for everything but Results.

To prevent trying to start a new thread while async thread is running to
block the current thread which would be required to implement the
storage::Storage traits the tokio::sync::RwLock are replaced with
Arc<std::sync::RwLock> instead.

Unfortunately this means that each lock opration should be run within a
tokio::spawn blocking to not accidentally block the main thread of
tokio.
Accidentally removed kb item checks before starting a scan.
Adds status check for naslinterpreter#Scanner and changes u32 within
models::Status to u64 to get rid of down castings.
Because the scan_interpreter previously removed a VT when it was
executed,  it essentially only processed one host as tehre we no VTs
left for any host after that.

This is fixed by caching all VTs and keep track of:

- stage index
- VT index
- host index

that way, after the first host is done, we have each Stage step stored
in a Vector that in return contains all VTs that we then can lookup
based on the indices.
To allow the nasl-interprete scanner to delete kb items as well as
results as new trait `Remover` is added and an function abstraction
within `Storage` are added. This allows the scanner implementation to
allow kb items when a scan is finished as well as the customer to delete
results based on that implementation.
After each host nasl-interpreter/scanner does delete the KB items to be
able to run various target concurrently the ContextKey::Scan is added
with an optional target. The DefaultDispatcher stores KBs with the whole
Contextkey so that they are target and scan specific while  results are
stored with ContextKey::value that just returns the ScanID.

This fixes the bug that no results did show up when a scan was finished
the results got stored as `ScanID-Target` and it does get rid of the
Scan not found bug within scan_interpreter based on the same underlying
issue.
Switches from messagepack to json to be more stable when serializing
status. Refactored tests by providing a common openvasd client within
entry.
This is meant as a way to simplify migration to an async interpreter.
A blanket implementation of the normal NaslFunctionExecuter exists for
each SyncNaslFunctionExecuter. This way, NASL functions can be migrated
to async one by one.
@github-actions github-actions bot added the minor_release creates a minor release label Aug 12, 2024
@Tehforsch Tehforsch changed the base branch from main to impl-scan-starter August 13, 2024 07:00
@github-actions github-actions bot removed the minor_release creates a minor release label Aug 13, 2024
@Tehforsch Tehforsch force-pushed the async-interpreter branch 12 times, most recently from d7cbd28 to 84bedc5 Compare August 19, 2024 08:45
@nichtsfrei
Copy link
Member

Very demure, very mindful.

Actually it is impressive, I have minor feedback for the TestBuilder as it would be nice to store somewhere where it got created and print that additionally to the nasl code line so that it is easier to figure out the position of that failing test.

I don't like the name check_ok! as a macro name as it hides its actual complexity.

And it would be great if you reenable rust/nasl-interpreter/tests/local_var.rs

The rest can be ignored.

@github-actions github-actions bot added minor_release creates a minor release and removed minor_release creates a minor release labels Sep 5, 2024
@nichtsfrei nichtsfrei added no_release Disable automatic release label creation and removed minor_release creates a minor release labels Sep 5, 2024
Previously, the error message caused in a TestBuilder simply printed the
line of code that caused the error, without any location information on the
source of the error.
Now, the error message points to the exact line of code that failed.
@Tehforsch Tehforsch merged commit 8cb02bd into main Sep 5, 2024
24 of 25 checks passed
@Tehforsch Tehforsch deleted the async-interpreter branch September 5, 2024 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no_release Disable automatic release label creation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants