
Conversation


@Sukrit5 Sukrit5 commented Apr 11, 2025

This pull request aims to add support for the Tyche analysis tool, resolving #254. When BOLERO_TYCHE=true cargo test is run, this creates a new file, tyche_test.jsonl, that can be fed into the Tyche VS Code extension, for example. I'm working with @hgoldstein95 on this.

Summary of changes made

- Modified run_with_value to return the representation even when there is no error, since test.generate_value doesn't appear to be callable from within run_tests. run_with_scope was modified similarly, for symmetry, to return a placeholder representation.
- Modified the Outcome struct to add fields for the representation, features, coverage, and the other fields Tyche needs. Also added an output_json function that writes the JSON to the file.
- Modified run_tests to set the representation and features, then write the JSON if the Tyche environment variable is set.
- Added a static variable that lets users call event_with_payload to set features, modeled on how Hypothesis does events. I used a global static because I couldn't figure out how to make an internal variable accessible to the closure passed into the for_each method; a rough usage sketch follows this list.
- Made item_path public in lib/bolero-engine/src/target_location.rs so I could use it to set the property name. Let me know if this has any negative effects, and I'll try to figure out a different approach.
- Added dependencies on serde_json and lazy_static. Again, let me know if you'd rather the serialization be done without serde_json.
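For illustration, a rough sketch of the intended usage; the exact event_with_payload signature and module path in this PR may differ from what's shown here.

use bolero::check;

#[test]
fn array_lengths() {
    check!().with_type().for_each(|xs: &Vec<u8>| {
        // Sketch only: event_with_payload's real signature/path in this
        // PR may differ.
        event_with_payload("length", xs.len().to_string());
    });
}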

Sukrit Mangla and others added 2 commits April 9, 2025 11:37

- modify outcome struct to work with tyche
- update new function to reflect new fields
- add method to output json data to file called tyche_test
- add necessary setters
- use outcome in mod.rs
- made one function public so it could work
- updated cargo.toml
- finalize static stuff and syntax cleanup
- fixed error relating to error representations
@camshaft (Owner)

Hey, it looks like a lot of this is based on #268. I'd prefer to merge those changes independently, and then we can build on top of them here. I can work on that today.

In general, I would prefer to avoid dependencies, since that makes guaranteeing an MSRV difficult. Generating JSON is also a lot less error-prone than parsing it, so it seems a bit heavyweight to pull in serde for that.

report.spawn_timer();
}
let mut outcome = outcome::Outcome::new(&self.location, start_time);
let tyche_on = env::var("BOLERO_TYCHE")
Owner

Instead of having this as a bool, I think it would be better to have it pass a path that it wants us to write to. Putting the JSON in the pwd seems a bit error-prone to me.
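A minimal sketch of that shape, where BOLERO_TYCHE carries the output path instead of a flag; tyche_output_path is a hypothetical helper, not existing bolero code.

use std::path::PathBuf;

// Hypothetical helper: interpret BOLERO_TYCHE as the path to write the
// JSONL output to; an unset variable disables the Tyche output.
fn tyche_output_path() -> Option<PathBuf> {
    std::env::var_os("BOLERO_TYCHE").map(PathBuf::from)
}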

use std::collections::HashMap;
use std::sync::Mutex;

lazy_static! {
    static ref GLOBAL_CONTEXT: Mutex<Option<HashMap<String, String>>> = Mutex::new(None);
}
Owner

I'm not sure I follow why we need a global context for this. Also keep in mind that cargo's test harness runs tests in parallel, so this will likely be corrupted by concurrent writers.

@hgoldstein95

Yes, this stood out to me too, although when Sukrit and I were going through the code it wasn't immediately clear to me what the best alternative was.

It seems like one option might be to add a with_events method to TestTarget that would allow users to pre-register a function that computes events? That isn't a very convenient API, but it would make it easier to thread the events through the system.

Alternatively we could have a for_all_with_events method that passes an event callback into the for_all body?

Both of these approaches feel a bit over-complicated, so I wanted to see if you had any better ideas.
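For concreteness, a rough sketch of the second option; EventSink and for_all_with_events are hypothetical names, not existing bolero API.

// Hypothetical sketch of the for_all_with_events shape; none of these
// names exist in bolero today.
struct EventSink {
    events: Vec<(String, String)>,
}

impl EventSink {
    fn event(&mut self, key: &str, value: impl std::fmt::Display) {
        self.events.push((key.to_string(), value.to_string()));
    }
}

// Usage might look like:
// check!().with_type().for_all_with_events(|xs: &Vec<u8>, sink: &mut EventSink| {
//     sink.event("length", xs.len());
// });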

@camshaft camshaft (Owner) Apr 15, 2025

I guess taking a step back: what are we aiming to accomplish with this? Regardless of the implementation, I'm not sure what we're trying to solve with the context :)

@hgoldstein95

Good point :)

The goal here is to allow users to emit data during their test that Tyche can plot. For example, in a test that takes arrays as input, we might want to keep track of their lengths or averages. In Python's Hypothesis framework, we do this with "events." The user writes

event("length", len(xs))

in the body of their property, and that gets added to the Tyche JSON output. (A similar thing exists in QuickCheck; they call them "labels.")

So at a high level, that's the goal: allow the user to specify key-value pairs that should be added to the Tyche output.

Does that make sense?

Owner

Ok, that makes sense. I'm wondering if it would be better to integrate with something like tracing or metrics, since that's essentially what this seems to be doing. I'm leaning toward separating this part of the integration into a different changeset, though, and just focusing on the core integration for now.
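For reference, a minimal sketch of what the tracing-based route could look like from the user's side; the event name and field are illustrative, and the collecting subscriber is not shown.

use tracing::info;

// Sketch: the user emits an ordinary tracing event with structured fields,
// and a bolero-installed subscriber could forward them into the Tyche output.
fn check_lengths(xs: &[u8]) {
    info!(length = xs.len(), "tyche.event");
    // ... property assertions ...
}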

@hgoldstein95

I like that idea. Tyche can be useful without the features, so it wouldn't hurt to ship the initial integration first.

Owner

This also gets at a feature I've been wanting to add for a while, which is getting rich statistics out of a bolero harness in order to identify performance issues. For example, plotting the relationship of input size to execution time could highlight O(N^2) behavior where we expected O(N).

Comment on lines 200 to 202
let value = test.generate_value(&mut input);
let representation = format!("{:?}", value);
Ok((is_valid, representation))
Owner

I'd prefer not to incur the cost of serializing this when the log is disabled.
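One possible shape for that, sketched against the snippet above; tyche_enabled is a hypothetical flag, and the return type becomes Option<String>.

// Sketch: only pay for Debug formatting when the Tyche output is enabled
// (tyche_enabled is hypothetical).
let value = test.generate_value(&mut input);
let representation = tyche_enabled.then(|| format!("{:?}", value));
Ok((is_valid, representation))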

Comment on lines 321 to 323
// For scope tests, we don't have a good way to get a representation
// so we'll use a placeholder
(r, "scope test".to_string())
Owner

Rather than having placeholder text, I think it might be better to have no output at all? Longer term, I think it would be nice to hook into the any() calls with a logging driver. I can open an issue for that.
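A minimal sketch of dropping the placeholder, assuming the representation becomes an Option<String>:

// Sketch: emit no representation for scope tests; the JSON writer can
// then simply omit the field.
(r, None::<String>)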

pub fn output_json(&self) -> std::io::Result<()> {
    let status = match &self.exit_reason {
        Some(ExitReason::TestFailure) => "failed",
        Some(ExitReason::MaxDurationExceeded { .. }) => "timed out",
Owner

This isn't a timeout. It just indicates that we stopped iterating because the configured test time expired. Currently we default to either 1000 items or 1-second tests.
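A sketch of the relabeling this suggests; mapping a completed budget to "passed" is an assumption about Tyche's status vocabulary, not confirmed.

// Sketch: MaxDurationExceeded means the configured test budget expired,
// not a timeout; "passed" here is an assumed Tyche status label.
Some(ExitReason::MaxDurationExceeded { .. }) => "passed",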

Comment on lines 157 to 163
let how_generated = if self.corpus_input > 0 {
    "loaded from corpus"
} else if self.rng_input > 0 {
    "generated during unknown phase"
} else {
    "unknown"
};
Owner

I don't follow what we're trying to say here. Note that this Outcome struct is an aggregate, so those counters are persisted across inputs.

@camshaft (Owner)

The other thing to point out here: this change would only enable RNG and corpus-replay tests to emit this information. That means none of the fuzzer engines would be included, and I think including them is valuable, especially when comparing how effective each fuzzer is as opposed to pure RNG.

@Sukrit5 Sukrit5 (Author) commented Apr 18, 2025

Ok, I think I've made some commits that address these issues. I haven't yet added functionality for the fuzzer engines, since I think that will take some more time, but I've made changes that address every other comment.

@Sukrit5 Sukrit5 (Author) commented May 9, 2025

Just checking in on this: is there anything else you want me to work on for the first part of Tyche support?
