-
-
Notifications
You must be signed in to change notification settings - Fork 0
File count validator #44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@illicitonion Here is the validator for checking the right files are committed. Let me know what you think, in particular if I've made any faux-pas in rust, re the matching and error handling. I'm still getting used to that. |
illicitonion
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay here! This looks good, but I left a few comments about some general Rust things :)
| CouldNotMatch, | ||
| BadTitleFormat { reason: String }, | ||
| UnknownRegion, | ||
| WrongFiles { files: String }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure at the call-site whether this was the expected files or the incorrect files:
| WrongFiles { files: String }, | |
| WrongFiles { expected_files_pattern: String }, |
| let task_issue_body = match task_issue.body { | ||
| Some(body) => body, | ||
| None => return Ok(None), // Task is empty, nothing left to check | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works, but I may consider:
| let task_issue_body = match task_issue.body { | |
| Some(body) => body, | |
| None => return Ok(None), // Task is empty, nothing left to check | |
| }; | |
| let task_issue_body = task_issue.body.unwrap_or_default(); |
To end up with an empty string which we can just handle as if the issue body was the empty string.
I'm not sure why issue bodies are Option<String> rather than just String - I can't imagine an issue without a body, but maybe some APIs don't return them or something...
| // Get all of the changed files | ||
| let pr_files_pages = octocrab | ||
| .pulls(org_name, module_name) | ||
| .list_files(pr_number) | ||
| .await | ||
| .context("Failed to get changed files")?; | ||
| if pr_files_pages.items.is_empty() { | ||
| return Ok(Some(ValidationResult::NoFiles)); // no files committed | ||
| } | ||
| let pr_files_all = octocrab | ||
| .all_pages(pr_files_pages) | ||
| .await | ||
| .context("Failed to list all changed files")?; | ||
| let pr_files = pr_files_all.into_iter(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a handy util for this in trainee_tracker::octocrab::all_pages - you'll probably need to use it and make it pub instead of pub(crate):
| // Get all of the changed files | |
| let pr_files_pages = octocrab | |
| .pulls(org_name, module_name) | |
| .list_files(pr_number) | |
| .await | |
| .context("Failed to get changed files")?; | |
| if pr_files_pages.items.is_empty() { | |
| return Ok(Some(ValidationResult::NoFiles)); // no files committed | |
| } | |
| let pr_files_all = octocrab | |
| .all_pages(pr_files_pages) | |
| .await | |
| .context("Failed to list all changed files")?; | |
| let pr_files = pr_files_all.into_iter(); | |
| let pr_files = all_pages("changed files", octocrab, async || { | |
| octocrab | |
| .pulls(org_name, module_name) | |
| .list_files(pr_number) | |
| .await | |
| }).await?; | |
| if pr_files.is_empty() { | |
| return Ok(Some(ValidationResult::NoFiles)); // no files committed | |
| } |
| Some(body) => body, | ||
| None => return Ok(None), // Task is empty, nothing left to check | ||
| }; | ||
| let directory_description = Regex::new("CHANGE_DIR=(.+)\\n").unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally follow a convention that unwrap calls should either:
- Have a comment above describing why they can't fail (e.g.
// UNWRAP: Statically known good regex), or - Be replaced with an
.expect("explanation")(e.g..expect("Statically known regex failed to parse"))
Same applies below on line 296.
| None => return Ok(None), // There is no match defined for this task, don't do any more checks | ||
| }; | ||
| let directory_matcher = Regex::new(directory_description_regex) | ||
| .context("Invalid regex for task directory match")?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a link to the GitHub issue to this context, as well, so we know where to look/fix:
| .context("Invalid regex for task directory match")?; | |
| .with_context(|| format!("Invalid regex for task directory match in issue {}", task_issue.html__url))?; |
|
|
||
| // Given a vector of sprints, and a target pr number, for a given person | ||
| // return the issue ID for the associated assignment descriptor | ||
| pub fn get_descriptor_id_for_pr(sprints: Vec<SprintWithSubmissions>, target_pr_number: u64) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than defaulting to 0, I'd expect this to either:
- Return an
Option<u64>so that we can explicitly signal "We couldn't find the descriptor ID" - Return an
Option<&Submission>
|
|
||
| // Given a vector of sprints, and a target pr number, for a given person | ||
| // return the issue ID for the associated assignment descriptor | ||
| pub fn get_descriptor_id_for_pr(sprints: Vec<SprintWithSubmissions>, target_pr_number: u64) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you're just returning a number, you don't need to take ownership of this Vec - and if you do, you shouldn't need to clone the submissions below on line 1030.
In general there are two ways of iterating over things - with .iter() (which doesn't require ownership, but means you may need to copy things if you're returning them), or with .into_iter() (which consumes the value, but avoids the need to copy things).
Here are two different ways to write the code you have at the moment:
- Taking ownership - note the use of
into_iter()and the lack ofclone():
// Given a vector of sprints, and a target pr number, for a given person
// return the issue ID for the associated assignment descriptor
pub fn get_descriptor_id_for_pr(sprints: Vec<SprintWithSubmissions>, target_pr_number: u64) -> u64 {
match sprints
.into_iter()
.flat_map(|sprint_with_subs| sprint_with_subs.submissions)
.filter_map(|missing_or_submission| match missing_or_submission {
SubmissionState::Some(s) => Some(s),
_ => None,
})
.find(|submission| match submission {
Submission::PullRequest { pull_request, .. } => pull_request.number == target_pr_number,
_ => false,
}) {
Some(Submission::PullRequest {
assignment_descriptor,
..
}) => assignment_descriptor,
_ => 0,
}
}- Not taking ownership:
// Given a vector of sprints, and a target pr number, for a given person
// return the issue ID for the associated assignment descriptor
pub fn get_descriptor_id_for_pr(sprints: &Vec<SprintWithSubmissions>, target_pr_number: u64) -> u64 {
match sprints
.iter()
.flat_map(|sprint_with_subs| sprint_with_subs.submissions.iter()) // Note: We explicitly call `.iter()` here to make sure we're borrowing when iterating over this, rather than taking ownership - by default there are "just iterate this for me" style coercions which will implicitly use `.into_iter()` by default, so we need to be explicit here to avoid the clones)
.filter_map(|missing_or_submission| match missing_or_submission {
SubmissionState::Some(s) => Some(s),
_ => None,
})
.find(|submission| match submission {
Submission::PullRequest { pull_request, .. } => pull_request.number == target_pr_number,
_ => false,
}) {
Some(Submission::PullRequest {
assignment_descriptor,
..
}) => *assignment_descriptor, // Note the * - this is equivalent to calling `.clone()` but because `u64` is `Copy` we can just dereference to copy it. This is cheap. Personally I don't love using `*assignment_descriptor` and would prefer `assignment_descriptor.clone()` but the language community doesn't, so...
_ => 0,
}
}Extending 2 a little bit, we can actually write the signature:
pub fn get_descriptor_id_for_pr(sprints: &[SprintWithSubmissions], target_pr_number: u64) -> u64 {replacing the &Vec<SprintWithSubmissions> with &[SprintWithSubmissions] because a Vec can be treated like a slice - this is slightly more general, because it allows us to pass more types to this function.
I'd recommend avoiding both taking ownership where not needed, and the extra clones, by doing approach 2 :)
| task_issue_number: u64, | ||
| ) -> Result<Option<ValidationResult>, Error> { | ||
| // Get the Sprint Task's description of expected changes | ||
| let task_issue = match octocrab |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match works here (and throughout), but FYI there's also a concept of guard clases - you could also write this:
let Ok(task_issue) = octocrab
.issues(org_name, module_name)
.get(task_issue_number)
.await
else {
return Ok(Some(ValidationResult::CouldNotMatch)); // Failed to find the right task
};I don't have a strong preference either way here - each can be more clear in different contexts, but it's good to know this is possible.
Adds a file validator for PR submissions.
The directory files should be changed in is defined in an issue body, in a comment, like so:
If it is not present, the bot does not do this check
If it is present, this regexp is checked against every changed file present in a PR submission
If any fail to match, i.e. something in a wrong directory was changed, then the bot issues a warning
If no files are submitted at all, the bot issues a warning
I propose any metadata that needs checked be included in these style of comments, and this metadata will need to be added for every task submission before the bot will be able to perform checks
Currently only supports checking a single directory, but could be extended to have multiple, could also extend to do things like checking max number of committed files or any other metadata visible to PRs
You can see an example of output of this check here: CodeYourFuture/Module-Structuring-and-Testing-Data#873