Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
90 changes: 88 additions & 2 deletions src/bin/pr-metadata-validator.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
use std::{collections::BTreeMap, process::exit};

use anyhow::{Context, anyhow};
use chrono::NaiveDate;
use indexmap::IndexMap;
use maplit::btreemap;
Expand All @@ -8,7 +9,7 @@ use regex::Regex;
use trainee_tracker::{
Error,
config::{CourseSchedule, CourseScheduleWithRegisterSheetId},
course::match_prs_to_assignments,
course::{get_descriptor_id_for_pr, match_prs_to_assignments},
newtypes::Region,
octocrab::octocrab_for_token,
prs::get_prs,
Expand Down Expand Up @@ -84,6 +85,8 @@ async fn main() {
&format!("{}{}", BAD_TITLE_COMMENT_PREFIX, reason)
}
ValidationResult::UnknownRegion => UNKNOWN_REGION_COMMENT,
ValidationResult::WrongFiles { files } => &format!("{}{}`", WRONG_FILES, files),
ValidationResult::NoFiles => NO_FILES,
};

let full_message = format!(
Expand Down Expand Up @@ -138,12 +141,24 @@ const UNKNOWN_REGION_COMMENT: &str = r#"Your PR's title didn't contain a known r

Please check the expected title format, and make sure your region is in the correct place and spelled correctly."#;

const WRONG_FILES: &str = r#"The changed files in this PR don't match what is expected for this task.

Please check that you committed the right files for the task, and that there are no accidentally committed files from other sprints.

Please review the changed files tab at the top of the page, we are only expecting changes in this directory: `"#;

const NO_FILES: &str = r#"This PR is missing any submitted files.

Please check that you committed the right files and pushed to the repository"#;

enum ValidationResult {
Ok,
BodyTemplateNotFilledOut,
CouldNotMatch,
BadTitleFormat { reason: String },
UnknownRegion,
WrongFiles { files: String },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure at the call-site whether this was the expected files or the incorrect files:

Suggested change
WrongFiles { files: String },
WrongFiles { expected_files_pattern: String },

NoFiles,
}

async fn validate_pr(
Expand All @@ -166,7 +181,7 @@ async fn validate_pr(
.iter()
.find(|pr| pr.number == pr_number)
.ok_or_else(|| {
anyhow::anyhow!(
anyhow!(
"Failed to find PR {} in list of PRs for module {}",
pr_number,
module_name
Expand Down Expand Up @@ -234,9 +249,80 @@ async fn validate_pr(
return Ok(ValidationResult::BodyTemplateNotFilledOut);
}

let pr_assignment_descriptor_id = get_descriptor_id_for_pr(matched.sprints, pr_number);

match check_pr_file_changes(
octocrab,
github_org_name,
module_name,
pr_number,
pr_assignment_descriptor_id,
)
.await
{
Ok(Some(problem)) => return Ok(problem),
Ok(None) => (),
Err(e) => {
let _ = anyhow!(e);
}
}

Ok(ValidationResult::Ok)
}

// Check the changed files in a pull request match what is expected for that sprint task
async fn check_pr_file_changes(
octocrab: &Octocrab,
org_name: &str,
module_name: &str,
pr_number: u64,
task_issue_number: u64,
) -> Result<Option<ValidationResult>, Error> {
// Get the Sprint Task's description of expected changes
let task_issue = match octocrab
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

match works here (and throughout), but FYI there's also a concept of guard clases - you could also write this:

    let Ok(task_issue) = octocrab
        .issues(org_name, module_name)
        .get(task_issue_number)
        .await
    else {
        return Ok(Some(ValidationResult::CouldNotMatch)); // Failed to find the right task
    };

I don't have a strong preference either way here - each can be more clear in different contexts, but it's good to know this is possible.

.issues(org_name, module_name)
.get(task_issue_number)
.await
{
Ok(iss) => iss,
Err(_) => return Ok(Some(ValidationResult::CouldNotMatch)), // Failed to find the right task
};
let task_issue_body = match task_issue.body {
Some(body) => body,
None => return Ok(None), // Task is empty, nothing left to check
};
Comment on lines +290 to +293
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, but I may consider:

Suggested change
let task_issue_body = match task_issue.body {
Some(body) => body,
None => return Ok(None), // Task is empty, nothing left to check
};
let task_issue_body = task_issue.body.unwrap_or_default();

To end up with an empty string which we can just handle as if the issue body was the empty string.

I'm not sure why issue bodies are Option<String> rather than just String - I can't imagine an issue without a body, but maybe some APIs don't return them or something...

let directory_description = Regex::new("CHANGE_DIR=(.+)\\n").unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally follow a convention that unwrap calls should either:

  1. Have a comment above describing why they can't fail (e.g. // UNWRAP: Statically known good regex), or
  2. Be replaced with an .expect("explanation") (e.g. .expect("Statically known regex failed to parse"))

Same applies below on line 296.

let directory_description_regex = match directory_description.captures(&task_issue_body) {
Some(capts) => capts.get(1).unwrap().as_str(), // Only allows a single directory for now
None => return Ok(None), // There is no match defined for this task, don't do any more checks
};
let directory_matcher = Regex::new(directory_description_regex)
.context("Invalid regex for task directory match")?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a link to the GitHub issue to this context, as well, so we know where to look/fix:

Suggested change
.context("Invalid regex for task directory match")?;
.with_context(|| format!("Invalid regex for task directory match in issue {}", task_issue.html__url))?;

// Get all of the changed files
let pr_files_pages = octocrab
.pulls(org_name, module_name)
.list_files(pr_number)
.await
.context("Failed to get changed files")?;
if pr_files_pages.items.is_empty() {
return Ok(Some(ValidationResult::NoFiles)); // no files committed
}
let pr_files_all = octocrab
.all_pages(pr_files_pages)
.await
.context("Failed to list all changed files")?;
let pr_files = pr_files_all.into_iter();
Comment on lines +301 to +314
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a handy util for this in trainee_tracker::octocrab::all_pages - you'll probably need to use it and make it pub instead of pub(crate):

Suggested change
// Get all of the changed files
let pr_files_pages = octocrab
.pulls(org_name, module_name)
.list_files(pr_number)
.await
.context("Failed to get changed files")?;
if pr_files_pages.items.is_empty() {
return Ok(Some(ValidationResult::NoFiles)); // no files committed
}
let pr_files_all = octocrab
.all_pages(pr_files_pages)
.await
.context("Failed to list all changed files")?;
let pr_files = pr_files_all.into_iter();
let pr_files = all_pages("changed files", octocrab, async || {
octocrab
.pulls(org_name, module_name)
.list_files(pr_number)
.await
}).await?;
if pr_files.is_empty() {
return Ok(Some(ValidationResult::NoFiles)); // no files committed
}

// check each file and error if one is in unexpected place
for pr_file in pr_files {
if !directory_matcher.is_match(&pr_file.filename) {
return Ok(Some(ValidationResult::WrongFiles {
files: directory_description_regex.to_string(),
}));
}
}
Ok(None)
}

struct KnownRegions(BTreeMap<&'static str, Vec<&'static str>>);

impl KnownRegions {
Expand Down
37 changes: 37 additions & 0 deletions src/course.rs
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,7 @@ fn parse_issue(issue: &Issue) -> Result<Option<(NonZeroUsize, Option<Assignment>
labels,
title,
html_url,
number,
..
} = issue;

Expand Down Expand Up @@ -227,6 +228,7 @@ fn parse_issue(issue: &Issue) -> Result<Option<(NonZeroUsize, Option<Assignment>
title: title.clone(),
html_url: html_url.clone(),
optionality,
assignment_descriptor_id: *number,
}),
"Codility" => {
// TODO: Handle these.
Expand Down Expand Up @@ -312,6 +314,7 @@ pub enum Assignment {
ExpectedPullRequest {
title: String,
html_url: Url,
assignment_descriptor_id: u64,
optionality: AssignmentOptionality,
},
}
Expand Down Expand Up @@ -438,6 +441,7 @@ impl TraineeWithSubmissions {
SubmissionState::Some(Submission::PullRequest {
pull_request,
optionality,
..
}) => {
let max = match optionality {
AssignmentOptionality::Mandatory => 10,
Expand Down Expand Up @@ -539,6 +543,7 @@ pub enum Submission {
PullRequest {
pull_request: Pr,
optionality: AssignmentOptionality,
assignment_descriptor: u64,
},
}

Expand Down Expand Up @@ -990,23 +995,55 @@ fn match_pr_to_assignment(
}
}
}

if let Some(Match {
sprint_index,
assignment_index,
optionality,
..
}) = best_match
{
let pr_assignment_descriptor_id: u64 =
match assignments[sprint_index].assignments[assignment_index] {
Assignment::ExpectedPullRequest {
assignment_descriptor_id,
..
} => assignment_descriptor_id,
_ => 0,
};
submissions[sprint_index].submissions[assignment_index] =
SubmissionState::Some(Submission::PullRequest {
pull_request: pr,
optionality,
assignment_descriptor: pr_assignment_descriptor_id,
});
} else if !pr.is_closed {
unknown_prs.push(pr);
}
}

// Given a vector of sprints, and a target pr number, for a given person
// return the issue ID for the associated assignment descriptor
pub fn get_descriptor_id_for_pr(sprints: Vec<SprintWithSubmissions>, target_pr_number: u64) -> u64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than defaulting to 0, I'd expect this to either:

  1. Return an Option<u64> so that we can explicitly signal "We couldn't find the descriptor ID"
  2. Return an Option<&Submission>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you're just returning a number, you don't need to take ownership of this Vec - and if you do, you shouldn't need to clone the submissions below on line 1030.

In general there are two ways of iterating over things - with .iter() (which doesn't require ownership, but means you may need to copy things if you're returning them), or with .into_iter() (which consumes the value, but avoids the need to copy things).

Here are two different ways to write the code you have at the moment:

  1. Taking ownership - note the use of into_iter() and the lack of clone():
// Given a vector of sprints, and a target pr number, for a given person
// return the issue ID for the associated assignment descriptor
pub fn get_descriptor_id_for_pr(sprints: Vec<SprintWithSubmissions>, target_pr_number: u64) -> u64 {
    match sprints
        .into_iter()
        .flat_map(|sprint_with_subs| sprint_with_subs.submissions)
        .filter_map(|missing_or_submission| match missing_or_submission {
            SubmissionState::Some(s) => Some(s),
            _ => None,
        })
        .find(|submission| match submission {
            Submission::PullRequest { pull_request, .. } => pull_request.number == target_pr_number,
            _ => false,
        }) {
        Some(Submission::PullRequest {
            assignment_descriptor,
            ..
        }) => assignment_descriptor,
        _ => 0,
    }
}
  1. Not taking ownership:
// Given a vector of sprints, and a target pr number, for a given person
// return the issue ID for the associated assignment descriptor
pub fn get_descriptor_id_for_pr(sprints: &Vec<SprintWithSubmissions>, target_pr_number: u64) -> u64 {
    match sprints
        .iter()
        .flat_map(|sprint_with_subs| sprint_with_subs.submissions.iter()) // Note: We explicitly call `.iter()` here to make sure we're borrowing when iterating over this, rather than taking ownership - by default there are "just iterate this for me" style coercions which will implicitly use `.into_iter()` by default, so we need to be explicit here to avoid the clones)
        .filter_map(|missing_or_submission| match missing_or_submission {
            SubmissionState::Some(s) => Some(s),
            _ => None,
        })
        .find(|submission| match submission {
            Submission::PullRequest { pull_request, .. } => pull_request.number == target_pr_number,
            _ => false,
        }) {
        Some(Submission::PullRequest {
            assignment_descriptor,
            ..
        }) => *assignment_descriptor, // Note the * - this is equivalent to calling `.clone()` but because `u64` is `Copy` we can just dereference to copy it. This is cheap. Personally I don't love using `*assignment_descriptor` and would prefer `assignment_descriptor.clone()` but the language community doesn't, so...
        _ => 0,
    }
}

Extending 2 a little bit, we can actually write the signature:

pub fn get_descriptor_id_for_pr(sprints: &[SprintWithSubmissions], target_pr_number: u64) -> u64 {

replacing the &Vec<SprintWithSubmissions> with &[SprintWithSubmissions] because a Vec can be treated like a slice - this is slightly more general, because it allows us to pass more types to this function.

I'd recommend avoiding both taking ownership where not needed, and the extra clones, by doing approach 2 :)

match sprints
.iter()
.flat_map(|sprint_with_subs| sprint_with_subs.submissions.clone())
.filter_map(|missing_or_submission| match missing_or_submission {
SubmissionState::Some(s) => Some(s),
_ => None,
})
.find(|submission| match submission {
Submission::PullRequest { pull_request, .. } => pull_request.number == target_pr_number,
_ => false,
}) {
Some(Submission::PullRequest {
assignment_descriptor,
..
}) => assignment_descriptor,
_ => 0,
}
}

fn make_title_more_matchable(title: &str) -> IndexSet<String> {
use itertools::Itertools;

Expand Down