[Project Automation] Create High level tracker workflow + readme + helper script #288

jarmak-nv · 2025-02-28T21:24:41Z

This pull request introduces a new GitHub Actions workflow for high-level project tracking and provides supporting documentation and scripts. The main changes include the addition of the workflow configuration, a detailed README explaining the workflow, and a helper script to obtain necessary project and field IDs.

New GitHub Actions Workflow:

.github/workflows/project-high-level-tracker.yaml: Adds a comprehensive workflow to automate the management of GitHub Projects, including steps to process branch names, determine development stages, and update project fields based on release schedules.

Documentation:

docs/project-automation-readme.md: Provides an overview of the new workflow, its architecture, and configuration details. It includes a mermaid diagram to illustrate the workflow and instructions on how to obtain necessary project and field IDs.

Helper Script:

docs/project-graphql-helper.py: Adds a script to fetch project and field IDs from the GitHub GraphQL API, which are required for configuring the workflow.

jameslamb

I took a first pass at this, will do a more thorough review next week. Just publishing a non-blocking review now so you can see my first few comments.

jameslamb · 2025-02-28T22:06:13Z

.github/workflows/project-high-level-tracker.yaml

+    inputs:
+      PROJECT_ID:
+        description: "Project ID"
+        default: "PVT_kwDOAp2shc4ArK9w"


General comment for all of these default: values with what look like identifiers from a service.... what are they? Where did the values come from? Are they intended to be overridden?

If every user of this is expected to provide these values at the site where this is called, then I think that:

the defaults should be removed

the inputs should all have required: true

docs should be added here explaining how to set values for them (either how to find the things to hard-code, or how to get these via templating with the various runtime contexts GitHub Actions provides: https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/accessing-contextual-information-about-workflow-runs#github-context)

If these values are intended to be used literally and never overridden, then I think that:

they should be moved out of inputs and instead either hard-coded below or shared via the env. context (https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/accessing-contextual-information-about-workflow-runs#env-context)

If these values could be used literally but might be overridden, then I think that:

the inputs should all have required: true

docs should be added here explaining how to set values for them

code comments should be added explaining where these defaults came from

On the identifiers
They are graphQL nodeIDs; they are static values associated with GitHub projects. Unfortunately, there's no clean way of dealing with these; either we hard-code them to store them, or we re-query their values using logic shown in the helper python script in docs every time we run them. While they look like private keys etc, they're totally public and anyone can query those values.

On defaults:

To make it so caller workflows can be as simple as possible

To allow for flexibility should others want to maintain their own projects using the same structure

If this setup isn't something we're happy with, then I'd go with the hard-code in env. Goal 1 is more important than goal 2.

anyone can query those values.

Ok but how? Let's imagine you're not available and this breaks complaining about a project with this ID not existing... how would I go find the appropriate values?

The other thing I'm confused about... if it's ok to hard-code a single project's ID, then why is this workflow getting checked into shared-workflows (which contains workflows that are intended to be re-used in many places), instead of in whatever repo the project board is associated with?

Using logic shown in the helper python script

Ok, I see now that you have docs on how to get these values. Great!

https://github.com/jarmak-nv/shared-workflows/blob/high-level-tracker/docs/project-automation-readme.md#getting-project-and-field-ids

It'd still be helpful to have comments in these files indicated where these specific defaults came from. That's information I'd need to know to update these, or to choose whether or not to override those defaults in calls to these workflows from other repos.

.github/workflows/project-high-level-tracker.yaml

jameslamb · 2025-02-28T22:14:44Z

docs/project-automation-readme.md

+- Linking PRs to issues and keeping them in sync requires complex operations
+   - Work often begins in an Issue, but then is completed in the PR, requiring this synchronization


I don't understand these workflows and the GraphQL code very well, so apologies if this is an ignorant question... does any of this automation assume that 1 issue will be resolved by exactly 1 PR?

The language here (talking about "the" PR) makes me think that it might.

I think that's an assumption that's likely to be violated often, especially for large efforts like rapidsai/cuvs#735

Yeah totally agree it's a little confusing; for any given workflow-triggering PR the update-linked-issues workflow synchronizes all issues that are linked to that PR (assuming they are in the same project as the PR).

Ultimately, GitHub itself doesn't have a strong mechanism for delineating the 1-to-many or many-to-many relationship of PRs and Issues in their linkages. If you link 5 PRs to 1 issue, closing any of the PRs closes all of the issues.

The linkage between the two lives inside the PR, so I say "the PR" referring to the PR that triggers the workflow, but can totally see changing it or adding some clarity!

Added some text, lmk if it adds any clarity

Added some text, lmk if it adds any clarity

If you link 5 PRs to 1 issue, closing any of the PRs closes all of the issues.

Thanks. I re-read the docs today and it's good to see they no longer imply a 1-to-1 relationship between PRs and issues.

I think the biggest thing I'm still really unsure about it what you mean (in comments here and in the docs) by "link to".

For example, if I leave a comment like "Contributes to #17963" in a PR's description or in a comment, I have "linked" the PR to the issue.. GitHub will generate an update on the corresponding issues like these:

(ref: microsoft/LightGBM#3867)

But those types of linkages don't result in the behavior "closing the PR closes the issue".

I suspect you mean specifically using one of these keywords: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue

If so, I'd add a link to that in the docs.

ajschmidt8 · 2025-03-04T13:44:50Z

.github/workflows/project-high-level-tracker.yaml

+          if [[ "$CURRENT_DATE" < "$DEV_START" ]]; then
+            STAGE_ID="${{ inputs.STAGE_PLANNED_NAME }}"
+            echo "Stage: Planned"
+          elif [[ "$CURRENT_DATE" < "$BURNDOWN_START" ]]; then
+            STAGE_ID="${{ inputs.STAGE_DEVELOPMENT_NAME }}"
+            echo "Stage: Development"
+          elif [[ "$CURRENT_DATE" < "$FREEZE_START" ]]; then
+            STAGE_ID="${{ inputs.STAGE_BURNDOWN_NAME }}"
+            echo "Stage: Burndown"
+          elif [[ "$CURRENT_DATE" < "$RELEASE_DATE" ]]; then
+            STAGE_ID="${{ inputs.STAGE_CODE_FREEZE_NAME }}"
+            echo "Stage: Code Freeze"
+          else
+            STAGE_ID="${{ inputs.STAGE_HOTFIX_NAME }}"
+            echo "Stage: Hotfix"


We should move as many of the ${{ }} pre-processor lines as possible to environment variables to prevent script injection attacks.

That goes for this block and any other lines in this workflow.

Yeah fair, if these stay as variables vs going hard-coded per James' comment above I'll update.

I don't think this is a vector for script injection though because the only way these variables get set is inside the action (they aren't affected by a user defined field in a workflow-triggering PR). If a bad actor attempted to set the values in their own branch, it wouldn't have an effect unless we merged it.

One of the frustrating parts of these workflows is that testing is challenging because they need to be merged into the default branch to be functional, but it does provide protection against script injection as a benefit in many cases!

jarmak-nv added 4 commits February 27, 2025 15:01

Add GitHub workflow for RAPIDS high-level project tracking

635221d

Add project automation documentation and GraphQL helper script

6b33ff7

Update project automation documentation with improved diagram

89ce393

Update mermaid to LR from TD

48c4c3f

jarmak-nv requested a review from a team as a code owner February 28, 2025 21:24

jarmak-nv requested a review from gforsyth February 28, 2025 21:24

jarmak-nv added non-breaking Introduces a non-breaking change doc Improvements or additions to documentation feature request New feature or request and removed doc Improvements or additions to documentation labels Feb 28, 2025

Update GitHub Note Syntax

87817a6

jameslamb self-requested a review February 28, 2025 21:56

jameslamb reviewed Feb 28, 2025

View reviewed changes

Use stage names instead of IDs, multi-issue-pr readme update

1d7d605

ajschmidt8 reviewed Mar 4, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Project Automation] Create High level tracker workflow + readme + helper script #288

[Project Automation] Create High level tracker workflow + readme + helper script #288

Uh oh!

jarmak-nv commented Feb 28, 2025

Uh oh!

jameslamb left a comment

Uh oh!

jameslamb Feb 28, 2025

Uh oh!

jarmak-nv Mar 3, 2025

Uh oh!

jameslamb Mar 19, 2025

Uh oh!

jameslamb Mar 19, 2025

Uh oh!

Uh oh!

jameslamb Feb 28, 2025

Uh oh!

jarmak-nv Mar 3, 2025 •

edited

Loading

Uh oh!

jarmak-nv Mar 3, 2025

Uh oh!

jameslamb Mar 19, 2025

Uh oh!

ajschmidt8 Mar 4, 2025

Uh oh!

jarmak-nv Mar 4, 2025

Uh oh!

Uh oh!

		- Linking PRs to issues and keeping them in sync requires complex operations
		- Work often begins in an Issue, but then is completed in the PR, requiring this synchronization

[Project Automation] Create High level tracker workflow + readme + helper script #288

Are you sure you want to change the base?

[Project Automation] Create High level tracker workflow + readme + helper script #288

Uh oh!

Conversation

jarmak-nv commented Feb 28, 2025

New GitHub Actions Workflow:

Documentation:

Helper Script:

Uh oh!

jameslamb left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jarmak-nv Mar 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jarmak-nv Mar 3, 2025 •

edited

Loading