Skip to content

[Project Automation] Create High level tracker workflow + readme + helper script #288

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: branch-25.04
Choose a base branch
from

Conversation

jarmak-nv
Copy link
Contributor

This pull request introduces a new GitHub Actions workflow for high-level project tracking and provides supporting documentation and scripts. The main changes include the addition of the workflow configuration, a detailed README explaining the workflow, and a helper script to obtain necessary project and field IDs.

New GitHub Actions Workflow:

  • .github/workflows/project-high-level-tracker.yaml: Adds a comprehensive workflow to automate the management of GitHub Projects, including steps to process branch names, determine development stages, and update project fields based on release schedules.

Documentation:

  • docs/project-automation-readme.md: Provides an overview of the new workflow, its architecture, and configuration details. It includes a mermaid diagram to illustrate the workflow and instructions on how to obtain necessary project and field IDs.

Helper Script:

  • docs/project-graphql-helper.py: Adds a script to fetch project and field IDs from the GitHub GraphQL API, which are required for configuring the workflow.

@jarmak-nv jarmak-nv requested a review from a team as a code owner February 28, 2025 21:24
@jarmak-nv jarmak-nv requested a review from gforsyth February 28, 2025 21:24
@jarmak-nv jarmak-nv added non-breaking Introduces a non-breaking change doc Improvements or additions to documentation feature request New feature or request and removed doc Improvements or additions to documentation labels Feb 28, 2025
@jameslamb jameslamb self-requested a review February 28, 2025 21:56
Copy link
Member

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a first pass at this, will do a more thorough review next week. Just publishing a non-blocking review now so you can see my first few comments.

inputs:
PROJECT_ID:
description: "Project ID"
default: "PVT_kwDOAp2shc4ArK9w"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment for all of these default: values with what look like identifiers from a service.... what are they? Where did the values come from? Are they intended to be overridden?

If every user of this is expected to provide these values at the site where this is called, then I think that:

If these values are intended to be used literally and never overridden, then I think that:

If these values could be used literally but might be overridden, then I think that:

  • the inputs should all have required: true
  • docs should be added here explaining how to set values for them
  • code comments should be added explaining where these defaults came from

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the identifiers
They are graphQL nodeIDs; they are static values associated with GitHub projects. Unfortunately, there's no clean way of dealing with these; either we hard-code them to store them, or we re-query their values using logic shown in the helper python script in docs every time we run them. While they look like private keys etc, they're totally public and anyone can query those values.

On defaults:

  1. To make it so caller workflows can be as simple as possible
  2. To allow for flexibility should others want to maintain their own projects using the same structure

If this setup isn't something we're happy with, then I'd go with the hard-code in env. Goal 1 is more important than goal 2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyone can query those values.

Ok but how? Let's imagine you're not available and this breaks complaining about a project with this ID not existing... how would I go find the appropriate values?

The other thing I'm confused about... if it's ok to hard-code a single project's ID, then why is this workflow getting checked into shared-workflows (which contains workflows that are intended to be re-used in many places), instead of in whatever repo the project board is associated with?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using logic shown in the helper python script

Ok, I see now that you have docs on how to get these values. Great!

https://github.com/jarmak-nv/shared-workflows/blob/high-level-tracker/docs/project-automation-readme.md#getting-project-and-field-ids

It'd still be helpful to have comments in these files indicated where these specific defaults came from. That's information I'd need to know to update these, or to choose whether or not to override those defaults in calls to these workflows from other repos.

Comment on lines 15 to 16
- Linking PRs to issues and keeping them in sync requires complex operations
- Work often begins in an Issue, but then is completed in the PR, requiring this synchronization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand these workflows and the GraphQL code very well, so apologies if this is an ignorant question... does any of this automation assume that 1 issue will be resolved by exactly 1 PR?

The language here (talking about "the" PR) makes me think that it might.

I think that's an assumption that's likely to be violated often, especially for large efforts like rapidsai/cuvs#735

Copy link
Contributor Author

@jarmak-nv jarmak-nv Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah totally agree it's a little confusing; for any given workflow-triggering PR the update-linked-issues workflow synchronizes all issues that are linked to that PR (assuming they are in the same project as the PR).

Ultimately, GitHub itself doesn't have a strong mechanism for delineating the 1-to-many or many-to-many relationship of PRs and Issues in their linkages. If you link 5 PRs to 1 issue, closing any of the PRs closes all of the issues.

The linkage between the two lives inside the PR, so I say "the PR" referring to the PR that triggers the workflow, but can totally see changing it or adding some clarity!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some text, lmk if it adds any clarity

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some text, lmk if it adds any clarity

If you link 5 PRs to 1 issue, closing any of the PRs closes all of the issues.

Thanks. I re-read the docs today and it's good to see they no longer imply a 1-to-1 relationship between PRs and issues.

I think the biggest thing I'm still really unsure about it what you mean (in comments here and in the docs) by "link to".

For example, if I leave a comment like "Contributes to #17963" in a PR's description or in a comment, I have "linked" the PR to the issue.. GitHub will generate an update on the corresponding issues like these:

image

(ref: microsoft/LightGBM#3867)

But those types of linkages don't result in the behavior "closing the PR closes the issue".

I suspect you mean specifically using one of these keywords: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue

If so, I'd add a link to that in the docs.

Comment on lines +165 to +179
if [[ "$CURRENT_DATE" < "$DEV_START" ]]; then
STAGE_ID="${{ inputs.STAGE_PLANNED_NAME }}"
echo "Stage: Planned"
elif [[ "$CURRENT_DATE" < "$BURNDOWN_START" ]]; then
STAGE_ID="${{ inputs.STAGE_DEVELOPMENT_NAME }}"
echo "Stage: Development"
elif [[ "$CURRENT_DATE" < "$FREEZE_START" ]]; then
STAGE_ID="${{ inputs.STAGE_BURNDOWN_NAME }}"
echo "Stage: Burndown"
elif [[ "$CURRENT_DATE" < "$RELEASE_DATE" ]]; then
STAGE_ID="${{ inputs.STAGE_CODE_FREEZE_NAME }}"
echo "Stage: Code Freeze"
else
STAGE_ID="${{ inputs.STAGE_HOTFIX_NAME }}"
echo "Stage: Hotfix"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move as many of the ${{ }} pre-processor lines as possible to environment variables to prevent script injection attacks.

That goes for this block and any other lines in this workflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair, if these stay as variables vs going hard-coded per James' comment above I'll update.

I don't think this is a vector for script injection though because the only way these variables get set is inside the action (they aren't affected by a user defined field in a workflow-triggering PR). If a bad actor attempted to set the values in their own branch, it wouldn't have an effect unless we merged it.

One of the frustrating parts of these workflows is that testing is challenging because they need to be merged into the default branch to be functional, but it does provide protection against script injection as a benefit in many cases!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants