Track data about Pipelines and GitHub PRs #1319

Some stats we might want to track:

* Time from GitHub PR opened to PR merged
* Time from GitHub PR opened to PR pipeline started
* Time from GitHub PR opened to PR pipeline succeeded
* Pipeline runtime
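
As a rough illustration of how the stats listed above could be derived from raw timestamps, here's a minimal sketch. The `PipelineTimestamps` class, its field names, and `compute_stats` are all hypothetical and don't reflect any existing model; missing timestamps simply yield `None`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PipelineTimestamps:
    # Hypothetical field names, chosen only for this example.
    pr_opened_at: datetime
    pr_merged_at: Optional[datetime] = None
    pipeline_started_at: Optional[datetime] = None
    pipeline_finished_at: Optional[datetime] = None
    pipeline_succeeded: bool = False


def _delta(start: Optional[datetime], end: Optional[datetime]) -> Optional[timedelta]:
    """Return end - start, or None if either timestamp is missing."""
    if start is None or end is None:
        return None
    return end - start


def compute_stats(ts: PipelineTimestamps) -> dict:
    """Compute the four stats from the list above for a single PR pipeline."""
    return {
        "opened_to_merged": _delta(ts.pr_opened_at, ts.pr_merged_at),
        "opened_to_pipeline_started": _delta(ts.pr_opened_at, ts.pipeline_started_at),
        # Only meaningful when the pipeline actually succeeded.
        "opened_to_pipeline_succeeded": (
            _delta(ts.pr_opened_at, ts.pipeline_finished_at)
            if ts.pipeline_succeeded
            else None
        ),
        "pipeline_runtime": _delta(ts.pipeline_started_at, ts.pipeline_finished_at),
    }
```

Storing these as durations (or plain seconds) rather than recomputing them on every query would make them easier to aggregate later.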

If we start tracking pipeline data, we might want a separate webhook that fires on pipeline completion. We'll want to ingest the GitHub data separately, since GitHub can be flaky and we don't want a whole job to block on it. We could have a Celery task that runs on a cron schedule, looks for pipelines with missing GitHub PR data, and attempts to ingest it. If an attempt fails because GitHub is down, the next run will try again.
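
The retry-by-rerunning idea could look something like the sketch below. It's plain Python so it stays self-contained; in practice the function body would presumably live inside a periodic Celery task. `GitHubUnavailable`, `backfill_pr_data`, the dict-based pipeline records, and the `fetch_pr_data` callable are all made up for illustration.

```python
import logging

logger = logging.getLogger(__name__)


class GitHubUnavailable(Exception):
    """Hypothetical error raised when GitHub can't be reached."""


def backfill_pr_data(pipelines, fetch_pr_data):
    """Try to attach PR data to every pipeline that is still missing it.

    `pipelines` is a list of dicts with a "pr_number" key and an optional
    "pr_data" key (a stand-in for real database rows). `fetch_pr_data` is a
    callable that takes a PR number and returns a dict, or raises
    GitHubUnavailable. Returns how many pipelines were filled in this run.
    """
    filled = 0
    for pipeline in pipelines:
        if pipeline.get("pr_data") is not None:
            continue  # Already ingested on an earlier run.
        try:
            pipeline["pr_data"] = fetch_pr_data(pipeline["pr_number"])
            filled += 1
        except GitHubUnavailable:
            # Leave the record untouched; the next scheduled run retries it.
            logger.warning("GitHub unavailable for PR %s", pipeline["pr_number"])
    return filled
```

Because a failure only skips the affected record, the pipeline-completion webhook never has to wait on GitHub, and each run is safe to repeat.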

This is making me think that what we really want is a `Pipeline` dimension containing information about a specific (PR) pipeline. We could then either store extra data pertaining to GitHub PRs on it, or have a separate `PullRequest` dimension.

If we get to a point where we're storing lots of numeric data that we aggregate over, it could warrant a separate `Fact` table for pipelines, or for whatever it is we're storing data about. Just something to keep in mind.
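
To make the dimension/fact split a bit more concrete, here's one possible shape, sketched with `sqlite3` purely so it runs standalone. Every table name, column name, and value below is an assumption for illustration, not a proposed final schema. The `pull_request_id` column is nullable so that a pipeline row can exist before its GitHub data has been ingested.

```python
import sqlite3

# Hypothetical schema: two dimensions plus one fact table holding the numbers.
SCHEMA = """
CREATE TABLE pull_request_dimension (
    id INTEGER PRIMARY KEY,
    number INTEGER NOT NULL,
    opened_at TEXT,
    merged_at TEXT
);
CREATE TABLE pipeline_dimension (
    id INTEGER PRIMARY KEY,
    external_id INTEGER NOT NULL UNIQUE,
    ref TEXT,
    pull_request_id INTEGER REFERENCES pull_request_dimension(id)
);
CREATE TABLE pipeline_fact (
    pipeline_id INTEGER NOT NULL REFERENCES pipeline_dimension(id),
    runtime_seconds REAL NOT NULL,
    opened_to_started_seconds REAL
);
"""


def build_example_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO pull_request_dimension (id, number) VALUES (1, 101)")
    conn.executemany(
        "INSERT INTO pipeline_dimension (id, external_id, ref, pull_request_id) "
        "VALUES (?, ?, ?, ?)",
        [(1, 9001, "example-branch", 1), (2, 9002, "example-branch", 1)],
    )
    conn.executemany(
        "INSERT INTO pipeline_fact (pipeline_id, runtime_seconds) VALUES (?, ?)",
        [(1, 600.0), (2, 900.0)],
    )
    return conn


def average_runtime_per_pr(conn):
    """Aggregate the fact table, grouped by PR number via the dimensions."""
    rows = conn.execute(
        """
        SELECT pr.number, AVG(f.runtime_seconds)
        FROM pipeline_fact AS f
        JOIN pipeline_dimension AS p ON p.id = f.pipeline_id
        JOIN pull_request_dimension AS pr ON pr.id = p.pull_request_id
        GROUP BY pr.number
        """
    ).fetchall()
    return dict(rows)
```

Keeping the numeric columns in the fact table means aggregations like the one above never have to touch the wider descriptive columns on the dimensions.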




