[chore] add codeowner activity report workflow #46015
ChrsMark merged 11 commits into open-telemetry:main from
Sample output from the dry run:
Code owner activity report
Period: 2026-01-12 – 2026-02-11
Each code owner (individuals only) must have reviewed or replied to at least 80% of the PRs and issues where they were the code owner for the component.
Components in scope: processor/filter, processor/k8sattributes, processor/resourcedetection, processor/transform, receiver/filelog, receiver/hostmetrics, receiver/prometheus (and
PRs
PRs below threshold
|
|
@mx-psi here is an attempt to count reviews per code owner for the specific components that are targeting stability. I suggest that we focus on those only for now, to avoid the extra noise of checking the vast list of Contrib's components :). I wonder whether the 80% target is too high, though, especially when the timeframe is a single month. Happy to tune the targets/checks accordingly. Also, I only have the report for PRs so far, but we can expand it to report issue stats as well if we find this useful, either now or in a later iteration.
|
Should the report count the PRs created by the code owner themselves? Looking at my numbers, most of the PRs were created by me, and I cannot review those 😅
Yep, that's correct :). I updated the logic and sample output: fb9e0e1#diff-3005a22a150a86354b703948d26e08bbf1654ac09b994e6ae7124114580740b3R255
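For readers following along, the rule agreed above (a code owner is not expected to review their own PR) could be sketched roughly as below. This is only an illustration, not the code from the linked commit: `eligiblePrsFor`, `responseRate`, and the PR object shape (`author`, `reviewers`) are hypothetical names chosen for the example.

```javascript
// Hypothetical PR shape: { number, author, reviewers: [login, ...] }.
// Returns the PRs a code owner could be expected to review,
// i.e. everything except the PRs they authored themselves.
function eligiblePrsFor(owner, prs) {
  return prs.filter((pr) => pr.author !== owner);
}

// Percentage of eligible PRs that the owner reviewed or replied to.
// Returns null when there was nothing the owner could have reviewed.
function responseRate(owner, prs) {
  const eligible = eligiblePrsFor(owner, prs);
  if (eligible.length === 0) return null;
  const responded = eligible.filter((pr) => pr.reviewers.includes(owner)).length;
  return (responded / eligible.length) * 100;
}

const prs = [
  { number: 1, author: 'alice', reviewers: ['bob'] },
  { number: 2, author: 'alice', reviewers: [] },
  { number: 3, author: 'carol', reviewers: ['alice'] },
];

// PRs 1 and 2 were opened by alice, so only PR 3 counts against her.
console.log(responseRate('alice', prs)); // 100
```

Without the filter, alice would show 1 out of 3 (about 33%) even though she responded to every PR she was able to review.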
|
Updated output sample:
Code owner activity report
Period: 2026-01-13 – 2026-02-12
Each code owner (individuals only) must have reviewed or replied to at least 80% of the PRs and issues where they were the code owner for the component.
Components in scope: processor/filter, processor/k8sattributes, processor/resourcedetection, processor/transform, receiver/filelog, receiver/hostmetrics, receiver/prometheus (and
PRs
PRs below threshold
|
Thanks for working on this!
Yeah that makes sense to me
I agree it is quite high; we can start with something lower (maybe 50%, to start smaller). Also, the target was meant to be per component and not specifically per code owner. We could lower the percentage and do something like
Let's start with PRs for now to keep this small and add issues later? |
We require at least 3 codeowners for components that aim to become Otherwise we require that: 75% of PRs per component get reviewed by at least one code owner, and that each code owner contributes at least 75%/number_of_codeowners of the reviews for the component.
|
New sample output:
Code owner activity report
Period: 2026-01-13 – 2026-02-12
We target at least 75% of each component's PRs to be reviewed by a code owner, and each code owner to respond to at least 75% / n of their requested PRs (n = number of code owners for that component) (at least 3 code owners for components aiming for stable).
Components in scope: processor/filter, processor/k8sattributes, processor/resourcedetection, processor/transform, receiver/filelog, receiver/hostmetrics, receiver/prometheus (and
Component PR review rate (75% target)
PRs (per code owner)
|
|
Hmmm, I feel like we're trying to do two things at once here. Identify inactive codeowners and assess how responsive a codeowners group is to their components. I'm looking at Prometheus receiver stats since that's where I keep a constant eye. If we look at Krajo, David, Anthony, and my responsiveness individually, none of us reaches the 80% threshold. In reality, if I see that other codeowners have already responded to an issue and I have nothing to add, I won't comment. To get this right, I think we need to split the concerns here. Use something different to identify inactive individuals, count codeowners as a group of people, and check how responsive the whole group is together. |
|
We don't target 80%; that was the initial plan :) We now target at least 75% of each component's PRs to be reviewed by at least one code owner, and each code owner to respond to at least 75% / n of their requested PRs (n = number of code owners for that component; at least 3 code owners for components aiming for stable). See the sample output at #46015 (comment).
The idea is that we first focus on the components (first table), ensuring that at least 75% of each component's PRs are reviewed by at least one code owner. This ensures that people submitting PRs are getting reviews. The second table is a reference for drilling down when needed, to check whether something can be fixed for a specific component to improve its stats and "health". For example, imagine a component whose PRs are 70% reviewed, with the reviews coming from only one or two code owners. The second table would show that an active third code owner is required.
Would that be reasonable, @ArthurSens?
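To make the two targets concrete, here is a small sketch of how the numbers in both tables could be derived. It is an illustration under assumptions, not the script from this PR: `componentReport`, `ownerTargetPct`, and the PR shape (`reviewers`) are hypothetical, every component PR is used as the denominator for each owner, and the self-authored-PR exclusion is left out for brevity.

```javascript
const COMPONENT_TARGET_PCT = 75;

// Per-owner target: the component target split evenly across n code owners.
function ownerTargetPct(numOwners) {
  if (numOwners <= 0) throw new Error('component has no code owners');
  return COMPONENT_TARGET_PCT / numOwners;
}

// Hypothetical input: prs = [{ reviewers: [login, ...] }], owners = [login, ...].
function componentReport(prs, owners) {
  const pct = (count) => (prs.length === 0 ? null : (count / prs.length) * 100);

  // First table: share of PRs reviewed by at least one code owner.
  const reviewedByAnyOwner = prs.filter((pr) =>
    pr.reviewers.some((login) => owners.includes(login))
  ).length;
  const componentPct = pct(reviewedByAnyOwner);

  // Second table: share of PRs each individual code owner responded to.
  const target = ownerTargetPct(owners.length);
  const perOwner = owners.map((owner) => {
    const ownerPct = pct(prs.filter((pr) => pr.reviewers.includes(owner)).length);
    return { owner, pct: ownerPct, meetsTarget: ownerPct !== null && ownerPct >= target };
  });

  return {
    componentPct,
    meetsComponentTarget: componentPct !== null && componentPct >= COMPONENT_TARGET_PCT,
    ownerTargetPct: target,
    perOwner,
  };
}

const examplePrs = [
  { reviewers: ['a'] },
  { reviewers: ['a', 'x'] }, // 'x' is not a code owner
  { reviewers: ['b'] },
  { reviewers: [] },
];
const report = componentReport(examplePrs, ['a', 'b', 'c']);
console.log(report.componentPct, report.ownerTargetPct); // 75 25
console.log(report.perOwner.filter((o) => !o.meetsTarget).map((o) => o.owner)); // [ 'c' ]
```

In this example the component as a whole meets the 75% target, while the per-owner view shows that `c` has not responded to any PR, which is the kind of drill-down the second table is meant to support.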
|
Ah gotcha, I think I was misinterpreting the tables. |
|
From live discussion on 2026-02-16: if we can exclude PRs that do not target solely one component (e.g. update-otel PRs and similar) that would lead to more useful numbers |
I have tuned the script to exclude PRs from https://github.com/apps/otelbot like #46156. I'm not sure if we can effectively catch PRs that are labeled with more than one component label but are not actually requiring code-owners' review. Also regarding the
Here is an updated script output:
Code owner activity report
Period: 2026-01-19 – 2026-02-18
We target at least 75% of each component's PRs to be reviewed by a code owner, and each code owner to respond to at least 75% / n of their requested PRs (n = number of code owners for that component) (at least 3 code owners for components aiming for stable).
Components in scope: processor/filter, processor/k8sattributes, processor/resourcedetection, processor/transform, receiver/filelog, receiver/hostmetrics, receiver/prometheus (and
Component PR review rate (75% target)
PRs (per code owner)
|
One thing that is easy to identify, and does seem to have a big effect on the numbers, is renovatebot PRs that touch multiple components. For example, 13 out of the 56 PRs for the resource detection processor are renovatebot PRs. Maybe 3 of those (#45669, #45507, #45959) can be attributed to the resource detection processor, but the other 10 (about 18% of all PRs in this period) cannot be attributed to the component. We could even just exclude all renovatebot PRs if "exclude renovatebot PRs that touch multiple components" is too hard to codify. Otherwise it feels like this is ready to try out.
|
Thanks @mx-psi! I think we can skip all of renovate's PRs, since maintainers usually merge those without necessarily waiting for code owners to approve. Updated: bad1be6
I'm also suggesting we exclude draft PRs: 1955a17
Sample output:
Code owner activity report
Period: 2026-01-20 – 2026-02-19
We target at least 75% of each component's PRs to be reviewed by a code owner, and each code owner to respond to at least 75% / n of their requested PRs (n = number of code owners for that component) (at least 3 code owners for components aiming for stable).
Components in scope: processor/filter, processor/k8sattributes, processor/resourcedetection, processor/transform, receiver/filelog, receiver/hostmetrics, receiver/prometheus (and
Component PR review rate (75% target)
PRs (per code owner)
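The exclusions discussed in this thread (bot-authored PRs and draft PRs) amount to a simple filter applied before any stats are computed. A minimal sketch follows; it is not the code from the commits above. The bot logins are assumptions (the handles the script actually uses may differ), and `isCountedPr` plus the PR shape (`user.login`, `draft`) are hypothetical.

```javascript
// Assumed bot logins, for illustration only; the real handles may differ.
const EXCLUDED_AUTHORS = new Set(['otelbot[bot]', 'renovate[bot]']);

// Hypothetical PR shape: { number, draft, user: { login } }.
// A PR counts towards the report only if it is not a draft
// and was not opened by one of the excluded bots.
function isCountedPr(pr) {
  if (pr.draft) return false;
  if (EXCLUDED_AUTHORS.has(pr.user.login)) return false;
  return true;
}

const fetched = [
  { number: 101, draft: false, user: { login: 'some-contributor' } },
  { number: 102, draft: false, user: { login: 'renovate[bot]' } },
  { number: 103, draft: true, user: { login: 'some-contributor' } },
  { number: 104, draft: false, user: { login: 'otelbot[bot]' } },
];

const counted = fetched.filter(isCountedPr);
console.log(counted.map((pr) => pr.number)); // [ 101 ]
```

Filtering by author is cruder than "exclude bot PRs that touch multiple components", but it is easy to codify and matches the decision to skip all renovate PRs.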
|
|
@ChrsMark The |
mx-psi left a comment
I think we can merge this but I would like to do some cleanup before. I left a few comments, there's also the issues logic which is unused for now
| // Optional: set LIMIT (e.g. 10) to only process that many PRs and issues (for quick local runs) | ||
| const PROCESS_LIMIT = process.env.LIMIT ? parseInt(process.env.LIMIT, 10) : null; | ||
| const PROGRESS_INTERVAL = 5; |
| ]); | ||
| // Resourcedetection has sub-labels (e.g. processor/resourcedetection/internal/azure); include those too. | ||
| function isAllowedLabel(label) { | ||
| if (FOCUS_COMPONENT_LABELS.size === 0) return true; |
This is never going to happen
| if (FOCUS_COMPONENT_LABELS.size === 0) return true; |
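As a sketch of what the label check could look like once the unreachable empty-set branch is dropped: a label is allowed if it is one of the focus labels or a sub-label of one. This is an assumption-laden illustration, not the merged code. The set contents are taken from the "Components in scope" list in the sample output, and applying the sub-label rule to every focus label (rather than only to resourcedetection) is a simplification made for the example.

```javascript
// Focus labels taken from the "Components in scope" list in the sample output.
const FOCUS_COMPONENT_LABELS = new Set([
  'processor/filter',
  'processor/k8sattributes',
  'processor/resourcedetection',
  'processor/transform',
  'receiver/filelog',
  'receiver/hostmetrics',
  'receiver/prometheus',
]);

// True for an exact focus label or a sub-label such as
// processor/resourcedetection/internal/azure.
function isAllowedLabel(label) {
  for (const focus of FOCUS_COMPONENT_LABELS) {
    // The trailing slash avoids matching unrelated labels that merely
    // share a prefix, e.g. receiver/prometheusremotewrite.
    if (label === focus || label.startsWith(`${focus}/`)) return true;
  }
  return false;
}

console.log(isAllowedLabel('processor/resourcedetection/internal/azure')); // true
console.log(isAllowedLabel('receiver/prometheusremotewrite')); // false
```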
| /** Set to true to include the "Code owners below 5% activity (past 6 months)" section in the report. */ | ||
| const REPORT_LOW_ACTIVITY_6MO = false; |
Let's get rid of this and related code
| /** Set to true to include the "Code owners below 5% activity (past 6 months)" section in the report. */ | |
| const REPORT_LOW_ACTIVITY_6MO = false; |
| async function getPrsInWindow(octokit, since, until) { | ||
| return searchIssuesAndPrs(octokit, 'is:pr', since, until); | ||
| } |
I think we can get rid of this helper, we don't use it consistently. We can either use searchIssuesAndPrs or maybe just the chunked version
| let lowActivityMarkdown = ''; | ||
| if (REPORT_LOW_ACTIVITY_6MO) { | ||
| const prs6Mo = await getPrsInWindowChunked(octokit, lookbackData.sixMonthsAgo, lookbackData.midnightYesterday); | ||
| progress(`Fetched ${prs6Mo.length} PRs (6 months). Computing low-activity stats...`); | ||
| const { byCodeOwnerAndComponent: sixMonthPrStats } = await computePrStats(octokit, prs6Mo, labelToOwners, componentLabels, lookbackData.sixMonthsAgo, lookbackData.midnightYesterday, null); | ||
| lowActivityMarkdown = formatLowActivityCodeOwners(sixMonthPrStats); | ||
| } | ||
| // const issueStats = await computeIssueStats(octokit, issues, labelToOwners, componentLabels, lookbackData.thirtyDaysAgo, lookbackData.midnightYesterday); |
| let lowActivityMarkdown = ''; | |
| if (REPORT_LOW_ACTIVITY_6MO) { | |
| const prs6Mo = await getPrsInWindowChunked(octokit, lookbackData.sixMonthsAgo, lookbackData.midnightYesterday); | |
| progress(`Fetched ${prs6Mo.length} PRs (6 months). Computing low-activity stats...`); | |
| const { byCodeOwnerAndComponent: sixMonthPrStats } = await computePrStats(octokit, prs6Mo, labelToOwners, componentLabels, lookbackData.sixMonthsAgo, lookbackData.midnightYesterday, null); | |
| lowActivityMarkdown = formatLowActivityCodeOwners(sixMonthPrStats); | |
| } | |
| // const issueStats = await computeIssueStats(octokit, issues, labelToOwners, componentLabels, lookbackData.thirtyDaysAgo, lookbackData.midnightYesterday); |
| // const issueStats = await computeIssueStats(octokit, issues, labelToOwners, componentLabels, lookbackData.thirtyDaysAgo, lookbackData.midnightYesterday); | ||
| const issueStats = {}; | ||
|
|
||
| const report = generateReport(prStats, componentPrStats, issueStats, lookbackData, lowActivityMarkdown); |
| const report = generateReport(prStats, componentPrStats, issueStats, lookbackData, lowActivityMarkdown); | |
| const report = generateReport(prStats, componentPrStats, issueStats, lookbackData); |
| ...(lowActivityMarkdown | ||
| ? [ | ||
| ``, | ||
| `### Code owners below ${LOW_ACTIVITY_THRESHOLD_PCT}% activity (past 6 months)`, | ||
| ``, | ||
| lowActivityMarkdown, | ||
| ] | ||
| : []), | ||
| // collapsibleSection('Issues', formatTable(issueStats, 'issues')), |
| ...(lowActivityMarkdown | |
| ? [ | |
| ``, | |
| `### Code owners below ${LOW_ACTIVITY_THRESHOLD_PCT}% activity (past 6 months)`, | |
| ``, | |
| lowActivityMarkdown, | |
| ] | |
| : []), | |
| // collapsibleSection('Issues', formatTable(issueStats, 'issues')), |
| return `${header}\n${sep}\n${body}\n`; | ||
| } | ||
|
|
||
| function generateReport(prStats, componentPrStats, issueStats, lookbackData, lowActivityMarkdown) { |
| function generateReport(prStats, componentPrStats, issueStats, lookbackData, lowActivityMarkdown) { | |
| function generateReport(prStats, componentPrStats, issueStats, lookbackData) { |
I'm not sure if we can safely exclude them. Some or most of them should still require input from code owners, even if they are trivial. In any case, I don't think they can influence the results that much statistically.
ea6fd73 to 16ee9dc
|
Fresh sample run:
Code owner activity report
Period: 2026-01-25 – 2026-02-24
We target at least 75% of each component's PRs to be reviewed by a code owner, and each code owner to respond to at least 75% / n of their requested PRs (n = number of code owners for that component) (at least 3 code owners for components aiming for stable).
Component PR review rate (75% target)
PRs (per code owner)
|
#### Description
#### Link to tracking issue
Fixes
#### Testing
#### Documentation
---------
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
16ee9dc to 3faf04c
I have removed all the unused code and rebased/squashed. |
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
evan-bradley left a comment
Looks good to me at a high level, we can iterate later if needed. Really like that this is written in JS instead of shell, makes it much easier to read.
|
@open-telemetry/collector-contrib-approvers PTAL |
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
|
Updated sample run:
Code owner activity report
Period: 2026-01-31 – 2026-03-02
We target at least 75% of each component's PRs to be reviewed by a code owner, and each code owner to respond to at least 75% / n of their requested PRs (n = number of code owners for that component) (at least 3 code owners for components aiming for stable).
Component PR review rate (75% target)
PRs (per code owner)
|
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
|
|
||
| function genLookbackDates() { | ||
| const now = new Date(); | ||
| const midnightYesterday = new Date( |
Would it make sense to shift the window a day or a couple days into the past? Looking at PRs created yesterday and seeing they haven't been reviewed yet seems a bit aggressive, especially at 2am when this script is scheduled to run. 😉
Makes sense. I will shift the time window by 5 days -> [-35d,-5d]
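For illustration, a shifted window of [-35d, -5d] could be computed along these lines. This is a sketch under assumptions rather than the merged `genLookbackDates`: the name `genLookbackWindow`, its parameters, and the choice to anchor on UTC midnight of the current day are all hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns a [start, end) window: start is inclusive, end is exclusive.
// With the defaults, the window covers 35 days ago up to 5 days ago.
function genLookbackWindow(now = new Date(), offsetDays = 5, lengthDays = 30) {
  // Anchor on UTC midnight so the result does not depend on the runner's
  // local time zone or on the time of day the workflow fires.
  const midnightToday = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  const end = new Date(midnightToday - offsetDays * DAY_MS);
  const start = new Date(end.getTime() - lengthDays * DAY_MS);
  return { start, end };
}

// A PR is in scope if it was created on or after start and before end.
function inWindow(createdAt, { start, end }) {
  const t = new Date(createdAt).getTime();
  return t >= start.getTime() && t < end.getTime();
}

// Example with a fixed "now", as if the job ran at 02:00 UTC.
const exampleWindow = genLookbackWindow(new Date('2026-03-03T02:00:00Z'));
console.log(exampleWindow.start.toISOString()); // 2026-01-27T00:00:00.000Z
console.log(exampleWindow.end.toISOString()); // 2026-02-26T00:00:00.000Z
```

With the 5-day offset, a PR opened the day before the run falls outside the window, so it is not reported as unreviewed before code owners have had a fair chance to look at it.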
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
|
@mx-psi I figured out that I was using the wrong handles for the bots. Fixed them at
Updated sample run:
Code owner activity report
Period: 2026-01-27 – 2026-02-26 (start date inclusive, end date exclusive; PRs are included if created on or after the start date and before the end date).
We target at least 75% of each component's PRs to be reviewed by a code owner, and each code owner to respond to at least 75% / n of their requested PRs (n = number of code owners for that component) (at least 3 code owners for components aiming for stable).
Component PR review rate (75% target)
PRs (per code owner)
|
Co-authored-by: Andrzej Stencel <andrzej.stencel@elastic.co>
|
I see this creates one issue for every run. That may make it a bit tricky to compare data between months. |
Wouldn't that end up being a very long issue? We follow the individual issue approach for the weekly reports: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aissue%20state%3Aopen%20label%3Areport. I think if people want to compare the raw historical data they can just grab the latest X tables and work with them as they wish. |
#### Description
This PR adds a workflow that reports code-owners' activity. The workflow only focuses on components that are listed for stability at open-telemetry#44130 since that's the main priority for now. We can expand the report for all components in contrib but that might be a bit noisy.
Similar work is done in other SIGs, e.g. open-telemetry/opentelemetry-js#5898
#### Link to tracking issue
Related to open-telemetry/opentelemetry-collector#14107
#### Testing
Running locally with:
```console
export DRY_RUN=1
export GITHUB_TOKEN=ghp_foobarzet
node -e "
const { Octokit } = require('@octokit/rest');
const script = require('./.github/workflows/scripts/generate-codeowners-activity.js');
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
script({
  github: { rest: octokit },
  context: { payload: { repository: { owner: { login: 'open-telemetry' } } } }
});
"
```
#### Documentation
~
#### AI Usage disclaimer
**_The script of this PR is crafted mainly by using Cursor_** taking inspiration from the Weekly Report workflow script: [/.github/workflows/scripts/generate-weekly-report.js](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/d0cd8a546f274e3aadbc43cf5cb5d693633365e4/.github/workflows/scripts/generate-weekly-report.js#L4)
---------
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Andrzej Stencel <andrzej.stencel@elastic.co>
Description
This PR adds a workflow that reports code-owners' activity. The workflow only focuses on components that are listed for stability at #44130 since that's the main priority for now. We can expand the report for all components in contrib but that might be a bit noisy.
Similar work is done in other SIGs, e.g. open-telemetry/opentelemetry-js#5898
Link to tracking issue
Related to open-telemetry/opentelemetry-collector#14107
Testing
Running locally with:
Documentation
~
AI Usage disclaimer
The script of this PR is crafted mainly by using Cursor taking inspiration from the Weekly Report workflow script: /.github/workflows/scripts/generate-weekly-report.js