Skip to content

fix(etl): flatten org aggregate report before ingest#8

Open
ndriyo wants to merge 2 commits into
amgdy:mainfrom
ndriyo:fix/org-aggregate
Open

fix(etl): flatten org aggregate report before ingest#8
ndriyo wants to merge 2 commits into
amgdy:mainfrom
ndriyo:fix/org-aggregate

Conversation

@ndriyo

@ndriyo ndriyo commented Jun 24, 2026

Copy link
Copy Markdown

Summary

  • fetchOrgData() was parsing the org-aggregate NDJSON directly as CopilotAggregateRecord, skipping the day_totals[] envelope that GitHub's Copilot Usage Metrics API actually returns for organization-scoped aggregate reports.

  • The result was that record.day was undefined for every org-aggregate line. Drizzle then emitted DEFAULT for the day column in the fact_org_aggregate_daily insert, which violates the column's NOT NULL constraint and fails the entire ingest with:

    null value in column "day" of relation "fact_org_aggregate_daily"
    violates not-null constraint
    
  • The enterprise path (fetchEnterpriseAggregate()) already handles this correctly by routing NDJSON through flattenAggregateReport() before merging. This PR aligns the org path with the same helper. _orgLogin / _scope tagging that the old loop did manually is performed inside flattenAggregateReport, so it is preserved.

Test plan

  • Run an end-to-end sync against an enterprise with at least one org. Confirm fact_org_aggregate_daily populates with one row per (day, org, scope) and the PR-metrics dashboards (/pull-requests, /metrics) light up.
  • Existing tests in app/src/lib/github/__tests__/aggregate-teams.test.ts should still pass — the org-aggregate flow now uses the same flatten helper the test suite already exercises for the enterprise path.

🤖 Generated with Claude Code

The org aggregate path in fetchOrgData() cast each NDJSON line directly to
CopilotAggregateRecord, skipping the day_totals envelope GitHub returns.
That left record.day undefined, so the insert into fact_org_aggregate_daily
emitted DEFAULT for the day column and failed the NOT NULL constraint:

  null value in column "day" of relation "fact_org_aggregate_daily"
  violates not-null constraint

Route the NDJSON lines through flattenAggregateReport() exactly like
fetchEnterpriseAggregate() already does. The helper expands day_totals[]
into one record per day and tags each with _orgLogin / _scope.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes org-scoped aggregate report ingestion by correctly flattening the Copilot Usage Metrics API’s org aggregate NDJSON “day_totals[]” envelope into per-day CopilotAggregateRecords before ETL ingest, aligning behavior with the already-correct enterprise aggregate path.

Changes:

  • Parse org aggregate NDJSON as AggregateReportLine (envelope form) instead of CopilotAggregateRecord.
  • Flatten org aggregate lines via flattenAggregateReport(..., "organization", orgLogin) to produce per-day records with _scope / _orgLogin tagging.

Comment on lines 717 to 722
const fileResponse = await fetch(link);
apiRequestCount++;
if (fileResponse.ok) {
const content = await fileResponse.text();
const parsed = parseNdjson<CopilotAggregateRecord>(content);
for (const r of parsed) {
r._orgLogin = opts.orgLogin;
r._scope = "organization";
}
aggregateRecords.push(...parsed);
lines.push(...parseNdjson<AggregateReportLine>(content));
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Mirrored the enterprise pattern — now logs a console.warn with the failed download's status/statusText (including the org login for context) and continues to the next link instead of silently skipping.

…skipping

Mirrors the enterprise aggregate path so partial ingests surface a diagnostic
signal. Addresses PR amgdy#8 review feedback.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants