chore(metrics): capture top referrers + paths weekly (durable attribution) #182

Merged: Yoojin-nam merged 1 commit into main from feat/metrics-referrers-paths-capture on Jun 22, 2026
@Yoojin-nam

Capture top referrers + paths weekly (durable attribution)

Adds the one piece of adoption telemetry that was being lost: where each traffic wave comes from.

Why

GitHub's Traffic API exposes top referrers and top viewed paths only as a 14-day, admin-only,
top-10 point-in-time view. That's the single most useful signal for "which channel drove this"
(Google / Reddit / chatgpt.com / a directory / arXiv), and the first to vanish. metrics.yml already
logged view/clone counts but not the source, so referrer attribution evaporated every 14 days.
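
For readers unfamiliar with the two endpoints, here is a minimal Python sketch of reading them with a blank fallback. It is not the workflow's code (the PR does this in metrics.yml with jq); the function names `traffic_url` and `fetch_top`, and the injectable `urlopen` parameter, are assumptions made for illustration.

```python
import json
import urllib.error
import urllib.request

API_ROOT = "https://api.github.com"


def traffic_url(owner: str, repo: str, kind: str) -> str:
    """Build the Traffic API URL; kind is 'referrers' or 'paths'."""
    if kind not in ("referrers", "paths"):
        raise ValueError(f"unknown kind: {kind}")
    return f"{API_ROOT}/repos/{owner}/{repo}/traffic/popular/{kind}"


def fetch_top(owner, repo, kind, token, urlopen=urllib.request.urlopen):
    """Return the top-10 list, or [] when there is no token or the call is refused."""
    if not token:
        # No PAT configured: behave like an empty snapshot (hypothetical policy).
        return []
    req = urllib.request.Request(
        traffic_url(owner, repo, kind),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    try:
        with urlopen(req) as resp:
            data = json.load(resp)
    except urllib.error.HTTPError:
        # e.g. a 403 when the token lacks access to traffic data
        return []
    # Anything that is not a list (such as an error object) counts as blank.
    return data if isinstance(data, list) else []
```

Each referrer item carries `referrer`, `count` and `uniques`; each path item carries `path`, `title`, `count` and `uniques`, which is what the two CSV layouts in this PR mirror.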

What this adds

  • metrics/referrers_log.csv: date, referrer, count, uniques
  • metrics/paths_log.csv: date, path, title, count, uniques

Both are long format, so a channel's trend pivots cleanly over weeks (e.g. is AEO traffic from
chatgpt.com / perplexity.ai compounding?). The weekly metrics.yml run now also appends to them.
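
To illustrate why long format pivots cleanly, here is a small Python sketch that turns the rows into a per-channel series keyed by date. The helper name `pivot_counts` is hypothetical, and the header row is assumed to match the column list above; the two sample rows reuse figures from the 2026-06-22 snapshot.

```python
import csv
import io
from collections import defaultdict


def pivot_counts(csv_text: str, key: str = "referrer") -> dict:
    """Map each key value (e.g. a referrer) to {date: count} from long-format rows."""
    table = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[row[key]][row["date"]] = int(row["count"])
    return dict(table)


sample = (
    '"date","referrer","count","uniques"\n'
    '"2026-06-22","Google","337","172"\n'
    '"2026-06-22","chatgpt.com","45","27"\n'
)
print(pivot_counts(sample))
# {'Google': {'2026-06-22': 337}, 'chatgpt.com': {'2026-06-22': 45}}
```

Each later weekly run adds one more date under every channel it reports, so a channel's trend is just the values of its inner dict in date order.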

  • Blank-tolerant: a 403 / no-PAT run resolves to [] → zero rows, header still written; the existing
    header is never duplicated. @csv quotes every field (comma-safe). No commit-loop (the workflow runs
    on schedule/dispatch only, with [skip ci]).
  • Seeded with the 2026-06-22 snapshot so the current wave is preserved immediately rather than
    waiting on the next cron — e.g. referrers Google 337/172, github.com 254/124, reddit.com 77/34,
    chatgpt.com 45/27, threads/linkedin/bing/perplexity/claude/facebook.
  • IMPACT.md updated to document the two new logs.
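
The header guard and blank tolerance described in the list above can be sketched in Python as follows. This is a stand-in for the workflow's shell and jq steps, not a copy of them; `append_rows` is a hypothetical helper, and `csv.QUOTE_ALL` is chosen here to match the "quotes every field" description.

```python
import csv
import os


def append_rows(path, header, rows, date):
    """Append one dated row per item; write the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, quoting=csv.QUOTE_ALL)
        if needs_header:
            writer.writerow(header)
        # An empty or missing payload writes zero data rows but still leaves a valid file.
        for item in rows or []:
            writer.writerow([date] + [item.get(col, "") for col in header[1:]])
```

Calling it first with `[]` and then with a real snapshot yields a file with exactly one header line followed by the data rows, which is the property the commit step relies on.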

Verification

Capture logic tested against the live Traffic API (exact jq | @csv output); CSVs parse cleanly
(uniform columns). Adversarially reviewed: header guards always create both files before the commit
step (incl. the no-PAT path), jq emits zero rows on []/{}, no commit-loop, no PII (domains + repo
paths only). gen_distribution_manifest --check unaffected (metrics/ is not part of the install payload).

No skill / detector / version change.

…tion)

GitHub's Traffic API exposes top referrers and top viewed paths only as a 14-day,
admin-only, top-10 point-in-time view — the single most useful signal for "which
channel drove this wave" (Google / Reddit / chatgpt.com / a directory), and the
first to vanish. metrics.yml now also appends them to two long-format CSVs so a
channel's trend survives across weeks:

- metrics/referrers_log.csv — date, referrer, count, uniques
- metrics/paths_log.csv     — date, path, title, count, uniques

Blank-tolerant (default [] on a 403/no-PAT run -> zero rows, header still written;
existing header never duplicated); @csv quotes fields safely; the weekly commit step
adds both new logs. Seeded with the 2026-06-22 snapshot so the current wave is
preserved immediately rather than waiting on the next cron. IMPACT.md updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Yoojin-nam Yoojin-nam merged commit f8fe966 into main Jun 22, 2026
3 checks passed
@Yoojin-nam Yoojin-nam deleted the feat/metrics-referrers-paths-capture branch June 22, 2026 13:43