Initial#149

Open
NeverTheSame wants to merge 42 commits into sighingnow:master from NeverTheSame:gh-pages
Conversation

@NeverTheSame

No description provided.

@NeverTheSame
Author

Ok

NeverTheSame and others added 28 commits December 15, 2024 11:35
- Add comprehensive AGENTS.md for AI agent guidance
- Enhance README.md with project structure, Python tools setup, and detailed documentation
- Add blog posts about AGENTS.md and Nvidia Blackwell
- Update .gitignore to reflect posts-generator directory structure
Co-authored-by: kirillkuklin <kirillkuklin@gmail.com>
Co-authored-by: kirillkuklin <kirillkuklin@gmail.com>
Live-site audit found 24 broken internal links on the homepage. Root causes:

1. index.md hard-coded absolute paths like /devops/ and {{ post.url }},
   missing the /BeOps baseurl. Replaced with the relative_url filter so
   GH Pages serves them correctly.

2. Category index pages /devops/, /k8s/, /sre/, /ai/ did not exist,
   causing every post's category breadcrumb to 404. Added _pages/*.md
   for each category with permalink set and auto-listed posts.

Verified with a local Jekyll build + Playwright audit: 0 broken links,
0 missing-baseurl links, 0 category integrity issues.
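
The baseurl fix above can be illustrated with a small Python sketch. It only approximates what Jekyll's relative_url filter does (prepend the site baseurl to a site-relative path); the helper names and the simplified rules are assumptions for illustration, not the audit's actual code.

```python
def relative_url(path: str, baseurl: str = "/BeOps") -> str:
    """Rough approximation of Jekyll's relative_url filter (illustrative only)."""
    trimmed = baseurl.strip("/")
    base = "/" + trimmed if trimmed else ""
    return base + "/" + path.lstrip("/")


def missing_baseurl(links: list[str], baseurl: str = "/BeOps") -> list[str]:
    """Return root-relative links that do not start with the baseurl."""
    return [
        link
        for link in links
        if link.startswith("/")
        and not (link == baseurl or link.startswith(baseurl + "/"))
    ]


if __name__ == "__main__":
    print(relative_url("/devops/"))
    print(missing_baseurl(["/devops/", "/BeOps/k8s/", "https://example.com/"]))
```

A hard-coded /devops/ works on a site served from the domain root but 404s once the site is served under /BeOps, which is why the check flags any root-relative link lacking the prefix.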

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds an OSS site audit using Playwright (+ optional lychee) under
tests/site-audit/. Detects:

- broken internal links
- links missing the /BeOps baseurl
- posts that take more than 2 clicks to reach from the homepage
- orphan posts
- category index pages that fail to list their own posts

Runs in GitHub Actions on every push to gh-pages/main and nightly
(.github/workflows/site-audit.yml), failing the build on regressions
and uploading the report as a workflow artifact.
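
The click-depth and orphan checks listed above amount to a breadth-first search over the site's internal link graph. The sketch below assumes the crawl has already produced a page-to-links mapping; the function names, the treatment of "orphan" as "not reachable at all", and the sample data are hypothetical, not taken from tests/site-audit/.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], start: str = "/") -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from start to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit(links: dict[str, list[str]], posts: list[str], max_clicks: int = 2) -> dict:
    """Classify posts as orphans (never reached) or too deep (> max_clicks)."""
    depths = click_depths(links)
    return {
        "orphans": sorted(p for p in posts if p not in depths),
        "too_deep": sorted(p for p in posts if depths.get(p, 0) > max_clicks),
    }


if __name__ == "__main__":
    graph = {"/": ["/devops/"], "/devops/": ["/p1/"], "/p1/": ["/p2/"]}
    print(audit(graph, ["/p1/", "/p2/", "/p3/"]))
```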

AUTHORING.md + POST_TEMPLATE.md document the link-discipline rules
(always use the relative_url filter, never hard-code /devops/) and
the post checklist. AGENTS.md updated with the same rules.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…job-interviews category

Changes requested:

1. Sidebar (_includes/toc-date.html) now groups posts by category. Each
   category from _data/categories.yml is rendered as a top-level entry
   with its posts nested below in date-reverse order. Posts whose
   category doesn't match a known slug fall into an 'Other' bucket so
   nothing is silently dropped. Category index pages are no longer
   double-listed in the utility-pages section (skipped via the new
   is_category: true front matter).

2. Homepage (index.md) is now structured as 'Browse by Category':
   each category shows its description and its posts grouped beneath
   it, instead of a flat 'latest 5 posts' list.

3. 'Back to home' link added at the top of every post and category
   page via _layouts/post.html and _layouts/home.html (suppressed on
   the root homepage itself).

4. New category 'job-interviews' added — _data/categories.yml updated,
   _pages/job-interviews.md created, AGENTS.md + AUTHORING.md +
   POST_TEMPLATE.md + .github/workflows/site-audit.yml all updated to
   include the new slug. Future categories only need to be appended to
   _data/categories.yml (single source of truth for sidebar +
   homepage) plus a matching _pages/<slug>.md.
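
As a rough illustration of the grouping rule in item 1, here is a Python sketch of the same logic. The real implementation is a Liquid include driven by _data/categories.yml; the dictionary fields and function name below are assumptions.

```python
def group_by_category(posts: list[dict], known_slugs: list[str]) -> dict[str, list[dict]]:
    """Group posts under known category slugs; unknown slugs go to 'Other'."""
    groups: dict[str, list[dict]] = {slug: [] for slug in known_slugs}
    groups["Other"] = []
    for post in posts:
        slug = post.get("category")
        key = slug if slug in known_slugs else "Other"
        groups[key].append(post)
    for bucket in groups.values():
        # ISO dates sort lexicographically, so reverse order is newest-first.
        bucket.sort(key=lambda p: p["date"], reverse=True)
    return groups


if __name__ == "__main__":
    sample = [
        {"title": "a", "category": "devops", "date": "2025-01-01"},
        {"title": "b", "category": "devops", "date": "2025-03-01"},
        {"title": "c", "category": "misc", "date": "2025-02-01"},
    ]
    for slug, items in group_by_category(sample, ["devops", "k8s"]).items():
        print(slug, [p["title"] for p in items])
```

Routing unmatched slugs to a catch-all bucket is what keeps a typo in a post's category from silently hiding the post.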

Verified locally with the site-audit framework: 0 broken links, 0
missing-baseurl, 0 unreachable/orphan/category issues.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
NeverTheSame and others added 13 commits May 25, 2026 09:54
- New 'interviews' category for published interview write-ups
- First post: first-person recap of a Staff AI Engineer loop covering
  pipeline, RAG, evaluation, safety/policy, memory, skills/MCP, and
  observability
- Wire the new category into _data/categories.yml, the audit CI env,
  and AGENTS.md so it shows up in the sidebar and the audit framework

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…adings

- Merge the redundant 'interviews' category into existing 'job-interviews'
  (the two were never meaningfully distinct).
- Move the Staff AI Engineer interview post into 'job-interviews'.
- Remove duplicate page-title h1s from category pages and homepage —
  the body.html include already renders one from page.title, so a
  leading '# Title' in the markdown produced two stacked headings.
- Broaden the job-interviews landing copy to cover both real interview
  write-ups and prep notes.
- Update _data/categories.yml, AGENTS.md, and the audit workflow env
  to drop the 'interviews' slug.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a gh-pages workflow that, on push touching _posts/**, regenerates
src/data/posts.ts in NeverTheSame/beops-main-site (newest-first, with
category mapping and word-count-based read time) and pushes the change
to main. Skips Sat/Sun in UTC.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restyles jekyll-gitbook with the beops.site palette (parchment / forest
green / warm brown), JetBrains Mono + Inter Tight, dashed dividers,
flat 2-4px radii, and a dark 'cabin' mode via prefers-color-scheme.
Layout (sidebar + topbar) is left untouched; this is colors + typography
+ component skin only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a free-tier RAG pipeline so the post generator can cite prior work
and stop repeating itself.

Stack (zero recurring cost):
- Neon serverless Postgres + pgvector (scale-to-zero)
- sentence-transformers/all-MiniLM-L6-v2 (384-dim, CPU, MIT)
- Hybrid retrieval: pgvector cosine (exact scan) + Postgres ts_rank_cd FTS
  + Reciprocal Rank Fusion (k=60) — all inside one SQL function
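
For readers unfamiliar with Reciprocal Rank Fusion, the following Python sketch shows the scoring rule with k=60. In this change the fusion runs inside a single SQL function, so this is an illustration of the technique only; the function name and the tie-break by id are assumptions.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]  # e.g. cosine-similarity order
    fts_hits = ["b", "d", "a"]     # e.g. full-text rank order
    print(rrf_fuse([vector_hits, fts_hits]))
```

Because only ranks are used, the cosine and ts_rank_cd scores never need to be normalized onto a common scale.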

Components:
- posts-generator/rag/ package: chunker, embedder, db, indexer, retriever, cli
- posts-generator/tests/: unit (chunker, embedder, RRF) + integration
  (indexer idempotency / incremental / orphan-delete, retriever) +
  regression-blocking eval fixture (5 queries → expected top-K slugs)
- .github/workflows/rag-tests.yml: spins up an ephemeral Neon branch per CI run
  with a 4h expires_at TTL so cancelled jobs can't leak branches past the
  10-branch cap; concurrency group serializes per ref
- .github/workflows/rag-index.yml: incremental indexer on push to gh-pages
  touching _posts/**, guarded by GH Actions concurrency + PG advisory lock

Safety rails:
- rag_meta table pins embedding_model / chunker_version / schema_version;
  indexer aborts with clear error on drift
- Smoke test asserts BEOPS_RAG_ENABLED unset does NOT pull in
  sentence-transformers / torch / psycopg from the generator hot path
- Generator integration in openai_worker_4o.py is opt-in
  (BEOPS_RAG_ENABLED=1), default OFF; existing flows unchanged
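
The smoke-test idea above can be sketched as follows: run a fresh interpreter, load code that imports its heavy dependency lazily, and assert the dependency is absent from sys.modules until retrieval is actually called. The stdlib module colorsys stands in for sentence-transformers / torch / psycopg, and all names here are hypothetical.

```python
import subprocess
import sys

HEAVY = "colorsys"  # stand-in for a heavy dependency; an assumption for the demo

PROBE = f"""
import sys

def retrieve(seed):
    import {HEAVY}  # lazy import: only paid for when retrieval runs
    return seed

assert "{HEAVY}" not in sys.modules  # defining the worker stays cheap
retrieve("x")
assert "{HEAVY}" in sys.modules      # the call site pulled it in
print("ok")
"""


def run_probe() -> str:
    """Run the probe in a clean interpreter so prior imports cannot leak in."""
    result = subprocess.run(
        [sys.executable, "-c", PROBE], capture_output=True, text=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_probe())
```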

gitignore: un-ignore source code under posts-generator/ so CI sees it;
keep configs/ logs/ produced_posts/ and the py-feedparser venv hidden.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When BEOPS_TEST_DATABASE_URL was set (CI path), the fixture took the
'return dsn' branch and pytest reported 'did not yield a value' because
the function had a yield elsewhere — making it a generator that never
yielded on that branch. Switch both paths to yield.
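
The failure mode is easy to reproduce without pytest: any function containing a yield is a generator, so a bare return on one branch ends it before it produces a value. The sketch below uses plain generators and made-up DSNs; the fixture names are hypothetical.

```python
def broken_fixture(env: dict):
    if "BEOPS_TEST_DATABASE_URL" in env:
        # Looks like a normal return, but this function is a generator,
        # so the generator simply finishes without yielding anything.
        return env["BEOPS_TEST_DATABASE_URL"]
    yield "postgresql://localhost/scratch"


def fixed_fixture(env: dict):
    if "BEOPS_TEST_DATABASE_URL" in env:
        yield env["BEOPS_TEST_DATABASE_URL"]
        return
    yield "postgresql://localhost/scratch"


def first_value(gen):
    """Mimic a fixture runner: take the first yielded value, if any."""
    try:
        return next(gen)
    except StopIteration:
        return None  # the case pytest reports as "did not yield a value"


if __name__ == "__main__":
    ci_env = {"BEOPS_TEST_DATABASE_URL": "postgresql://ci/db"}
    print(first_value(broken_fixture(ci_env)))
    print(first_value(fixed_fixture(ci_env)))
```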

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On a fresh DB the meta-version check ran before schema creation and
hit 'relation "rag_meta" does not exist'. init_schema is idempotent
so it's safe to call first; the version check then runs against a
populated rag_meta and aborts cleanly only on real drift.
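
The ordering fix can be sketched with an in-memory SQLite database standing in for Postgres. The table layout, the pinned version values, and the function bodies are assumptions made for the demo; only the names init_schema and rag_meta come from the commit message.

```python
import sqlite3

# Pinned values are illustrative; only the key names mirror the commit text.
EXPECTED = {
    "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
    "chunker_version": "1",
    "schema_version": "1",
}


def init_schema(conn: sqlite3.Connection) -> None:
    """Idempotent: safe to call on a fresh or an existing database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rag_meta (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO rag_meta (key, value) VALUES (?, ?)", EXPECTED.items()
    )
    conn.commit()


def check_versions(conn: sqlite3.Connection) -> None:
    """Abort when stored pins differ from what this code expects."""
    stored = dict(conn.execute("SELECT key, value FROM rag_meta"))
    drift = {k: (stored.get(k), v) for k, v in EXPECTED.items() if stored.get(k) != v}
    if drift:
        raise RuntimeError(f"rag_meta drift (stored, expected): {drift}")


def open_index(conn: sqlite3.Connection) -> None:
    init_schema(conn)     # first, so rag_meta always exists
    check_versions(conn)  # then fail only on real drift


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    open_index(conn)
    open_index(conn)  # second call is a no-op
    print("schema ready")
```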

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
New docs/architecture.md with four mermaid diagrams covering:
- system overview (author -> GH -> Neon -> generator -> OpenAI)
- indexing sequence (chunk, embed, body_hash dedup, orphan delete)
- retrieval sequence (embed query -> search_chunks() -> prompt block)
- CI quality gate (ephemeral Neon branch, eval fixture regression gate)
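
The body_hash dedup and orphan delete steps in the indexing sequence could look roughly like the planner below. SHA-256, the slug-to-hash mapping, and the function names are assumptions; the real indexer works against Postgres rather than in-memory dicts.

```python
import hashlib


def body_hash(body: str) -> str:
    """Content fingerprint used to detect unchanged posts (SHA-256 assumed)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def plan_index(posts: dict[str, str], indexed: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (slugs to re-embed, slugs to delete).

    posts maps slug -> current body; indexed maps slug -> stored body hash.
    """
    to_embed = sorted(
        slug for slug, body in posts.items() if indexed.get(slug) != body_hash(body)
    )
    to_delete = sorted(slug for slug in indexed if slug not in posts)
    return to_embed, to_delete


if __name__ == "__main__":
    current = {"a": "hello", "b": "changed", "new": "fresh post"}
    stored = {"a": body_hash("hello"), "b": body_hash("old"), "gone": body_hash("x")}
    print(plan_index(current, stored))
```

Skipping posts whose hash is unchanged is what makes a re-run idempotent and keeps embedding work proportional to the edit.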

README links to it from the Features section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Make the prior-work RAG layer part of the normal drafting flow rather
than an opt-in extra:

- openai_worker_4o.py: RAG runs whenever NEON_DATABASE_URL (or
  DATABASE_URL / BEOPS_TEST_DATABASE_URL) is in the environment.
  Disable with BEOPS_RAG_DISABLED=1 for one-off runs. Auto-skips
  cleanly (no error) when no DB is configured.
- rag/cli.py: new 'context <seed>' subcommand prints the same prompt
  block the Python worker injects, so any CLI agent (Copilot CLI,
  Claude Code, etc.) can consult prior coverage before drafting.
- AGENTS.md: documents the mandatory pre-drafting step and the
  one-off opt-out.
- smoke test contract strengthened: importing openai_worker_4o never
  pulls sentence-transformers / torch / psycopg regardless of env
  vars (heavy imports stay lazy inside the call site).
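
The enable/disable rules in the first bullet reduce to a small environment lookup. The sketch below is a guess at the shape of that logic: the precedence order among the three variables and the function name are assumptions, not confirmed by the commit.

```python
import os

# Assumed precedence: first variable set wins.
DSN_VARS = ("NEON_DATABASE_URL", "DATABASE_URL", "BEOPS_TEST_DATABASE_URL")


def resolve_rag_dsn(env=None):
    """Return the DSN to use for RAG, or None when RAG should be skipped."""
    env = os.environ if env is None else env
    if env.get("BEOPS_RAG_DISABLED") == "1":
        return None  # explicit one-off opt-out
    for name in DSN_VARS:
        value = env.get(name)
        if value:
            return value
    return None  # no DB configured: skip cleanly, no error


if __name__ == "__main__":
    print(resolve_rag_dsn({"DATABASE_URL": "postgresql://example/db"}))
    print(resolve_rag_dsn({}))
```

Returning None instead of raising is what lets the worker skip retrieval cleanly when no database is configured.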

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Lets an external agent ask 'what have I already written about X?'
without ever seeing NEON_DATABASE_URL. The secret stays inside
Actions; the prompt block returns as a workflow artifact + step
summary.

Usage:
  gh workflow run rag-query.yml -f seed='AI Engineering Coach repo'
  # wait, then:
  gh run download <run-id> -n rag-context

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
RAG-grounded draft (consulted rag-query workflow before drafting).
Cites prior posts: agentsmd-ate-my-readme, staff-ai-engineer-interview,
forward-deployed-engineering-ate-customer-success.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2 participants