GitHub - omkarjaliparthi/urja-critic-case-study: Public case study for Urja Critic — cross-provider taste auditor by Insights by Omkar. Architecture, rubric design, public-domain corpus policy, build-vs-buy. No proprietary code.

Public case study for the Urja Critic — a cross-provider taste auditor inside the Urja Visual API. The LLM that grades is, by policy, never the same provider as the LLM that generated.

Part of a 4-product solo studio by Omkar Jaliparthi. Kriya (109+ endpoint commercial astronomy API · npm · PyPI · Go) · Urja Critic (you are here) · Netra (BFF-over-Kriya workbench · live) · Insights by Omkar (consumer SaaS · live).

TL;DR

The Urja Critic is the evaluator inside Urja, the Insights by Omkar Visual API. It scores generated artifacts — illustrations, glyphs, character renders, scene compositions — against weighted rubrics anchored to a public-domain (pre-1929) reference corpus. Its one defining design decision: the evaluator and the generator are different LLM providers, by design. The provider mapping is a one-line policy — judge.provider !== generator.provider. Same-family bias becomes an architectural impossibility, not a runtime concern.
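
One way to picture the `judge.provider !== generator.provider` policy at the type level is the minimal sketch below. It is not the production code: the provider names, `JudgeFor`, `pair`, and `assertCrossProvider` are all hypothetical, chosen only to show how a generic parameter can tie the judge's type to the generator's.

```typescript
// Hypothetical provider identifiers; the real adapter list is not public.
type Provider = "provider-a" | "provider-b" | "provider-c";

// A judge may be any provider except the one that generated the artifact.
type JudgeFor<G extends Provider> = Exclude<Provider, G>;

interface CritiquePair<G extends Provider> {
  generator: { provider: G };
  judge: { provider: JudgeFor<G> };
}

// Compile-time guard: G is inferred from the first argument, and the second
// argument must then belong to the remaining providers.
function pair<G extends Provider>(generator: G, judge: JudgeFor<G>): CritiquePair<G> {
  return { generator: { provider: generator }, judge: { provider: judge } };
}

// Runtime guard for values that arrive untyped (for example, parsed JSON).
function assertCrossProvider(generator: string, judge: string): void {
  if (judge === generator) {
    throw new Error(`judge.provider must differ from generator.provider (${generator})`);
  }
}

const ok = pair("provider-a", "provider-b"); // compiles
// pair("provider-a", "provider-a");          // rejected by the type checker
```

The type-level check only covers values the compiler can see, so a sketch like this would still pair it with a runtime guard at the API boundary.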

This is rare in 2026. Most LLM-as-judge pipelines run a single provider on both sides, inheriting documented self-preference bias. The mainstream eval harnesses (LangSmith, Braintrust, OpenAI Evals, Anthropic Evals) treat cross-provider grading as opt-in glue, not a structural guarantee. The Critic encodes the guarantee into the type system, ships it behind a metered API, and integrates into CI as a pass/fail gate. It is a policy layer on top of any harness — not a competitor to one.

At a glance (last verified 2026-05-04 · current: v0.8, Spring 2026 launch)


Product	Cross-provider taste auditor · `POST /api/v1/critic/critique` · live at `urja.insightsbyomkar.com`
Coverage	4 rubric tracks (`character-figure`, `character-part`, `illustrate-glyph`, `ambient-effect`) · 2 reference sets at v1.0 (`great-dane`, `canine-animated`) · adding 1 rubric / 1 reference set per release wave
Architecture	Cross-provider by design · weighted-rubric · 0–10 anchored scales at 2 / 5 / 8 / 10 · severity-tagged failings with `proposedTargetValue` for auto-refine
Reference corpus	Public-domain pre-1929 line · text + landmark targets + image plates · explicit exclusion list of in-copyright modern authors
Integration	GitHub Actions (`lucky-critique.yml`) · PR-comment score table · pass/fail gate vs threshold · nightly regression sweep
Stack	Next.js 16 · App Router · TypeScript strict · scoped JWT auth · Upstash rate-limit · provider adapters behind a single interface
Status	v0.8 · Spring 2026 launch promo live · beta stability on the API surface · `urja-client` v0.2.0 published to npm
Parent business	Omkar's Holistic Services LLC (DBA Insights by Omkar) · formed May 2023
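
To make the weighted-rubric and pass/fail-gate rows above concrete, here is a rough sketch of how per-axis scores on a 0–10 scale might roll up into one number and be compared against a threshold. Everything except the `proposedTargetValue` field name is an assumption: the axis names, weights, severity cut-offs, and the rule that the proposed target is the next anchor above the current score are illustrative, not the production rubric.

```typescript
// Anchors named in the table above; how they are used here is an assumption.
const ANCHORS = [2, 5, 8, 10] as const;

type Severity = "minor" | "major" | "blocker";

interface AxisScore {
  axis: string;
  weight: number; // relative weight; weights need not sum to 1
  score: number;  // 0-10
}

interface Failing {
  axis: string;
  severity: Severity;
  proposedTargetValue: number; // score an auto-refine pass would aim for
}

// Weighted mean of per-axis scores, normalised by total weight.
function weightedScore(axes: AxisScore[]): number {
  const totalWeight = axes.reduce((sum, a) => sum + a.weight, 0);
  if (totalWeight <= 0) throw new Error("rubric needs a positive total weight");
  return axes.reduce((sum, a) => sum + a.weight * a.score, 0) / totalWeight;
}

// Gate: pass when the aggregate meets or exceeds the threshold.
function passesGate(axes: AxisScore[], threshold: number): boolean {
  return weightedScore(axes) >= threshold;
}

// Smallest anchor strictly above the score, capped at 10.
function nextAnchor(score: number): number {
  for (const anchor of ANCHORS) {
    if (anchor > score) return anchor;
  }
  return 10;
}

// Hypothetical severity rule: below 2 blocks, below 5 is major, otherwise minor.
function failingsBelow(axes: AxisScore[], floor: number): Failing[] {
  return axes
    .filter((a) => a.score < floor)
    .map((a): Failing => ({
      axis: a.axis,
      severity: a.score < 2 ? "blocker" : a.score < 5 ? "major" : "minor",
      proposedTargetValue: nextAnchor(a.score),
    }));
}

// Illustrative weights only: (2*8 + 1*5 + 1*8) / 4 = 7.25
const example: AxisScore[] = [
  { axis: "silhouette", weight: 2, score: 8 },
  { axis: "line-quality", weight: 1, score: 5 },
  { axis: "palette", weight: 1, score: 8 },
];
```

With these made-up numbers, `passesGate(example, 7)` passes and `passesGate(example, 8)` fails, while `failingsBelow(example, 8)` flags only `line-quality` with a proposed target of 8.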

📑 Read the case study

Problem & users — why cross-provider eval, who needs it, what the incumbents miss
Architecture — generator → output → opposite-provider judge → rubric runner → score envelope → CI
Decision · cross-provider by design — the headline decision · why independence is structural, not optional
Decision · rubric design — weighted axes, anchored scales, calibration, parser contract
Decision · public-domain reference corpus — pre-1929 line · the copyright trap modern evaluators inherit
Build vs buy — vs LangSmith, Braintrust, OpenAI Evals, Anthropic Evals, naive LLM-as-judge, human-only review
CI integration — PR-comment scoring, pass/fail gates, the developer experience
Outcomes & lessons — what worked, what didn't, what's next
Worked example — end-to-end critique of a generated illustration · per-axis scores, evidence, cross-provider tag
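
The architecture flow listed above (generator → output → opposite-provider judge → rubric runner → score envelope) can be sketched roughly as follows. This is an assumption-laden illustration, not the private implementation: `JudgeAdapter`, `ScoreEnvelope`, `pickJudge`, `critique`, and `stubJudge` are hypothetical names, and the judges are synchronous stubs where real adapters would call a model API.

```typescript
// Illustrative names throughout; the production adapter interface is private.
interface JudgeAdapter {
  provider: string;
  // Returns one 0-10 score per rubric axis for the given artifact.
  score(artifact: string, axes: string[]): Record<string, number>;
}

interface ScoreEnvelope {
  generatorProvider: string;
  judgeProvider: string;
  scores: Record<string, number>;
}

// Select the first adapter whose provider differs from the generator's.
function pickJudge(generatorProvider: string, adapters: JudgeAdapter[]): JudgeAdapter {
  for (const adapter of adapters) {
    if (adapter.provider !== generatorProvider) return adapter;
  }
  throw new Error(`no judge available outside provider "${generatorProvider}"`);
}

// Route the artifact to an opposite-provider judge and wrap the result.
function critique(
  generatorProvider: string,
  artifact: string,
  axes: string[],
  adapters: JudgeAdapter[],
): ScoreEnvelope {
  const judge = pickJudge(generatorProvider, adapters);
  return {
    generatorProvider,
    judgeProvider: judge.provider,
    scores: judge.score(artifact, axes),
  };
}

// Stub judge that returns a fixed score for every axis, for demonstration only.
function stubJudge(provider: string, fixed: number): JudgeAdapter {
  return {
    provider,
    score: (_artifact, axes) => {
      const out: Record<string, number> = {};
      for (const axis of axes) out[axis] = fixed;
      return out;
    },
  };
}

const envelope = critique("provider-a", "<svg/>", ["silhouette"], [
  stubJudge("provider-a", 9),
  stubJudge("provider-b", 6),
]);
```

In this sketch the same-provider stub is skipped even though it is listed first, so the envelope records `provider-b` as the judge. If no other provider is configured, the call fails rather than falling back to a same-provider judge.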

Related artifacts

Live API · Urja · Pricing · Critic track docs
urja-client · npm — type-safe URL builders + React ThemeProvider for Urja (v0.2.0)
Kriya case study — sibling commercial astronomy API · 109+ endpoints from first principles
Parent SaaS case study — the consumer product that dogfoods both Urja and Kriya
Insights by Omkar — the consumer SaaS · Urja's first paying customer

Skills this project evidences

Product	AI eval product design · Rubric-as-product surface · Anchored-scale calibration · CI as developer experience · Policy layer vs harness positioning
Program	Cross-provider eval governance · Reference-set versioning · Stability tiers (experimental → beta → stable) · Release-gating discipline (n=2) · Provenance-clean training data
Engineering	Provider-adapter interface design · Cache key on full (rubric × ref × provider × output) tuple · Type-system enforcement of provider non-identity · Structured-output parser with severity tags · Stateless scoped JWT + rate-limit edges
Business	Build vs buy on eval orchestration · Copyright-clean corpus as moat · Cost discipline at eval time (sample vs gate) · Pricing on metered `:use` scope · Defensibility in front of customer legal

What's not in this repo

This is a write-up. Not a release.

No proprietary code. No source from lib/critic/. No route handlers. No prompt templates.
No real rubric weights. The illustrative weights in this case study are representative, not the ones running in production.
No engine internals. The cache-key composition, the parser's failure-recovery rules, and the provider-adapter implementations stay private.
No customer list. Use cases discussed here are hypothetical, not commitments.
No specific provider versions used internally beyond what's already on the public pricing page.

The architecture, the policy, the rubric grammar, and the build-vs-buy reasoning are public. The implementation is not. That separation is deliberate — what's defensible about the Critic is the policy, and the policy is what this case study is for.

Hiring Senior PM · AI Platforms · Founding Platform PM (Series B+)?
admin@insightsbyomkar.com · LinkedIn · GitHub
