feat: add DAL (CM-951) #3836
Pull request overview
Adds new Data Access Layer (DAL) modules for the project catalog and evaluated projects domain, and re-exports them from the DAL package entrypoint.
Changes:
- Introduces `project-catalog` DAL: types plus CRUD-ish query helpers (find/insert/bulk insert/upsert/update/delete).
- Introduces `evaluated-projects` DAL: types plus query helpers for the evaluation lifecycle and bulk insert.
- Exports both modules via `services/libs/data-access-layer/src/index.ts`.
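For context, a bulk upsert helper in a DAL module like these typically builds a single parameterized `INSERT ... ON CONFLICT` statement. The sketch below is illustrative only: the table name, column names, conflict target, and the `buildBulkUpsert` function are assumptions, not the PR's actual API.

```typescript
// Hypothetical sketch of a DAL bulk-upsert helper. Table/column names and
// the conflict target are illustrative, not taken from this PR.
interface ProjectCatalogRow {
  repoUrl: string
  name: string
  score: number
}

function buildBulkUpsert(rows: ProjectCatalogRow[]): { sql: string; params: unknown[] } {
  const cols = ['repo_url', 'name', 'score']
  const params: unknown[] = []
  const tuples = rows.map((r, i) => {
    params.push(r.repoUrl, r.name, r.score)
    const base = i * cols.length
    // One "($1, $2, $3)"-style tuple per row, with globally increasing indices
    return `($${base + 1}, $${base + 2}, $${base + 3})`
  })
  const sql =
    `INSERT INTO "projectCatalog" (${cols.join(', ')}) VALUES ${tuples.join(', ')} ` +
    `ON CONFLICT (repo_url) DO UPDATE SET name = EXCLUDED.name, score = EXCLUDED.score`
  return { sql, params }
}
```

Building one multi-row statement rather than issuing per-row inserts is what makes the "bulk" helpers worthwhile for large catalogs.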
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| services/libs/data-access-layer/src/project-catalog/types.ts | Defines DB-facing types for project catalog records and create/update payloads. |
| services/libs/data-access-layer/src/project-catalog/projectCatalog.ts | Adds SQL helpers for selecting/inserting/upserting/updating/deleting project catalog rows. |
| services/libs/data-access-layer/src/project-catalog/index.ts | Barrel export for the project-catalog module. |
| services/libs/data-access-layer/src/evaluated-projects/types.ts | Defines DB-facing types for evaluated projects and create/update payloads. |
| services/libs/data-access-layer/src/evaluated-projects/evaluatedProjects.ts | Adds SQL helpers for evaluated project operations (find/insert/bulk insert/update/mark evaluated/onboarded/delete). |
| services/libs/data-access-layer/src/evaluated-projects/index.ts | Barrel export for the evaluated-projects module. |
| services/libs/data-access-layer/src/index.ts | Re-exports the newly added modules from the DAL package entrypoint. |
@ulemons please check the Cursor Bugbot comments - I think some are pretty valid.
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
```typescript
const log = getServiceLogger()

const DEFAULT_API_URL = 'https://hypervascular-nonduplicative-vern.ngrok-free.dev'
```
Ngrok development URL hardcoded as production default
High Severity
DEFAULT_API_URL is set to a temporary ngrok tunnel URL (https://hypervascular-nonduplicative-vern.ngrok-free.dev). If the LF_CRITICALITY_SCORE_API_URL environment variable is not configured, the LF criticality score source will attempt to connect to this ephemeral development endpoint. In production this will either fail outright (tunnel down) or connect to an unintended service.
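One common remedy for this class of issue is to fail fast when the environment variable is missing instead of silently falling back to an ephemeral URL. A minimal sketch, assuming the env var name from the finding (the `resolveApiUrl` function itself is hypothetical, not the PR's code):

```typescript
// Hypothetical guard: require LF_CRITICALITY_SCORE_API_URL to be set rather
// than defaulting to a temporary ngrok tunnel that may be down in production.
function resolveApiUrl(env: Record<string, string | undefined>): string {
  const url = env.LF_CRITICALITY_SCORE_API_URL
  if (!url) {
    // Surfacing a configuration error at startup beats a confusing network
    // failure (or a request to an unintended service) at runtime.
    throw new Error('LF_CRITICALITY_SCORE_API_URL must be configured')
  }
  return url
}
```

A dev-only default can still live in a local `.env` file without ever being compiled into the service.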
```typescript
if (res.statusCode && (res.statusCode < 200 || res.statusCode >= 300)) {
  reject(new Error(`HTTP ${res.statusCode} for ${url}`))
  return
}
```
HTTP responses not consumed on error or redirect
Low Severity
Both httpsGet and getHttpsStream reject or follow redirects without calling res.resume() on the response. The LF source's fetchPage correctly calls res.resume() on error (line 54), but these functions don't. Unconsumed HTTP responses prevent the underlying TCP socket from being released back to the agent's connection pool, which can lead to socket exhaustion during repeated bucket listing or when retries encounter errors.
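The fix Bugbot describes is to drain the response before rejecting. The sketch below shows the pattern with a simplified stand-in for the PR's `httpsGet` helper (it uses `http` rather than `https` so it is easy to exercise locally; the structure is the same):

```typescript
import * as http from 'http'

// Simplified stand-in for the PR's httpsGet helper, showing the pattern:
// call res.resume() to drain unread body data before rejecting, so the
// underlying TCP socket is released back to the agent's connection pool.
function httpGet(url: string): Promise<string> {
  return new Promise((resolve, reject) => {
    http
      .get(url, (res) => {
        if (res.statusCode && (res.statusCode < 200 || res.statusCode >= 300)) {
          res.resume() // drain the response; without this the socket can leak
          reject(new Error(`HTTP ${res.statusCode} for ${url}`))
          return
        }
        let body = ''
        res.setEncoding('utf8')
        res.on('data', (chunk) => (body += chunk))
        res.on('end', () => resolve(body))
      })
      .on('error', reject)
  })
}
```

The same `res.resume()` call belongs on the redirect-following path: the redirect response's body is never read either.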


Note
Medium Risk
Introduces a new Temporal ingestion path that streams large external datasets over HTTP and bulk upserts into Postgres, plus new DAL modules that will be used by other services. Risk is mainly around data correctness/performance and operational reliability (timeouts/retries, external API/bucket behavior).
Overview
Adds an Automatic Projects Discovery Temporal worker that discovers OSS repos from pluggable external sources and bulk upserts them into `projectCatalog` in batches, with new activities (`listSources`, `listDatasets`, `processDataset`) and a workflow mode switch (`incremental` latest-only vs `full`). Introduces a source registry and two initial sources: OSSF Criticality Score (CSV snapshots from a public GCS bucket) and LF Criticality Score (paginated JSON API), including streaming plus parsing/error propagation, and updates the Temporal schedule to run daily at midnight with a 2-hour workflow timeout.

Extends the data-access-layer with new `project-catalog` and `evaluated-projects` modules (CRUD, bulk insert/upsert helpers, and types) and exports them from the DAL index; also adds the `csv-parse` dependency and enables Postgres for the worker.

Written by Cursor Bugbot for commit e9852f1. This will update automatically on new commits.
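The overview says discovered repos are upserted "in batches". The batching itself usually reduces to a small pure helper; this is a generic sketch (the `chunk` name and batch size are assumptions, not the PR's actual code):

```typescript
// Hypothetical batching helper: split a large discovered-repo list into
// fixed-size chunks so each bulk upsert statement stays bounded in size.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error('chunk size must be positive')
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}
```

Bounding batch size keeps parameter counts under Postgres limits and keeps each statement's memory and lock footprint predictable during the daily run.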