Commit e3dfa5f

docs: rewrite CLAUDE.md for v2.x architecture post-upstream-sync
Reflect the new reality after merging upstream v2.4.1:
- Browser build is gone; TypeScript build produces dist/.
- ESLint replaced with oxlint + prettier; husky/lint-staged pre-commit.
- New auth: workload identity (AWS/Azure/GCP), PAT, OAuth authorization code/client credentials, auth coordinator, SPCS tokens.
- New lib/minicore (NAPI Rust) with prebuilt binaries.
- New lib/agent/crl_validator (CRL revocation).
- New lib/telemetry, lib/disk_cache, lib/proxy_util.
- Wiremock test harness for integration tests.
- Node engine raised to >=18.

Also document the fork-maintenance rules so the next upstream sync is less surprising:
- The full peerDep set we maintain, and why each package is in devDependencies as well (build/test needs types).
- The lazy-require discipline cloud-SDK callsites must follow so the optional peers can actually stay uninstalled at consumer install time.
- The single divergence point (`declare module '@naturalcycles/snowflake-sdk'` in index.d.ts).
1 parent f1ae799 commit e3dfa5f

1 file changed

Lines changed: 69 additions & 42 deletions

CLAUDE.md

@@ -1,73 +1,100 @@
11
## About this fork
22

3-
This is a fork of [`snowflake-connector-nodejs`](https://github.com/snowflakedb/snowflake-connector-nodejs) published as **`@naturalcycles/snowflake-sdk`**.
4-
The motivating change vs. upstream is that the heavy cloud-storage SDKs
5-
(`@aws-sdk/client-s3`, `@azure/storage-blob`, `@google-cloud/storage`, `asn1.js`) are declared as **optional `peerDependencies`** rather than hard dependencies — consumers who don't use a particular cloud's stage features don't have to install its SDK.
6-
See `package.json` `peerDependencies` / `peerDependenciesMeta`.
3+
This is a fork of [`snowflake-connector-nodejs`](https://github.com/snowflakedb/snowflake-connector-nodejs) published as **`@naturalcycles/snowflake-sdk`**.
4+
5+
The motivating change vs. upstream is that heavy cloud SDKs are declared as **optional `peerDependencies`** rather than hard `dependencies`. Consumers who don't use a particular cloud's stage or workload-identity features don't have to install its SDK. The current peer-dep set:
6+
7+
- AWS: `@aws-sdk/client-s3`, `@aws-sdk/client-sts`, `@aws-sdk/credential-provider-node`, `@aws-sdk/ec2-metadata-service`, `@aws-crypto/sha256-js`, `@smithy/node-http-handler`, `@smithy/protocol-http`, `@smithy/signature-v4`
8+
- Azure: `@azure/storage-blob`, `@azure/identity`
9+
- GCP: `google-auth-library`
10+
- Plus `asn1.js` (inherited from upstream's own optional peer)
11+
12+
These same packages are **also listed in `devDependencies`** so local dev/CI installs them and the TypeScript build (`npm run prepack`) can resolve their types. They are only optional for downstream consumers.
713

814
Two long-lived branches:
9-
- `master` tracks upstream Snowflake releases.
15+
- `master` tracks upstream Snowflake releases. Sync via the `upstream` remote (`https://github.com/snowflakedb/snowflake-connector-nodejs.git`). Merge `upstream/master` → local `master`, push, then merge `master` → `next`.
1016
- `next` is the active fork branch and the default target for PRs in this repo.
1117

12-
The TypeScript declaration in `index.d.ts` uses `declare module '@naturalcycles/snowflake-sdk'` — if you ever need to re-sync with upstream, that line is the canonical place where the module name diverges.
18+
Re-sync risks to watch for when merging upstream into `next`:
19+
- Upstream's `package.json` keeps adding hard cloud-SDK deps over time. Each sync must move new ones into `peerDependencies` (+ optional `peerDependenciesMeta` + duplicate in `devDependencies`).
20+
- Upstream writes new `.ts` files (e.g. `lib/authentication/auth_workload_identity/*.ts`, `lib/telemetry/platform_detection.ts`) with **static `import`** from cloud SDKs. The static imports compile cleanly because we have the SDKs in `devDependencies`, but at runtime a consumer who hasn't installed them gets a `MODULE_NOT_FOUND` error the first time one of those modules is `require`d. Any new code path that touches a peer SDK must be reachable *only* through a callsite that itself fails gracefully when the SDK is missing (e.g. workload-identity attestation is only called when that auth mode is selected).
21+
- The TS declaration in `index.d.ts` uses `declare module '@naturalcycles/snowflake-sdk'` — single divergence point for the module name. The file is copied verbatim into `dist/index.d.ts` by `ci/build_typescript.js`.
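The first re-sync rule above is a mechanical rewrite of `package.json`. A minimal sketch of that rewrite, assuming a hypothetical helper named `demoteToOptionalPeers` (it does not exist in the repo) and a caller-supplied list of package names; the version ranges in the example are placeholders:

```typescript
// Hypothetical helper, not part of the repo: moves the named packages out of
// `dependencies` into optional `peerDependencies`, and mirrors them in
// `devDependencies` so local builds and tests still install them.
type DepMap = Record<string, string>;

interface Manifest {
  dependencies?: DepMap;
  devDependencies?: DepMap;
  peerDependencies?: DepMap;
  peerDependenciesMeta?: Record<string, { optional?: boolean }>;
}

function demoteToOptionalPeers(manifest: Manifest, names: string[]): Manifest {
  // Work on copies so the input manifest is left untouched.
  const dependencies: DepMap = { ...manifest.dependencies };
  const devDependencies: DepMap = { ...manifest.devDependencies };
  const peerDependencies: DepMap = { ...manifest.peerDependencies };
  const peerDependenciesMeta = { ...manifest.peerDependenciesMeta };

  for (const name of names) {
    const range = dependencies[name];
    if (range === undefined) continue; // not a hard dependency, nothing to move
    delete dependencies[name];
    peerDependencies[name] = range;
    peerDependenciesMeta[name] = { optional: true };
    devDependencies[name] = range;
  }

  return { ...manifest, dependencies, devDependencies, peerDependencies, peerDependenciesMeta };
}

// Placeholder version ranges, not the repo's real pins.
const synced = demoteToOptionalPeers(
  { dependencies: { '@azure/identity': '^0.0.0', axios: '^0.0.0' } },
  ['@azure/identity'],
);
```

Here `synced` keeps `axios` as a hard dependency, while `@azure/identity` appears in `peerDependencies`, `peerDependenciesMeta` (as optional), and `devDependencies`.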
1322

1423
## Commands
1524

16-
Tests are run with mocha and a 180s timeout. Unit tests do not require Snowflake credentials; integration and system tests do (see README §Test for the env var list).
25+
Build (TypeScript → `dist/`):
1726

1827
```bash
19-
npm test # unit tests (same as test:unit)
20-
npm run test:unit # unit tests
21-
npm run test:integration # integration tests — needs SNOWFLAKE_TEST_* env vars
22-
npm run test:system # system tests
23-
npm run test:manual # interactive auth flows — needs RUN_MANUAL_TESTS_ONLY=true
24-
npm run test:ci # unit + integration (CI matrix)
25-
npm run test:ci:coverage # CI tests with nyc coverage
26-
27-
# Run a single test file (or a specific describe via -g)
28+
npm run prepack # tsc + copy index.d.ts + copy minicore binaries
29+
npm run check-ts # prepack then `tsc --noEmit dist/index.d.ts`
30+
```
31+
32+
Test (mocha, 180s timeout, runs both `.js` and `.ts` via `ts-node/register` from `.mocharc.js`):
33+
34+
```bash
35+
npm test # unit tests
36+
npm run test:unit # same as `npm test`
37+
npm run test:integration # integration — needs SNOWFLAKE_TEST_* env vars
38+
npm run test:authentication # auth flow tests
39+
npm run test:system # system tests
40+
npm run test:manual # interactive auth — needs RUN_MANUAL_TESTS_ONLY=true
41+
npm run test:ci # unit + integration combined
42+
npm run test:ci:coverage # CI tests under nyc
43+
44+
# Single test file (or filter via mocha's -g):
2845
npm run test:single -- test/unit/snowflake_test.js
2946
npm run test:single -- test/unit/snowflake_test.js -g 'pattern'
3047
```
3148

32-
Some integration tests expect a local hang/proxy webserver: `python3 ci/container/hang_webserver.py 12345 &`.
49+
A subset of integration tests requires `python3 ci/container/hang_webserver.py 12345 &` to be running, plus an active wiremock server (`npm run serve-wiremock` on port 8081) for the `test/integration/wiremock/*` cases.
3350

34-
Lint (ESLint + `check-dts` for the `.d.ts`):
51+
Lint / format (oxlint replaces ESLint; prettier handles formatting):
3552

3653
```bash
37-
npm run lint:check # eslint default (lib/) + check-dts index.d.ts
38-
npm run lint:check:all # lint lib + samples + system_test + test
39-
npm run lint:fix -- <path> # autofix a file/dir
54+
npm run lint:check # oxlint .
55+
npm run lint:fix # oxlint --fix .
56+
npm run prettier:check # prettier --check .
57+
npm run prettier:format # prettier -w .
4058
```
4159

42-
Pre-commit runs `snowflakedb/casec_precommit` (secret scanner) via `.pre-commit-config.yaml`.
60+
`lint-staged` runs `prettier:format` on all staged files and `oxlint --max-warnings=0` on `.js`/`.ts` via the `husky` pre-commit hook (`.husky/pre-commit`). The separate `snowflakedb/casec_precommit` secret-scanner pre-commit (`.pre-commit-config.yaml`) is opt-in via `pre-commit install`.
4361

4462
## Architecture
4563

46-
Entry points:
47-
- `index.js` → `lib/snowflake.js` (Node) — calls `core()` with `NodeHttpClient` and the Node logger.
48-
- `lib/browser.js` is the browser entry, wired with `lib/http/browser.js` and `lib/logger/browser.js`.
64+
**Entry points and build:**
65+
66+
- `lib/snowflake.ts` is the source entry. It calls `core()` (`lib/core.js`) with `NodeHttpClient` and the Node logger.
67+
- Root `index.js` re-exports `./lib/snowflake` (resolved by `ts-node` during dev/test).
68+
- The published package's `main` is `./dist/index.js`, generated by `ci/build_typescript.js` (clears `dist/`, runs `tsc`, copies `index.d.ts` and the minicore binaries). The browser build was removed in v2.x.
69+
- `tsconfig.json` has `allowJs: true` and `module: node16`, so `.ts` and `.js` files in `lib/` and `test/` are compiled together. `paths` maps `asn1.js` to a local type stub in `lib/types/asn1.js.d.ts` (asn1.js ships no types).
70+
71+
**`lib/core.js`** is the **factory** that returns the public API (`createConnection`, `createPool`, `configure`, type constants, error codes). It takes pluggable `httpClientClass` and `loggerClass` — historically used to provide a browser variant, now only Node, but the indirection remains.
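The factory indirection described above can be pictured roughly as follows. This is a simplified sketch, not the real `lib/core.js`: the option names `httpClientClass` and `loggerClass` come from the text, while the interfaces, the `createCore` name, and the shape of the returned object are assumptions made for illustration.

```typescript
// Simplified illustration of a factory that receives pluggable classes.
interface HttpClient {
  request(url: string): string;
}

interface Logger {
  info(message: string): void;
}

interface CoreOptions {
  httpClientClass: new () => HttpClient;
  loggerClass: new () => Logger;
}

// Hypothetical stand-in for `core()`: builds the public API from injected classes.
function createCore(options: CoreOptions) {
  const httpClient = new options.httpClientClass();
  const logger = new options.loggerClass();
  return {
    createConnection(account: string) {
      logger.info(`connecting to ${account}`);
      return {
        account,
        ping: () => httpClient.request(`https://${account}.example.invalid/ping`),
      };
    },
  };
}

// Stand-ins for the Node implementations, so the sketch runs offline.
class FakeHttpClient implements HttpClient {
  request(url: string): string {
    return `GET ${url}`;
  }
}

const logLines: string[] = [];
class MemoryLogger implements Logger {
  info(message: string): void {
    logLines.push(message);
  }
}

const api = createCore({ httpClientClass: FakeHttpClient, loggerClass: MemoryLogger });
const connection = api.createConnection('demo');
```

Because the platform-specific pieces arrive as constructor arguments, the factory body itself never has to know which platform it runs on.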
4972

50-
`lib/core.js` is a **factory**: it takes `{ httpClientClass, loggerClass, client, … }` and returns the public API (`createConnection`, `createPool`, `configure`, `STRING/NUMBER/…` type constants, error codes). The same `core()` is used for both node and browser builds — that's why platform differences live in pluggable classes, not in `core.js`.
73+
**Layered structure under `lib/`:**
5174

52-
Layered structure under `lib/`:
75+
- **`connection/`** — `Connection`, `ConnectionConfig`, `ConnectionContext`, `Statement`, bind uploading, result handling. A connection owns a `ConnectionContext` carrying config, HttpClient, and services. `normalize_connection_options.ts` and `types.ts` are the v2.x typed entry into option handling.
76+
- **`services/`** — `sf.js` is the Snowflake session service (login, token refresh, query submission state machine). `large_result_set.js` downloads chunked S3/GCS result files.
77+
- **`authentication/`** — one module per auth type. Legacy (`.js`): `auth_default` (password), `auth_idtoken`, `auth_keypair` (JWT), `auth_oauth`, `auth_oauth_authorization_code`, `auth_oauth_pat`, `auth_okta`, `auth_web` (browser SSO). v2.x additions (`.ts`): `auth_oauth_client_credentials`, `auth_coordinator` (orchestrates token caching across pooled connections), `spcs_token` (Snowpark Container Services), and the `auth_workload_identity/` subtree (AWS / Azure / GCP attestation). `authentication.js` is the dispatcher; `secure_storage/json_credential_manager.js` is the default disk-backed token cache.
78+
- **`file_transfer_agent/`** — `PUT` / `GET` stage upload-download. `s3_util.js` (S3, via `@aws-sdk/client-s3` + `@smithy/node-http-handler` for proxy), `azure_util.js` (Azure via `@azure/storage-blob`), `gcs_util.js` (GCS via REST + `google-auth-library` for credentials), `local_util.js` (local stages). Cloud SDKs **must** be loaded lazily (the long-standing pattern is `typeof s3 !== 'undefined' ? s3 : require('@aws-sdk/client-s3')` inside the function that needs it). New code paths that touch a peer SDK must keep this discipline or the optional-peer install will break at first call.
79+
- **`agent/`** — TLS layer. `https_ocsp_agent.js` + `ocsp_response_cache.js` enforce OCSP revocation. `https_proxy_agent.ts` (v2.x) handles outbound proxy and integrates with the new CRL validator. `crl_validator/` is a v2.x addition that fetches and verifies Certificate Revocation Lists, including RSASSA-PSS signature support (`rsassa_pss_parser.ts`). `socket_util.js` and `check.js` are shared helpers.
80+
- **`http/`** — `base.js` (shared logic), `node.ts` (axios + OCSP/CRL agent), `node_untyped.js` (CJS shim), `axios_instance.ts` (single configured axios), `request_util.js` (retry, normalize response, GUID injection).
81+
- **`logger/`** — winston-based (`logger.ts` + `logger/node.js`). `easy_logging_starter.js` reads an external `client_config.json` for log-level/path overrides. `execution_timer.js`, `logging_util.js` are shared. Browser logger was removed.
82+
- **`configuration/`** — `connection_configuration.js` loads from a TOML file (`connections.toml`) when `createConnection()` is called without options. `client_configuration.js` handles the easy-logging JSON file.
83+
- **`global_config.js`** — process-wide settings: `configure({ logLevel, ocspFailOpen/FailClosed/Insecure, customCredentialManager, ... })`. Mutates module state, so tests that touch it must restore it. `global_config_typed.ts` is the typed surface.
84+
- **`secret_detector.js`** — scrubs secrets out of log messages. Anything that logs request/response bodies should go through this.
85+
- **`queryContextCache.js`** — caches per-query context returned by the server to optimize subsequent statements.
86+
- **`disk_cache.ts`** (v2.x) — generic disk-backed cache with permission checks; used by OAuth/PAT token caches.
87+
- **`telemetry/`** (v2.x) — `inband_telemetry.ts` posts client telemetry to Snowflake. `platform_detection.ts`, `application_path.ts`, `libc_details.ts`, `os_details/` collect host info — `platform_detection` statically imports `@aws-sdk/client-sts` for ECS/EC2 attribution, so it's another path that needs the AWS SDK if invoked.
88+
- **`minicore/`** (v2.x) — NAPI Rust module (`rust_minicore/`) shipping prebuilt `.node` binaries for darwin/linux/win × arm64/x64. Used for crypto/parser hot paths. `index.ts` is the JS entry; `minicore.ts` wraps the platform-specific binary. Prebuilds are checked into `lib/minicore/binaries/` and copied to `dist/lib/minicore/binaries/` by the build script.
89+
- **`errors.js`** — central `ErrorCode` enum + `Errors.createClientError(...)`. The numeric codes are part of the public API surface (mirrored in `index.d.ts`); don't renumber them. `error_code.ts` is the typed re-export consumed by the `.d.ts`.
90+
- **`proxy_util.js`** (v2.x) — proxy resolution from connection config / env (`HTTPS_PROXY`, `NO_PROXY`), with per-destination overrides.
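The lazy-loading discipline for optional peer SDKs, described in the `file_transfer_agent/` entry above, could be wrapped in a small helper along these lines. This is a sketch only: `lazyPeer` and its error wording are hypothetical, and the loader is injected so the example runs without any cloud SDK installed. In the repo the loader would be something like `() => require('@aws-sdk/client-s3')`.

```typescript
// Hypothetical helper: defers loading an optional peer until first use and
// turns a missing module into an actionable error message.
function lazyPeer<T>(packageName: string, load: () => T): () => T {
  let cached: T | undefined;
  let loaded = false;
  return () => {
    if (!loaded) {
      try {
        cached = load();
      } catch (err) {
        const code = (err as { code?: string } | null)?.code;
        if (code !== 'MODULE_NOT_FOUND') throw err; // unrelated failure, rethrow as-is
        throw new Error(
          `Optional peer dependency "${packageName}" is not installed; install it to use this feature.`,
        );
      }
      loaded = true;
    }
    return cached as T;
  };
}

// Simulated loaders, so nothing here depends on an installed SDK.
let loadCount = 0;
const getS3 = lazyPeer('@aws-sdk/client-s3', () => {
  loadCount += 1;
  return { S3Client: class {} };
});

const getAzureBlob = lazyPeer<unknown>('@azure/storage-blob', () => {
  // Mimics the error shape Node produces when `require` cannot resolve a module.
  throw Object.assign(new Error("Cannot find module '@azure/storage-blob'"), {
    code: 'MODULE_NOT_FOUND',
  });
});
```

Creating the accessors loads nothing; the first `getS3()` call runs the loader once and later calls reuse the cached module, while `getAzureBlob()` fails only at the point where the Azure path is actually exercised.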
5391

54-
- **`connection/`** — `Connection`, `ConnectionConfig`, `ConnectionContext`, `Statement`, bind-uploading and result handling. A connection owns a `ConnectionContext`, which carries the `ConnectionConfig`, an `HttpClient`, and the `services`.
55-
- **`services/`** — `sf.js` is the Snowflake session service (login, token refresh, query submission state machine). `large_result_set.js` handles chunked S3/GCS result downloads.
56-
- **`authentication/`** — one module per auth type: `auth_default` (password), `auth_keypair` (JWT), `auth_oauth`, `auth_okta`, `auth_web` (browser SSO), `auth_idtoken`. `authentication.js` is the dispatcher. `secure_storage/json_credential_manager.js` is the default token cache.
57-
- **`file_transfer_agent/`** — implements `PUT`/`GET` (stage upload/download) against S3 (`s3_util.js`), Azure (`azure_util.js`), GCS (`gcs_util.js`), or local (`local_util.js`). The cloud SDKs are loaded lazily so the optional peerDep model works — only the path that's actually used needs its SDK installed. `encrypt_util.js`, `file_compression_type.js`, and `file_util.js` are shared helpers.
58-
- **`agent/`** — TLS / OCSP layer. `https_ocsp_agent.js` and `https_proxy_agent.js` extend Node's HTTPS agent to enforce OCSP revocation checking; `ocsp_response_cache.js` caches responses on disk. `cert_util.js` / `check.js` / `socket_util.js` support it.
59-
- **`http/`** — pluggable HTTP clients: `base.js` (shared), `node.js` (axios + OCSP agent), `browser.js`.
60-
- **`logger/`** — winston-based on Node, console-based in the browser. `easy_logging_starter.js` reads an external `client_config.json` for log-level/path overrides; `execution_timer.js` and `logging_utils.js` are shared.
61-
- **`configuration/`** — `connection_configuration.js` loads connection params from a TOML file (`connections.toml`) when `createConnection()` is called without options. `client_configuration.js` handles the easy-logging JSON file.
62-
- **`global_config.js`** — process-wide settings: `configure({ logLevel, ocspFailOpen/FailClosed/Insecure, customCredentialManager, ... })`. Mutates module state, so tests that touch it should restore it.
63-
- **`secret_detector.js`** — scrubs secrets out of log messages before they're written. Anything that logs request/response bodies should go through this.
64-
- **`queryContextCache.js`** — caches query-context entries returned by the server to optimize subsequent statements.
65-
- **`errors.js`** — central `ErrorCode` enum + `Errors.createClientError(...)`. The numeric codes are the public API surface (mirrored in `index.d.ts`); don't renumber them.
92+
**Engine and language baseline:** Node ≥ 18 (`engines.node` in `package.json`; v1.x's Node-6 check is gone). New code goes in `.ts`; the `.ts`/`.js` boundary is fine to cross in either direction. Migration guidance is in the README's "TypeScript Migration" section.
6693

67-
Cross-cutting note: the codebase still supports Node ≥ 6.0.0 (checked at startup in `lib/snowflake.js`). That's why `lib/` is plain CommonJS with no async/await in older files and a lot of callback-style code. Newer modules use modern syntax, but be mindful when adding language features in shared utilities.
94+
**Tests:** mocha config is in `.mocharc.js` (`ts-node/register`, `extension: ['js','ts']`, `recursive: true`, retries enabled). Wiremock-backed tests live in `test/integration/wiremock/` and consume mappings from `wiremock/mappings/`. Many of the 11 currently-failing unit tests in a fresh checkout are infrastructure-dependent (need `hang_webserver.py` / fixture files) — they fail the same way on a clean `master`, so a clean-merge baseline is `~1001 passing, ~11 failing` until that local setup is in place.
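Because `configure()` mutates module-level state (see the `global_config.js` entry above), tests that change it need a save-and-restore step. A minimal sketch of that pattern, using a stand-in `settings` object rather than the real `global_config.js` API; `withSettings`, the `settings` shape, and the level strings are all hypothetical:

```typescript
// Stand-in for process-wide configuration held in module state.
const settings: { logLevel: string; ocspFailOpen: boolean } = {
  logLevel: 'INFO',
  ocspFailOpen: true,
};

// Applies overrides for the duration of `body`, then restores the snapshot.
function withSettings<T>(overrides: Partial<typeof settings>, body: () => T): T {
  const snapshot = { ...settings };
  Object.assign(settings, overrides);
  try {
    return body();
  } finally {
    Object.assign(settings, snapshot); // restore even when body throws
  }
}

const seenInside = withSettings({ logLevel: 'DEBUG' }, () => settings.logLevel);
```

In a mocha suite the same snapshot and restore would usually sit in `beforeEach` / `afterEach` hooks, so a failing test cannot leak its overrides into the next one.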
6895

6996
## Code style
7097

71-
ESLint (`.eslintrc.js`) enforces: 2-space indent, single quotes, semicolons required, unix line endings, `eqeqeq` (with `null` exception), `camelCase`, `prefer-const`, `no-var`, `curly: all`, `no-console` (except `warn`/`error`; allowed everywhere in `samples/`). `space-before-function-paren` is `never` for named functions, `always` for anonymous and async-arrow.
98+
`oxlint` (`.oxlintrc.json`) is the linter; configuration is minimal — it's there as a fast gate, not as a strict style enforcer. **Formatting** is `prettier` (`.prettierrc.js`); run `npm run prettier:format` before committing. The pre-commit hook (`.husky/pre-commit` via `lint-staged`) runs both automatically on staged files.
7299

73-
There is also a `webstorm-codestyle.xml` for JetBrains users.
100+
For JetBrains users, `webstorm-codestyle.xml` is still in the repo.
