Commit d5aac20

feat: implement TypeScript bindings for NVIDIA Management Library
Provides type-safe access to NVML for GPU monitoring, replacing inefficient nvidia-smi CLI calls with direct library access via Koffi FFI.

Features:
- Nvml namespace for library initialization and system queries
- Device class with methods for memory, temperature, power, utilization, and process monitoring
- Result<T> type for safe error handling without exceptions
- Lazy function binding for optimal performance
- Comprehensive type definitions for all NVML structures

Includes:
- Unit tests with NVML mocks (no GPU required)
- API documentation and usage examples
- ESLint and TypeScript configuration
- GitHub Actions CI workflow

---
Signed-off-by: Guillaume Moutier <guimou@users.noreply.github.com>
Co-authored-by: Claude
1 parent f61feac commit d5aac20

31 files changed

Lines changed: 8239 additions & 0 deletions

.github/workflows/ci.yml

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
name: CI

on:
  push:
    branches: [main, dev]
  pull_request:
    branches: [main]

jobs:
  build-and-test:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        node-version: [22.x]

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run type check
        run: npm run typecheck

      - name: Build
        run: npm run build

      - name: Run unit tests
        run: npm run test:unit

  # Integration tests would require a GPU runner
  # integration-test:
  #   runs-on: [self-hosted, gpu]
  #   steps:
  #     - uses: actions/checkout@v4
  #     - uses: actions/setup-node@v4
  #       with:
  #         node-version: 22.x
  #     - run: npm ci
  #     - run: npm run build
  #     - run: npm run test:integration

.gitignore

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# Dependencies
node_modules/

# Build output
dist/

# Test coverage
coverage/

# IDE and editors
.idea/
.vscode/
*.swp
*.swo
*~

# OS files
.DS_Store
Thumbs.db

# Logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Environment files
.env
.env.local
.env.*.local

# Temporary files
tmp/
temp/
*.tmp

# Lock files (optional - uncomment if you want to ignore)
# package-lock.json
# yarn.lock

# TypeScript cache
*.tsbuildinfo

# Claude
.claude

CHANGELOG.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.1.0] - 2026-01-20

### Added

- **Nvml namespace** for library initialization and system queries
  - `init()` / `shutdown()` for lifecycle management
  - `getDriverVersion()` and `getNvmlVersion()` for version info
  - `getDeviceCount()` and `getDeviceByIndex()` for device enumeration
- **Device class** with comprehensive GPU monitoring methods
  - Memory info (`getMemoryInfo()`)
  - Temperature (`getTemperature()`)
  - Power usage and limits (`getPowerUsage()`, `getPowerLimit()`)
  - GPU/memory utilization (`getUtilization()`)
  - Clock speeds (`getClockInfo()`)
  - P-state and compute mode (`getPState()`, `getComputeMode()`)
  - Running processes (`getRunningProcesses()`)
  - Device identification (`getName()`, `getUuid()`, `getSerial()`)
- **Result<T> type** for safe error handling without exceptions
  - `ok()` / `err()` constructors
  - `unwrap()` / `unwrapOr()` helpers
  - `isOk()` / `isErr()` type guards
- **Lazy function binding** for optimal startup performance
- **Comprehensive type definitions** for all NVML structures and enums
- **Unit tests** with NVML mocks (no GPU required to run)
- **API documentation** with usage examples
- **GitHub Actions CI** workflow for automated testing

[0.1.0]: https://github.com/rh-aiservices-bu/ts-nvml/releases/tag/v0.1.0

CLAUDE.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

ts-nvml is a TypeScript binding for NVIDIA Management Library (NVML). It provides direct access to GPU monitoring functions via Koffi FFI, replacing inefficient nvidia-smi CLI calls.

## Commands

```bash
# Install dependencies
npm install

# Build TypeScript
npm run build

# Type check without emitting
npm run typecheck

# Run tests
npm test

# Run example (requires NVIDIA GPU)
npx tsx examples/basic-usage.ts
```

## Architecture

```
src/
├── index.ts              # Public API exports
├── nvml.ts               # Nvml namespace (init, shutdown, system queries)
├── device.ts             # Device class with query methods
├── types/
│   ├── enums.ts          # NvmlReturn, NvmlPState, NvmlComputeMode, etc.
│   ├── structs.ts        # MemoryInfo, GpuStatus, SystemSnapshot, etc.
│   └── result.ts         # Result<T> type and NvmlError
├── bindings/
│   ├── library.ts        # Koffi library loading
│   ├── types.ts          # Koffi struct definitions (nvmlMemory_t, etc.)
│   ├── init.ts           # nvmlInit, nvmlShutdown bindings
│   └── device.ts         # Device query bindings (memory, temp, power, etc.)
└── utils/
    └── library-path.ts   # NVML library discovery
```

## Key Patterns

### Result Type for Error Handling
All query methods return `Result<T>` instead of throwing:
```typescript
const result = device.getMemoryInfo();
if (result.ok) {
  console.log(result.value.total);
} else {
  console.error(result.error.message);
}

// Or use unwrap() to throw on error
const memory = unwrap(device.getMemoryInfo());
```
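
Editorial sketch (not part of the committed file): the snippet above relies on `Result<T>`, `ok()`, `err()`, `unwrap()`, and `unwrapOr()` without showing their shape. One plausible minimal form is given below, assuming a discriminated union keyed on `ok`. The real definitions in `src/types/result.ts` may differ, and `readTemperature` is a hypothetical stand-in for a device query.

```typescript
// Hypothetical minimal Result type; the actual ts-nvml definitions may differ.
type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Constructors for the two variants.
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Returns the value, or throws the stored error.
function unwrap<T, E>(result: Result<T, E>): T {
  if (result.ok) return result.value;
  throw result.error;
}

// Returns the value, or the fallback when the result is an error.
function unwrapOr<T, E>(result: Result<T, E>, fallback: T): T {
  return result.ok ? result.value : fallback;
}

// Stand-in for a device query (hypothetical, no GPU involved).
function readTemperature(sensorPresent: boolean): Result<number> {
  return sensorPresent ? ok(65) : err(new Error("sensor not available"));
}

const reading = readTemperature(true);
if (reading.ok) {
  console.log(`temperature: ${reading.value}`);
}
console.log(unwrapOr(readTemperature(false), -1));
```

Keying the union on a boolean `ok` field lets the compiler narrow to `value` or `error` inside each branch, which is what makes the `if (result.ok)` pattern type-safe.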

### Lazy Function Binding
NVML functions are bound lazily on first use to avoid loading the library at import time:
```typescript
let _nvmlInit: NvmlFunc | null = null;
function getNvmlInit(): NvmlFunc {
  if (!_nvmlInit) {
    _nvmlInit = getLibrary().func('int nvmlInit_v2()') as NvmlFunc;
  }
  return _nvmlInit;
}
```
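
Editorial sketch (not part of the committed file): the same bind-on-first-use idea can be factored into a small generic helper so each binding does not repeat the null check. The `lazy` helper and the fake loader below are hypothetical; the loader stands in for an expensive call such as `getLibrary().func(...)` and does not touch Koffi.

```typescript
// Hypothetical helper: wraps a loader so it runs at most once, on first call.
function lazy<T>(load: () => T): () => T {
  let cached: T | undefined;
  let loaded = false;
  return () => {
    if (!loaded) {
      cached = load();
      loaded = true;
    }
    return cached as T;
  };
}

// Counts how often the (fake) binding step actually runs.
let bindCount = 0;

// Stand-in for binding a native function; returns a function yielding 0.
const getFakeInit = lazy(() => {
  bindCount += 1;
  return (): number => 0;
});

// Nothing is bound until the getter is first called.
console.log(bindCount);
getFakeInit()();
getFakeInit()();
console.log(bindCount);
```

A separate `loaded` flag is used instead of checking `cached` for null so that loaders which legitimately return a falsy value are still cached.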

### Koffi Output Parameters
Primitive outputs use arrays, structs use objects:
```typescript
// Primitive: use array wrapper
const count = [0];
getDeviceGetCount()(count);  // count[0] now has value

// Struct: use object
const memory = { total: 0n, free: 0n, used: 0n };
getDeviceGetMemoryInfo()(device, memory);  // memory now populated
```
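
Editorial sketch (not part of the committed file): the calling convention above can be exercised without a GPU by using mock functions that write into their argument and return a status code. Everything below is a stand-in: the `mock*` functions are hypothetical, the values are made up, and plain numbers are used for brevity where the snippet above uses `bigint` fields. The only fact taken from NVML itself is that a status of 0 means success.

```typescript
// Stand-ins for native NVML calls. These mocks only mimic the calling
// convention: a status code is returned, the result is written into the argument.
type MemoryInfo = { total: number; free: number; used: number };

const NVML_SUCCESS = 0;

function mockDeviceGetCount(out: number[]): number {
  out[0] = 2; // the "native" side writes into slot 0
  return NVML_SUCCESS;
}

function mockDeviceGetMemoryInfo(out: MemoryInfo): number {
  out.total = 1024;
  out.used = 256;
  out.free = out.total - out.used;
  return NVML_SUCCESS;
}

// Wrapper for a primitive output: allocate a one-element array, read slot 0.
function readDeviceCount(): number {
  const count = [0];
  const status = mockDeviceGetCount(count);
  if (status !== NVML_SUCCESS) throw new Error(`NVML status ${status}`);
  return count[0];
}

// Wrapper for a struct output: allocate the object, let the callee fill it.
function readMemoryInfo(): MemoryInfo {
  const memory: MemoryInfo = { total: 0, free: 0, used: 0 };
  const status = mockDeviceGetMemoryInfo(memory);
  if (status !== NVML_SUCCESS) throw new Error(`NVML status ${status}`);
  return memory;
}

console.log(readDeviceCount(), readMemoryInfo());
```

Checking the status before reading the output matters because the array slot or struct keeps its initial placeholder value when the call fails.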

## Testing

- Unit tests don't require a GPU (use mocks)
- Integration tests require an NVIDIA GPU with drivers installed
- The NVML library path can be overridden via `NVML_LIBRARY_PATH` env var
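
Editorial sketch (not part of the committed file): an override like `NVML_LIBRARY_PATH` is typically resolved by checking the environment first and falling back to a platform default. The function below is hypothetical and the real `src/utils/library-path.ts` may search more locations. The default names are the usual NVML library file names on Linux and Windows, assumed here rather than taken from the commit.

```typescript
// Hypothetical resolver: environment override first, then a platform default.
function resolveNvmlLibraryPath(
  env: Record<string, string | undefined>,
  platform: string,
): string {
  const override = env["NVML_LIBRARY_PATH"];
  if (override !== undefined && override.trim() !== "") {
    return override;
  }
  // Assumed defaults: the usual NVML library names per platform.
  return platform === "win32" ? "nvml.dll" : "libnvidia-ml.so.1";
}

console.log(resolveNvmlLibraryPath({}, "linux"));
console.log(
  resolveNvmlLibraryPath({ NVML_LIBRARY_PATH: "/opt/nvidia/libnvidia-ml.so.1" }, "linux"),
);
```

In a Node.js program the function would be called as `resolveNvmlLibraryPath(process.env, process.platform)`. Taking both as parameters keeps the logic testable without a GPU or a modified environment.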

## License

Apache License 2.0
