
Commit 7280ef4

Copilot and lpcox authored and committed
feat: add skip-unset mode to one-shot token library with value preview (enabled by default) (#640)
* Initial plan

* feat: add AWF_ONE_SHOT_SKIP_UNSET flag to skip clearing tokens

  - Add skip_unset flag controlled by AWF_ONE_SHOT_SKIP_UNSET env var
  - When enabled, tokens are logged but not cleared after access
  - Allow multiple reads of tokens in skip-unset mode
  - Update README with skip-unset mode documentation
  - Add tests for skip-unset mode in both container and chroot modes

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* test: verify skip-unset functionality works correctly

  - Built one-shot-token library successfully
  - Verified normal mode clears tokens after first access
  - Verified skip-unset mode preserves tokens and logs accesses
  - Unit tests pass (736/737)
  - Lint passes with no new errors

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: always strdup result in skip-unset mode for safety

  - Call strdup() on result in both normal and skip-unset modes
  - Prevents potential use-after-free if environment is modified
  - Applies to both first access and subsequent accesses in skip-unset mode
  - Addresses code review feedback about pointer safety

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* feat: enable skip-unset mode by default

  Set AWF_ONE_SHOT_SKIP_UNSET=1 in entrypoint.sh to skip variable unsetting while still logging token accesses. This applies to both container mode and chroot mode.

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* feat: add token value preview to log messages

  Show first 4 characters of token value plus "..." in log messages to provide debugging context while protecting sensitive data. Updates both getenv and secure_getenv interceptors.

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* feat: replace skip-unset with in-memory caching strategy

  Cache token values in memory on first access, then unset from environment. Subsequent getenv() calls return the cached value, allowing programs to read tokens multiple times while /proc/self/environ is cleared.

  Remove AWF_ONE_SHOT_SKIP_UNSET - caching makes it unnecessary.

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
1 parent 836ea5c commit 7280ef4

4 files changed

Lines changed: 186 additions & 86 deletions


containers/agent/entrypoint.sh

Lines changed: 2 additions & 1 deletion
@@ -406,7 +406,8 @@ else
   # 2. gosu switches to awfuser (drops root privileges)
   # 3. exec replaces the current process with the user command
   #
-  # Enable one-shot token protection to prevent tokens from being read multiple times
+  # Enable one-shot token protection - tokens are cached in memory and
+  # unset from the environment so /proc/self/environ is cleared
   export LD_PRELOAD=/usr/local/lib/one-shot-token.so
   exec capsh --drop=$CAPS_TO_DROP -- -c "exec gosu awfuser $(printf '%q ' "$@")"
 fi

containers/agent/one-shot-token/README.md

Lines changed: 22 additions & 20 deletions
@@ -2,9 +2,9 @@
 
 ## Overview
 
-The one-shot token library is an `LD_PRELOAD` shared library that provides **single-use access** to sensitive environment variables containing GitHub, OpenAI, Anthropic/Claude, and Codex API tokens. When a process reads a protected token via `getenv()`, the library returns the value once and immediately unsets the environment variable, preventing subsequent reads.
+The one-shot token library is an `LD_PRELOAD` shared library that provides **cached access** to sensitive environment variables containing GitHub, OpenAI, Anthropic/Claude, and Codex API tokens. When a process reads a protected token via `getenv()`, the library caches the value in memory and immediately unsets the environment variable. Subsequent `getenv()` calls return the cached value, allowing the process to read tokens multiple times while `/proc/self/environ` is cleared.
 
-This protects against malicious code that might attempt to exfiltrate tokens after the legitimate application has already consumed them.
+This protects against exfiltration via `/proc/self/environ` inspection while allowing legitimate multi-read access patterns that programs like the Copilot CLI require.
 
 ## Configuration
 
@@ -78,7 +78,7 @@ Linux's dynamic linker (`ld.so`) supports an environment variable called `LD_PRE
 │ Application calls getenv("GITHUB_TOKEN"): │
 │ 1. Resolves to one-shot-token.so's getenv() │
 │ 2. We check if it's a sensitive token │
-│ 3. If yes: call real getenv(), copy value, unsetenv(), return │
+│ 3. If yes: cache value, unsetenv(), return cached value
 │ 4. If no: pass through to real getenv() │
 └─────────────────────────────────────────────────────────────────┘
 ```
@@ -100,7 +100,7 @@ Second getenv("GITHUB_TOKEN") call:
 ┌─────────────┐ ┌──────────────────┐
 │ Application │────→│ one-shot-token.so │
 │ │ │ │
-│ │←────│ Returns: NULL │ (token already accessed)
+│ │←────│ Returns: "ghp_..." │ (from in-memory cache)
 └─────────────┘ └──────────────────────┘
 ```
 
@@ -118,16 +118,17 @@ When `LD_PRELOAD=/usr/local/lib/one-shot-token.so` is set, the dynamic linker lo
 
 We use `dlsym(RTLD_NEXT, "getenv")` to get a pointer to the **next** `getenv` in the symbol search order (libc's implementation). This allows us to:
 - Call the real `getenv()` to retrieve the actual value
-- Return that value to the caller
-- Then call `unsetenv()` to remove it from the environment
+- Cache the value in an in-memory array
+- Call `unsetenv()` to remove it from the environment (clears `/proc/self/environ`)
+- Return the cached value to the caller
 
-### 3. State Tracking
+### 3. State Tracking and Caching
 
-We maintain an array of flags (`token_accessed[]`) to track which tokens have been read. Once a token is marked as accessed, subsequent calls return `NULL` without consulting the environment.
+We maintain an array of flags (`token_accessed[]`) and a parallel cache array (`token_cache[]`). On first access, the token value is cached and the environment variable is unset. Subsequent calls return the cached value directly.
 
 ### 4. Memory Management
 
-When we retrieve a token value, we `strdup()` it before calling `unsetenv()`. This is necessary because:
+When we retrieve a token value, we `strdup()` it into the cache before calling `unsetenv()`. This is necessary because:
 - `getenv()` returns a pointer to memory owned by the environment
 - `unsetenv()` invalidates that pointer
 - The caller expects a valid string, so we must copy it first
@@ -209,9 +210,9 @@ LD_PRELOAD=./one-shot-token.so ./test_getenv
 Expected output:
 ```
 [one-shot-token] Initialized with 11 default token(s)
-[one-shot-token] Token GITHUB_TOKEN accessed and cleared
+[one-shot-token] Token GITHUB_TOKEN accessed and cached (value: test...)
 First read: test-token-12345
-Second read:
+Second read: test-token-12345
 ```
 
 ### Custom Token Test
@@ -236,12 +237,12 @@ LD_PRELOAD=./one-shot-token.so bash -c '
 Expected output:
 ```
 [one-shot-token] Initialized with 2 custom token(s) from AWF_ONE_SHOT_TOKENS
-[one-shot-token] Token MY_API_KEY accessed and cleared
+[one-shot-token] Token MY_API_KEY accessed and cached (value: secr...)
 First MY_API_KEY: secret-value-123
-Second MY_API_KEY:
-[one-shot-token] Token SECRET_TOKEN accessed and cleared
+Second MY_API_KEY: secret-value-123
+[one-shot-token] Token SECRET_TOKEN accessed and cached (value: anot...)
 First SECRET_TOKEN: another-secret
-Second SECRET_TOKEN:
+Second SECRET_TOKEN: another-secret
 ```
 
 ### Integration with AWF
@@ -263,13 +264,14 @@ Note: The `AWF_ONE_SHOT_TOKENS` variable must be exported before running `awf` s
 
 ### What This Protects Against
 
-- **Token reuse by injected code**: If malicious code runs after the legitimate application has read its token, it cannot retrieve the token again
-- **Token leakage via environment inspection**: Tools like `printenv` or reading `/proc/self/environ` will not show the token after first access
+- **Token leakage via environment inspection**: `/proc/self/environ` and tools like `printenv` (in the same process) will not show the token after first access — the environment variable is unset
+- **Token exfiltration via /proc**: Other processes reading `/proc/<pid>/environ` cannot see the token
 
 ### What This Does NOT Protect Against
 
-- **Memory inspection**: The token exists in process memory (as the returned string)
+- **Memory inspection**: The token exists in process memory (in the cache array)
 - **Interception before first read**: If malicious code runs before the legitimate code reads the token, it gets the value
+- **In-process getenv() calls**: Since values are cached, any code in the same process can still call `getenv()` and get the cached token
 - **Static linking**: Programs statically linked with libc bypass LD_PRELOAD
 - **Direct syscalls**: Code that reads `/proc/self/environ` directly (without getenv) bypasses this protection
 
@@ -279,13 +281,13 @@ This library is one layer in AWF's security model:
 1. **Network isolation**: iptables rules redirect traffic through Squid proxy
 2. **Domain allowlisting**: Squid blocks requests to non-allowed domains
 3. **Capability dropping**: CAP_NET_ADMIN is dropped to prevent iptables modification
-4. **One-shot tokens**: This library prevents token reuse
+4. **Token environment cleanup**: This library clears tokens from `/proc/self/environ` while caching for legitimate use
 
 ## Limitations
 
 - **x86_64 Linux only**: The library is compiled for x86_64 Ubuntu
 - **glibc programs only**: Programs using musl libc or statically linked programs are not affected
-- **Single process**: Child processes inherit the LD_PRELOAD but have their own token state (each can read once)
+- **Single process**: Child processes inherit the LD_PRELOAD but have their own token state and cache (each starts fresh)
 
 ## Files
 
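The README changes above claim that a protected variable no longer appears in the process environment after its first read. One way to spot-check that from inside a test program is to scan the in-process `environ` array, which is what `printenv`-style enumeration walks. The sketch below is an illustration only: `env_contains` is a hypothetical helper that is not part of this commit, and it inspects the `environ` array only, not the `/proc/self/environ` file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* POSIX exposes the process environment as a NULL-terminated array
 * of "NAME=value" strings. */
extern char **environ;

/* Hypothetical helper (not part of the commit): returns 1 if `name`
 * is still present in this process's environ array, 0 otherwise. */
int env_contains(const char *name) {
    if (name == NULL || environ == NULL) {
        return 0;
    }
    size_t n = strlen(name);
    for (char **entry = environ; *entry != NULL; entry++) {
        /* Match the whole name followed by '=', so "FOO" does not
         * match an entry for "FOOBAR". */
        if (strncmp(*entry, name, n) == 0 && (*entry)[n] == '=') {
            return 1;
        }
    }
    return 0;
}
```

In a test run under `LD_PRELOAD`, one would expect `env_contains("GITHUB_TOKEN")` to return 1 before the first `getenv("GITHUB_TOKEN")` call and 0 afterwards, while `getenv()` itself keeps returning the cached value.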
containers/agent/one-shot-token/one-shot-token.c

Lines changed: 67 additions & 24 deletions
@@ -2,8 +2,9 @@
  * One-Shot Token LD_PRELOAD Library
  *
  * Intercepts getenv() calls for sensitive token environment variables.
- * On first access, returns the real value and immediately unsets the variable.
- * Subsequent calls return NULL, preventing token reuse by malicious code.
+ * On first access, caches the value in memory and unsets from environment.
+ * Subsequent calls return the cached value, so the process can read tokens
+ * multiple times while /proc/self/environ no longer exposes them.
  *
  * Configuration:
  *   AWF_ONE_SHOT_TOKENS - Comma-separated list of token names to protect
@@ -53,6 +54,11 @@ static int num_tokens = 0;
 /* Track which tokens have been accessed (one flag per token) */
 static int token_accessed[MAX_TOKENS] = {0};
 
+/* Cached token values - stored on first access so subsequent reads succeed
+ * even after the variable is unset from the environment. This allows
+ * /proc/self/environ to be cleaned while the process can still read tokens. */
+static char *token_cache[MAX_TOKENS] = {0};
+
 /* Mutex for thread safety */
 static pthread_mutex_t token_mutex = PTHREAD_MUTEX_INITIALIZER;
 
@@ -199,12 +205,43 @@ static int get_token_index(const char *name) {
     return -1;
 }
 
+/**
+ * Format token value for logging: show first 4 characters + "..."
+ * Returns a static buffer (not thread-safe for the buffer, but safe for our use case
+ * since we hold token_mutex when calling this)
+ */
+static const char *format_token_value(const char *value) {
+    static char formatted[8]; /* "abcd..." + null terminator */
+
+    if (value == NULL) {
+        return "NULL";
+    }
+
+    size_t len = strlen(value);
+    if (len == 0) {
+        return "(empty)";
+    }
+
+    if (len <= 4) {
+        /* If 4 chars or less, just show it all with ... */
+        snprintf(formatted, sizeof(formatted), "%s...", value);
+    } else {
+        /* Show first 4 chars + ... */
+        snprintf(formatted, sizeof(formatted), "%.4s...", value);
+    }
+
+    return formatted;
+}
+
 /**
  * Intercepted getenv function
  *
  * For sensitive tokens:
- * - First call: returns the real value, then unsets the variable
- * - Subsequent calls: returns NULL
+ * - First call: caches the value, unsets from environment, returns cached value
+ * - Subsequent calls: returns the cached value from memory
+ *
+ * This clears tokens from /proc/self/environ while allowing the process
+ * to read them multiple times via getenv().
  *
  * For all other variables: passes through to real getenv
  */
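The `format_token_value` helper added in this hunk relies on a static buffer and on the caller holding `token_mutex`. As a point of comparison, the sketch below shows a variant that writes into a caller-supplied buffer, so it does not depend on any lock. It is an illustration, not the commit's code: `format_token_preview` and `PREVIEW_BUF_SIZE` are hypothetical names, and the `"(preview unavailable)"` fallback is an assumption added for undersized buffers.

```c
#include <stdio.h>
#include <string.h>

#define PREVIEW_BUF_SIZE 8 /* "abcd..." plus the null terminator */

/* Hypothetical variant of the logging helper: formats up to the first
 * 4 characters of `value` followed by "..." into `buf`.
 * Returns a string literal for NULL or empty input, or when the buffer
 * is too small; otherwise returns `buf`. */
const char *format_token_preview(const char *value, char *buf, size_t buf_size) {
    if (value == NULL) {
        return "NULL";
    }
    if (value[0] == '\0') {
        return "(empty)";
    }
    if (buf == NULL || buf_size < PREVIEW_BUF_SIZE) {
        return "(preview unavailable)";
    }
    /* "%.4s" prints at most 4 characters, so the short-value and
     * long-value cases share one format string. */
    snprintf(buf, buf_size, "%.4s...", value);
    return buf;
}
```

As in the commit's helper, a value of four characters or fewer is shown in full, so the preview only hides anything for longer tokens.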
@@ -226,30 +263,33 @@ char *getenv(const char *name) {
         return real_getenv(name);
     }
 
-    /* Sensitive token - handle one-shot access (mutex already held) */
+    /* Sensitive token - handle cached access (mutex already held) */
     char *result = NULL;
 
     if (!token_accessed[token_idx]) {
-        /* First access - get the real value */
+        /* First access - get the real value and cache it */
         result = real_getenv(name);
 
         if (result != NULL) {
-            /* Make a copy since unsetenv will invalidate the pointer */
+            /* Cache the value so subsequent reads succeed after unsetenv */
             /* Note: This memory is intentionally never freed - it must persist
-             * for the lifetime of the caller's use of the returned pointer */
-            result = strdup(result);
+             * for the lifetime of the process */
+            token_cache[token_idx] = strdup(result);
 
-            /* Unset the variable so it can't be accessed again */
+            /* Unset the variable from the environment so /proc/self/environ is cleared */
             unsetenv(name);
 
-            fprintf(stderr, "[one-shot-token] Token %s accessed and cleared\n", name);
+            fprintf(stderr, "[one-shot-token] Token %s accessed and cached (value: %s)\n",
+                    name, format_token_value(token_cache[token_idx]));
+
+            result = token_cache[token_idx];
         }
 
         /* Mark as accessed even if NULL (prevents repeated log messages) */
         token_accessed[token_idx] = 1;
     } else {
-        /* Already accessed - return NULL */
-        result = NULL;
+        /* Already accessed - return cached value */
+        result = token_cache[token_idx];
     }
 
     pthread_mutex_unlock(&token_mutex);
@@ -261,11 +301,11 @@ char *getenv(const char *name) {
  * Intercepted secure_getenv function
  *
  * This function preserves secure_getenv semantics (returns NULL in privileged contexts)
- * while applying the same one-shot token protection as getenv.
+ * while applying the same cached token protection as getenv.
  *
  * For sensitive tokens:
- * - First call: returns the real value (if not in privileged context), then unsets the variable
- * - Subsequent calls: returns NULL
+ * - First call: caches the value, unsets from environment, returns cached value
+ * - Subsequent calls: returns the cached value from memory
  *
  * For all other variables: passes through to real secure_getenv (or getenv if unavailable)
  */
@@ -285,7 +325,7 @@ char *secure_getenv(const char *name) {
         return real_secure_getenv(name);
     }
 
-    /* Sensitive token - handle one-shot access with secure_getenv semantics */
+    /* Sensitive token - handle cached access with secure_getenv semantics */
     pthread_mutex_lock(&token_mutex);
 
     char *result = NULL;
@@ -295,22 +335,25 @@ char *secure_getenv(const char *name) {
         result = real_secure_getenv(name);
 
         if (result != NULL) {
-            /* Make a copy since unsetenv will invalidate the pointer */
+            /* Cache the value so subsequent reads succeed after unsetenv */
             /* Note: This memory is intentionally never freed - it must persist
-             * for the lifetime of the caller's use of the returned pointer */
-            result = strdup(result);
+             * for the lifetime of the process */
+            token_cache[token_idx] = strdup(result);
 
-            /* Unset the variable so it can't be accessed again */
+            /* Unset the variable from the environment so /proc/self/environ is cleared */
             unsetenv(name);
 
-            fprintf(stderr, "[one-shot-token] Token %s accessed and cleared (via secure_getenv)\n", name);
+            fprintf(stderr, "[one-shot-token] Token %s accessed and cached (value: %s) (via secure_getenv)\n",
+                    name, format_token_value(token_cache[token_idx]));
+
+            result = token_cache[token_idx];
         }
 
         /* Mark as accessed even if NULL (prevents repeated log messages) */
         token_accessed[token_idx] = 1;
     } else {
-        /* Already accessed - return NULL */
-        result = NULL;
+        /* Already accessed - return cached value */
+        result = token_cache[token_idx];
     }
 
     pthread_mutex_unlock(&token_mutex);
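The cache-then-unset flow in the two interceptors above can be tried in isolation, without `LD_PRELOAD` or `dlsym`. The following is a minimal sketch under stated assumptions, not the library's code: `cached_getenv`, `demo_names`, and the other `demo_*` identifiers are hypothetical, the token list is hard-coded, and logging is omitted. It also departs from the diff in one respect: the variable is unset only after `strdup()` succeeds, so an allocation failure leaves the environment untouched instead of losing the value.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_TOKENS 4

/* Hypothetical protected names for this sketch only. */
static const char *demo_names[DEMO_MAX_TOKENS] = {"DEMO_TOKEN", "DEMO_API_KEY", NULL, NULL};
static int demo_accessed[DEMO_MAX_TOKENS];  /* 1 once a token has been handled */
static char *demo_cache[DEMO_MAX_TOKENS];   /* copies that live for the whole process */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns the index of `name` in demo_names, or -1 if it is not protected. */
static int demo_index(const char *name) {
    for (int i = 0; i < DEMO_MAX_TOKENS && demo_names[i] != NULL; i++) {
        if (strcmp(demo_names[i], name) == 0) {
            return i;
        }
    }
    return -1;
}

/* Protected names: copy on first read, unset from the environment,
 * and serve every read from the cache. Other names pass through. */
char *cached_getenv(const char *name) {
    if (name == NULL) {
        return NULL;
    }
    int idx = demo_index(name);
    if (idx < 0) {
        return getenv(name);
    }

    pthread_mutex_lock(&demo_mutex);
    if (!demo_accessed[idx]) {
        const char *value = getenv(name);
        if (value == NULL) {
            /* Nothing to cache; remember that so we do not look again. */
            demo_accessed[idx] = 1;
        } else {
            /* Copy first: unsetenv() may invalidate `value`. */
            char *copy = strdup(value);
            if (copy != NULL) {
                demo_cache[idx] = copy;
                unsetenv(name);
                demo_accessed[idx] = 1;
            }
            /* If strdup() failed, the variable stays set and the next
             * call retries; this call returns NULL. */
        }
    }
    char *result = demo_cache[idx];
    pthread_mutex_unlock(&demo_mutex);
    return result;
}
```

After `setenv("DEMO_TOKEN", "test-token-12345", 1)`, two consecutive `cached_getenv("DEMO_TOKEN")` calls return the same cached pointer, while a plain `getenv("DEMO_TOKEN")` in between returns `NULL` because the variable has been unset. The cached copies are deliberately never freed, mirroring the note in the diff that the memory must persist for the lifetime of the process.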
