HashiCorp Vault GCS Backend - High Memory Consumption Investigation
Issue Summary
Problem: Vault active/leader node consumes ~14GB of RAM while standby nodes use only ~30MB. This occurs with a relatively small dataset (~290MB in GCS storage) and persists across Vault versions.
Environment:
- Vault Version: 1.21.1
- Storage Backend: Google Cloud Storage (GCS)
- Platform: Google Kubernetes Engine (GKE)
- HA Configuration: 3-node cluster (1 active, 2 standby)
- Seal: GCP Cloud KMS (gcpckms)
Vault Configuration
```
ui = true
disable_mlock = true
api_addr = "https://vault.example.com"

listener "tcp" {
  tls_disable     = 1
  address         = "[::]:8200"
  cluster_address = "[::]:8201"
}

seal "gcpckms" {
  project    = "my-project"
  region     = "global"
  key_ring   = "vault-keys"
  crypto_key = "vault-unseal"
}

service_registration "kubernetes" {}

storage "gcs" {
  bucket     = "vault-storage-bucket"
  ha_enabled = "true"
  chunk_size = "8192" # Also tested with "512"
}
```
Observed Behavior
Memory Usage Pattern
| Node | Role | Memory Usage |
|---|---|---|
| vault-0 | active | ~14,000 Mi |
| vault-1 | standby | ~30 Mi |
| vault-2 | standby | ~30 Mi |
Memory Growth During Startup
After a fresh pod restart, memory grows rapidly:
```
10:44:30 → 3,658 Mi
10:44:44 → 6,833 Mi
10:45:15 → 8,943 Mi
10:46:43 → 13,991 Mi
10:47:26 → 14,020 Mi (stabilized)
```
The leader consumes ~14GB regardless of which pod becomes active.
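The samples above were taken by hand; a polling loop like the following reproduces them (a minimal sketch, assuming metrics-server is installed so `kubectl top` works, and that the active pod is vault-0 in the vault namespace):

```sh
# Sample the active pod's memory every 15 seconds during startup.
# Assumes metrics-server is available and pods are named vault-0..2.
while true; do
  echo "$(date +%H:%M:%S) $(kubectl top pod vault-0 -n vault --no-headers | awk '{print $3}')"
  sleep 15
done
```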
Investigation Details
1. Storage Analysis
GCS Bucket Size:
```
$ gsutil du -sh gs://vault-storage-bucket
289.74 MiB
```
Conclusion: Storage is small (~290MB) and cannot explain 14GB of RAM usage.
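If a single prefix dominated the bucket it would stand out in a per-prefix breakdown; a quick sketch (standard gsutil, no assumptions beyond the bucket name):

```sh
# Summarize usage per top-level prefix to spot an outlier path
# (e.g. an unexpectedly large sys/ or logical/ subtree).
gsutil du -sh gs://vault-storage-bucket/* | sort -h
```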
2. KV Secrets Engine
Total Secrets: 3,573 secrets across all paths
```
$ ./vault-audit-maxversions.sh secret "" 5 20 audit
[INFO] Total secrets scanned: 3573
```
Secrets Distribution:
```
secretpath1/: 1
secretpath2/: 20
secretpath3/: 38 (with ~3000+ nested secrets)
secretpath4/: 2
secretpath5/: 4
secretpath6/: 1
```
Max Versions Configuration: Cleaned up secrets with unlimited versions (max_versions=0)
Conclusion: 3,573 secrets should not require 14GB RAM. Expected ~50-100MB for metadata index.
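The count above comes from an internal script (`vault-audit-maxversions.sh`); it can be cross-checked with a recursive listing sketch like this (assuming KV v2 mounted at `secret/` and `jq` available):

```sh
# Recursively count KV v2 secrets under a mount (hypothetical helper;
# the figure above came from the internal audit script).
count_kv() {
  local path="$1"
  vault kv list -format=json "$path" 2>/dev/null | jq -r '.[]' |
  while read -r entry; do
    if [[ "$entry" == */ ]]; then
      count_kv "$path$entry"    # descend into folders
    else
      echo "$path$entry"        # leaf secret
    fi
  done
}
count_kv secret/ | wc -l
```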
3. Token Analysis
Initial State: 1,027 active tokens
- 100% service tokens (persistent, stored in memory)
- ~92% orphan tokens
- TTL: 15-29 days
- Primary source: GitHub auth for CircleCI
Actions Taken:
- Configured GitHub auth to use batch tokens: `vault write auth/github/config token_type=batch`
- Ran token tidy: `vault write -force auth/token/tidy`
- Revoked 518 GitHub service tokens
- Final state: 253 tokens remaining
Conclusion: Token cleanup did not reduce memory. After pod restart, memory returned to 14GB.
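The token census can be reproduced by listing accessors and looking each one up (a sketch; needs a token permitted on `auth/token/accessors` and `auth/token/lookup-accessor`):

```sh
# Count active tokens via their accessors.
vault list -format=json auth/token/accessors | jq length

# Inspect one accessor's type, TTL, and orphan status.
vault write -format=json auth/token/lookup-accessor accessor="<ACCESSOR>" |
  jq '.data | {type, ttl, orphan, path}'
```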
4. Lease Analysis
Auth Leases:
```
$ vault list sys/leases/lookup/auth/gcp/login/
# 0 leases (after cleanup)
```
GCP Secrets Engine Leases: All 9 GCP secrets engines have 0 active leases.
```
gcp/:    0
gcp-p1/: 0
gcp-p2/: 0
gcp-p3/: 0
gcp-p4/: 0
gcp-p5/: 0
gcp-p6/: 0
gcp-p7/: 0
gcp-p8/: 0
```
Conclusion: No leases contributing to memory usage.
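The per-mount counts above can be gathered in one loop over `sys/leases/lookup` (a sketch using the mount names listed):

```sh
# Count live leases under each GCP secrets mount; vault list errors
# out when a prefix has no leases, hence the ${n:-0} fallback.
for mount in gcp gcp-p1 gcp-p2 gcp-p3 gcp-p4 gcp-p5 gcp-p6 gcp-p7 gcp-p8; do
  n=$(vault list -format=json "sys/leases/lookup/$mount/" 2>/dev/null | jq length)
  echo "$mount/: ${n:-0}"
done
```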
5. Identity Store
```
$ vault list identity/entity/id 2>/dev/null | wc -l
0
$ vault list identity/group/id 2>/dev/null | wc -l
0
```
Conclusion: Empty identity store; not a factor.
6. Policies
```
$ vault policy list | wc -l
39
```
Conclusion: 39 policies is negligible.
7. Audit Log
```
$ kubectl exec -n vault vault-2 -- ls -lah /vault/logs/
-rw------- vault vault 54.0K vault_audit_2.log
```
Conclusion: Audit log is tiny (54KB); not a factor.
8. Secrets Engines Mounted
| Path | Type | Description |
|---|---|---|
| cubbyhole/ | cubbyhole | per-token private secret storage |
| gcp-p1/ | gcp | |
| gcp-p2/ | gcp | |
| gcp-p3/ | gcp | |
| gcp-p4/ | gcp | |
| gcp-p5/ | gcp | |
| gcp-p6/ | gcp | |
| gcp-p7/ | gcp | |
| gcp-p8/ | gcp | |
| gcp/ | gcp | |
| identity/ | identity | identity store |
| secret/ | kv | KV v2 |
| ssh-client-signer/ | ssh | |
| ssh-github/ | ssh | |
| sys/ | system | |
9. Auth Methods
| Path | Type |
|---|---|
| capture-ssh-access/ | gcp |
| gcp/ | gcp |
| github/ | github |
| token/ | token |
Remediation Attempts
Attempt 1: Token Cleanup
- Converted GitHub auth to batch tokens
- Revoked 518 service tokens
- Result: No memory reduction
Attempt 2: Lease Cleanup
- Revoked ~24,000 GCP auth leases (from a previous session; see the sketch after this list)
- Result: Temporary reduction, memory returned to 14GB after pod restart
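For reference, a bulk revocation like Attempt 2 maps to a prefix revoke (a sketch; the exact command from the earlier session was not recorded):

```sh
# Revoke every outstanding lease under the GCP auth mount.
# Add -force only if backend-side revocation errors block progress.
vault lease revoke -prefix auth/gcp/login/
```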
Attempt 3: GCS chunk_size Change
- Changed `chunk_size = "512"` to `chunk_size = "8192"`
- Performed full cluster restart (scaled 0 → 3)
- Result: No change, memory still reached 14GB
Attempt 4: Max Versions Cleanup
- Identified secrets with unlimited versions (max_versions=0)
- Patched to max_versions=5 and destroyed old versions (commands sketched below)
- Result: GCS bucket size reduced slightly, no RAM impact
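The version cleanup corresponds to these KV v2 commands (a sketch with a hypothetical secret path; the real paths came from the audit script):

```sh
# Cap retained versions, then destroy the old ones.
# secret/secretpath3/example is a hypothetical path for illustration.
vault kv metadata put -max-versions=5 secret/secretpath3/example
vault kv destroy -versions=1,2,3 secret/secretpath3/example
```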
Summary Table
| Component | Value | Expected RAM Impact | Actual RAM Impact |
|---|---|---|---|
| GCS Storage | 290 MB | - | - |
| KV Secrets | 3,573 | ~50-100 MB | Unknown |
| Active Tokens | 253 | ~10-50 MB | Unknown |
| Auth Leases | 0 | 0 | 0 |
| GCP Engine Leases | 0 | 0 | 0 |
| Identity Entities | 0 | 0 | 0 |
| Identity Groups | 0 | 0 | 0 |
| Policies | 39 | Negligible | Negligible |
| Audit Log | 54 KB | Negligible | Negligible |
| TOTAL | - | < 500 MB | ~14 GB |
Logs Analysis
Startup logs show normal mount initialization:
```
core: Initializing version history cache for core
core: loaded wrapping token key
core: successfully mounted: type=kv version="v0.25.0+builtin" path=secret/
core: successfully mounted: type=gcp version="v0.23.0+builtin" path=gcp-*/
```
Cluster warnings (not related to memory):
```
core.cluster-listener: no TLS config found for ALPN: ALPN=["req_fw_sb-act_v1"]
```
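Because no data-plane cleanup changed the footprint, the most useful next artifact is a heap profile from the active node; a minimal sketch using Vault's pprof endpoint (requires a token with sudo capability on `sys/pprof`, and assumes wget exists in the container image):

```sh
# Capture a heap profile from the active node to attach to the escalation.
kubectl exec -n vault vault-0 -- \
  wget -q -O /tmp/heap.prof \
    --header "X-Vault-Token: $VAULT_TOKEN" \
    http://127.0.0.1:8200/v1/sys/pprof/heap
kubectl cp vault/vault-0:/tmp/heap.prof ./heap.prof

# Alternatively, collect a full diagnostic bundle:
vault debug -duration=2m -output=vault-debug.tar.gz
```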
GitHub Issue Template
### Vault version
1.21.1
### Vault storage backend
GCS (Google Cloud Storage)
### Describe the bug
Active/leader Vault node consumes ~14GB RAM while standby nodes use only ~30MB.
This occurs with a small dataset (~290MB in GCS, 3573 KV secrets, 253 tokens, 0 leases).
### To Reproduce
1. Deploy Vault 1.21.1 with GCS backend on Kubernetes (3-node HA)
2. Store ~3500 KV secrets
3. Observe leader memory consumption
### Expected behavior
Memory usage proportional to data size. With 290MB storage and 3573 secrets,
expected RAM < 1GB, not 14GB.
### Environment
- Vault version: 1.21.1
- Storage backend: GCS
- Platform: GKE
- HA: 3 nodes
- Seal: gcpckms
### Additional context
- Issue persists across Vault versions (not version-specific)
- Memory grows rapidly at startup (3GB → 14GB in ~2 minutes)
- Cleanup of tokens, leases, and secrets has no impact
- chunk_size configuration change has no impact

Investigation Date: January 22, 2026
Investigator: Patrick Poulin
Status: Unresolved - Escalate to HashiCorp