Commit 8b1745f

committed
Legal Docs
1 parent 021f724 commit 8b1745f

2 files changed

Lines changed: 112 additions & 84 deletions

docs/BACKGROUND_OF_INVENTION.md

Lines changed: 28 additions & 14 deletions
@@ -6,57 +6,71 @@
66

77
## Field
88

9-
The present disclosure relates generally to **data loss prevention**, **monitoring of cleartext associated with outbound TLS operations on an endpoint**, and **policy-driven pattern matching** under resource and safety constraints. Particular embodiments concern **Linux** hosts using **extended Berkeley Packet Filter (eBPF)** facilities, **uprobe**-based attachment to **dynamic-linker–visible TLS library routines** (e.g., OpenSSL **`SSL_write`**), coordination between **kernel-side** capture or enforcement and **userspace** policy evaluation, and **consistent policy semantics** when similar inspection is performed at an **HTTP edge** after TLS termination.
9+
The present disclosure relates generally to **data loss prevention**, **monitoring of cleartext associated with outbound protected-channel operations on an endpoint**, and **policy-driven pattern matching** under resource and safety constraints.
10+
11+
Representative embodiments include:
12+
13+
- **Linux** hosts using **extended Berkeley Packet Filter (eBPF)** facilities, **uprobe**-based attachment to **dynamic-linker–visible** TLS library routines (e.g., OpenSSL **`SSL_write`**), coordination between **kernel-side** capture or enforcement and **userspace** policy evaluation, and **optional** in-kernel buffer actions subject to **verifier** constraints.
14+
15+
- **Portable userspace policy pipelines** that can execute on **multiple operating-system families** with a **common** rolling-buffer aggregation, **multi-pattern** matching (e.g., **Aho–Corasick**), deduplication, rate limiting, and structured alerting—so that **host** inspection and **HTTP-edge** inspection (after TLS termination elsewhere) can share **semantics** even when **kernel** capture mechanisms differ or are absent.
16+
17+
- **Complementary** enforcement strategies where **cross-process memory rewrite** is restricted (e.g., by vendor security policy or platform design), such as **network-path** or **socket-level** interruption in lieu of overwriting another process’s outbound buffer.
1018

1119
---
1220

1321
## Limitations of conventional approaches
1422

1523
**Network-centric inspection.** TLS payloads are typically **opaque** on the wire. Appliances that decrypt traffic often require **traffic to traverse** a designated proxy or key escrow architecture. Such designs may **not observe** flows that bypass the appliance, may **not cover** all host-originated egress, and may introduce **operational dependencies** (availability, latency, key management).
1624

17-
**Endpoint inspection without cryptographic boundary fidelity.** Instrumentation that applies only at the **application** layer may miss cleartext produced through **shared libraries** (e.g., `libssl.so`) when those paths are not uniformly wrapped. Approaches that depend on a **single** TLS API may fail to generalize across **dynamically linked** versus **statically linked** stacks.
25+
**Endpoint inspection without cryptographic boundary fidelity.** Instrumentation that applies only at the **application** layer may miss cleartext produced through **shared libraries** (e.g., `libssl.so` on Linux) when those paths are not uniformly wrapped. Approaches that depend on a **single** TLS API may fail to generalize across **dynamically linked** versus **statically linked** stacks.
1826

19-
**Fragmented cleartext.** Cleartext for a single logical message may be emitted through **multiple** write calls into a TLS stack. Inspection that operates **independently on each call** with **no cross-call context** can fail to detect sensitive substrings **split across chunk boundaries**—a situation that can arise with **small write sizes** or **multiplexed protocols**.
27+
**Fragmented cleartext.** Cleartext for a single logical message may be emitted through **multiple** write calls into a TLS stack. Inspection that operates **independently on each call** with **no cross-call context** can fail to detect sensitive substrings **split across chunk boundaries**—a situation that can arise with **small write sizes** or **multiplexed protocols** (e.g., HTTP/2).
2028

21-
**Kernel versus userspace policy richness.** **Kernel** programs are subject to **verifier** and **instruction-budget** constraints that limit arbitrary string matching and rule complexity. **Userspace** can host richer matchers (e.g., multi-pattern automata) but cannot, by itself, observe or modify **user buffers** in another process without a **kernel-assisted** mechanism. A **naive** split—full policy in userspace with **no** aligned kernel path—may complicate **optional** enforcement that must act **before** ciphertext is emitted.
29+
**Kernel versus userspace policy richness.** On platforms that support **eBPF**, **kernel** programs are subject to **verifier** and **instruction-budget** constraints that limit arbitrary string matching and rule complexity. **Userspace** can host richer matchers but cannot, by itself, observe or modify **user buffers** in another process without a **kernel-assisted** mechanism. A **naive** split—full policy in userspace with **no** aligned kernel path—may complicate **optional** enforcement that must act **before** ciphertext is emitted.
2230

23-
**Heterogeneous deployment.** Organizations may terminate TLS at a **load balancer** or **ingress** while still requiring **host-level** assurance. Maintaining **parallel, divergent** policy engines increases **drift** and operational cost.
31+
**OS-specific enforcement and capture.** **Linux** may expose **uprobes** and **ring buffers** suitable for surgical observation of **`SSL_write`**. Other general-purpose operating systems may **not** offer the same **eBPF** attachment model; **endpoint security frameworks**, **filtering platforms**, or **application-layer** inspection may be required instead—while operators still desire **one policy document** and **one matching semantics** across deployment targets.
32+
33+
**Heterogeneous deployment.** Organizations may terminate TLS at a **load balancer** or **ingress** while still requiring **host-level** assurance. They may also run **mixed** server and desktop estates. Maintaining **parallel, divergent** policy engines increases **drift** and operational cost.
2434

2535
---
2636

2737
## Technical challenges addressed by representative embodiments (code-informed)
2838

2939
The following challenges are motivated by **design tensions** reflected in a **host sensor** and **HTTP edge** architecture such as that in this repository. They are stated as **technical problems**, not as admissions of novelty or non-obviousness.
3040

31-
### 1. Discovering attach points for dynamically linked TLS stacks
41+
### 1. Discovering attach points for dynamically linked TLS stacks (Linux)
3242

33-
On Linux, processes that map **`libssl.so`** expose library paths via **`/proc/<pid>/maps`**. A **practical** attach strategy may **scan** process listings under a configurable **proc root**, resolve **`SSL_write`** (or equivalent) symbols, and attach **uprobes** **once per shared library inode** to avoid redundant probes as processes are created or replaced. A **periodic** discovery pass may **attach** to newly loaded libraries and **prune** state for processes that no longer exist.
43+
On Linux, processes that map **`libssl.so`** expose library paths via **`/proc/<pid>/maps`** (or an equivalent **proc root** in containerized layouts). A **practical** attach strategy may **scan** process listings under a configurable **proc root**, resolve **`SSL_write`** (or equivalent) symbols, and attach **uprobes** **once per shared library inode** to avoid redundant probes as processes are created or replaced. A **periodic** discovery pass may **attach** to newly loaded libraries and **prune** rolling-buffer state for processes that no longer exist.
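
The discovery pass described above can be sketched as follows. This is a minimal illustration, not the repository's `discover_linux.go`: the names `libKey`, `findLibssl`, and `attachSet` are hypothetical, the device and inode columns of the maps text are used as the deduplication key, and pathnames containing spaces are not handled. The actual uprobe attachment is omitted.

```go
package main

import (
	"bufio"
	"fmt"
	"path"
	"strings"
)

// libKey identifies a mapped file by the device and inode columns
// printed in /proc/<pid>/maps (hypothetical type for this sketch).
type libKey struct {
	Dev   string
	Inode string
}

// findLibssl scans maps-formatted text and returns one path per
// distinct (device, inode) whose file name starts with "libssl.so".
func findLibssl(maps string) map[libKey]string {
	found := make(map[libKey]string)
	sc := bufio.NewScanner(strings.NewReader(maps))
	for sc.Scan() {
		// Columns: address perms offset dev inode pathname
		f := strings.Fields(sc.Text())
		if len(f) < 6 {
			continue // anonymous mapping, no pathname
		}
		if !strings.HasPrefix(path.Base(f[5]), "libssl.so") {
			continue
		}
		found[libKey{Dev: f[3], Inode: f[4]}] = f[5]
	}
	return found
}

// attachSet remembers which libraries already carry a probe.
type attachSet map[libKey]bool

// needsAttach reports true only the first time a key is seen.
func (s attachSet) needsAttach(k libKey) bool {
	if s[k] {
		return false
	}
	s[k] = true
	return true
}

func main() {
	// Two processes mapping the same library file (same dev and inode).
	perProcess := []string{
		"7f00-7f10 r-xp 00000000 08:01 42 /usr/lib/libssl.so.3\n",
		"7a00-7a10 r-xp 00000000 08:01 42 /usr/lib/libssl.so.3\n",
	}
	attached := attachSet{}
	for _, maps := range perProcess {
		for k, p := range findLibssl(maps) {
			if attached.needsAttach(k) {
				fmt.Println("would attach uprobe to", p)
			}
		}
	}
}
```

In a real sensor the maps text would be read from the configured proc root for each PID, and the attach step would resolve `SSL_write` in the library before installing the probe.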
3444

3545
### 2. Bounded kernel capture with userspace reassembly for full-policy matching
3646

37-
Kernel-side tracing may copy a **bounded** number of bytes per call from the **`SSL_write`** user buffer (e.g., on the order of **hundreds** of bytes per event) into an **eBPF ring buffer** for delivery to userspace, while tracking **metadata** such as process identifier and short command name. Because a **single** event may not contain an entire sensitive token, a **userspace** component may maintain a **rolling buffer** of recent cleartext **per process** (e.g., capped at a few **kilobytes**), **append** successive chunks, and run a **multi-pattern** matcher (e.g., **Aho–Corasick**) over the **aggregated** buffer so that patterns **spanning chunk boundaries** remain detectable.
47+
Kernel-side tracing may copy a **bounded** number of bytes per call from the **`SSL_write`** user buffer (e.g., on the order of **hundreds** of bytes per event) into an **eBPF ring buffer** for delivery to userspace, while tracking **metadata** such as process identifier and short command name. Because a **single** event may not contain an entire sensitive token, a **userspace** component may maintain a **rolling buffer** of recent cleartext **per process** (e.g., capped at a few **kilobytes**), **append** successive chunks, and run a **multi-pattern** matcher over the **aggregated** buffer so that patterns **spanning chunk boundaries** remain detectable.
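
A minimal sketch of the rolling-buffer idea follows. The type `rollBuf` and the function `scan` are hypothetical names, the 16-byte cap is chosen only to keep the example small, and `bytes.Contains` stands in for a multi-pattern automaton; it is a plain substring search, not Aho–Corasick.

```go
package main

import (
	"bytes"
	"fmt"
)

// rollBuf keeps at most max trailing bytes of recent cleartext
// for one process (hypothetical type for this sketch).
type rollBuf struct {
	max  int
	data []byte
}

// appendChunk adds a chunk and trims the oldest bytes past the cap.
func (r *rollBuf) appendChunk(chunk []byte) {
	r.data = append(r.data, chunk...)
	if over := len(r.data) - r.max; over > 0 {
		// Copy so the old backing array can be released.
		r.data = append([]byte(nil), r.data[over:]...)
	}
}

// scan returns the indexes of patterns found in buf.
// Stand-in for a multi-pattern matcher: one substring search per pattern.
func scan(buf []byte, patterns [][]byte) []int {
	var hits []int
	for i, p := range patterns {
		if bytes.Contains(buf, p) {
			hits = append(hits, i)
		}
	}
	return hits
}

func main() {
	patterns := [][]byte{[]byte("TOKEN-1234")}
	chunks := [][]byte{[]byte("id=TOKEN"), []byte("-1234&x=1")}

	bufs := map[uint32]*rollBuf{} // keyed by process identifier
	const pid = 4242
	bufs[pid] = &rollBuf{max: 16}

	for _, c := range chunks {
		perCall := scan(c, patterns) // sees only this chunk
		bufs[pid].appendChunk(c)
		rolling := scan(bufs[pid].data, patterns) // sees recent history
		fmt.Printf("chunk=%q perCall=%v rolling=%v\n", c, perCall, rolling)
	}
}
```

Neither chunk contains the pattern on its own, so the per-call scan reports nothing, while the scan over the aggregated buffer matches once the second chunk arrives. The cap must be at least as long as the longest pattern for boundary-spanning matches to survive trimming.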
3848

3949
### 3. Ring-buffer backpressure and observability
4050

4151
When the ring buffer cannot reserve space for an event, the system may increment a **per-CPU** or aggregate **drop** counter and expose the condition via **metrics** so operators can **size** buffers or reduce load. This addresses **lossy** capture under burst traffic.
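
The counting behavior can be illustrated with a userspace analog. This sketch does not use eBPF: a bounded Go channel plays the role of the ring buffer, and the names `boundedQueue` and `offer` are hypothetical. In a kernel program the equivalent condition would be a failed ring-buffer reservation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// event is a captured chunk plus minimal metadata.
type event struct {
	PID  uint32
	Data []byte
}

// boundedQueue is a fixed-capacity queue that counts rejected events
// instead of blocking the producer (analog of a full ring buffer).
type boundedQueue struct {
	ch    chan event
	drops atomic.Uint64
}

func newBoundedQueue(capacity int) *boundedQueue {
	return &boundedQueue{ch: make(chan event, capacity)}
}

// offer enqueues without blocking; on a full queue it increments
// the drop counter and reports false.
func (q *boundedQueue) offer(e event) bool {
	select {
	case q.ch <- e:
		return true
	default:
		q.drops.Add(1)
		return false
	}
}

func main() {
	q := newBoundedQueue(2)
	for i := 0; i < 5; i++ {
		q.offer(event{PID: 1, Data: []byte("x")})
	}
	// The counter is what a metrics endpoint would export.
	fmt.Println("queued:", len(q.ch), "dropped:", q.drops.Load())
}
```

Exporting the drop count lets an operator distinguish "no sensitive data seen" from "events were lost before inspection".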
4252

43-
### 4. Dual-path enforcement: full policy in userspace, constrained matcher in-kernel for optional blocking
53+
### 4. Dual-path enforcement: full policy in userspace, constrained matcher in-kernel for optional blocking (Linux)
4454

4555
**Full** policy documents may include **many** rules with **variable** pattern lengths and severities. A **kernel** program may support only a **subset** of **block**-eligible rules—e.g., a **small** maximum number of patterns and a **short** maximum pattern length—so that **substring search** and **buffer modification** remain within **verifier** limits. **Userspace** may evaluate **complete** policy (including **observe** rules) on the **rolling buffer**, while **kernel** maps may be **synchronized** from a selected subset of **block** rules when **kernel-side blocking** is enabled. **Optional** enforcement may overwrite bytes in the **user** outbound buffer (e.g., via **`bpf_probe_write_user`**) when a **short** in-kernel pattern match succeeds, with **iteration** bounded to satisfy the verifier.
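
One way to derive the kernel-eligible subset is sketched below. The `rule` type, the `kernelSubset` function, and the specific limits are assumptions made for illustration; the source states only that the kernel supports a small number of short block patterns. Writing the selected patterns into eBPF maps is not shown.

```go
package main

import "fmt"

// rule is a simplified policy rule (hypothetical shape).
type rule struct {
	ID      string
	Pattern string
	Action  string // "block" or "observe"
}

// kernelSubset picks block rules that fit the in-kernel limits.
// Block rules that do not fit are returned separately so the caller
// can report that they are evaluated in userspace only.
func kernelSubset(rules []rule, maxPatterns, maxLen int) (selected, userspaceOnly []rule) {
	for _, r := range rules {
		if r.Action != "block" {
			continue // observe rules never go to the kernel
		}
		n := len(r.Pattern)
		if n == 0 || n > maxLen || len(selected) >= maxPatterns {
			userspaceOnly = append(userspaceOnly, r)
			continue
		}
		selected = append(selected, r)
	}
	return selected, userspaceOnly
}

func main() {
	rules := []rule{
		{ID: "r1", Pattern: "TOKEN-", Action: "block"},
		{ID: "r2", Pattern: "internal-only", Action: "observe"},
		{ID: "r3", Pattern: "a-very-long-pattern-that-exceeds-the-limit", Action: "block"},
		{ID: "r4", Pattern: "SECRET", Action: "block"},
	}
	sel, rest := kernelSubset(rules, 1, 16)
	fmt.Println("kernel:", sel)
	fmt.Println("userspace only:", rest)
}
```

Userspace would still evaluate every rule, including the ones listed as userspace only, against the rolling buffer.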
4656

4757
### 5. Policy reload and synchronization
4858

49-
When policy is **reloaded** at runtime (e.g., via **signal**), **userspace** may **atomically** swap an immutable matcher while **re-syncing** kernel **block** maps so that **observe** behavior and **optional** kernel enforcement **track** the same policy revision where applicable.
59+
When policy is **reloaded** at runtime (e.g., via **signal** on platforms that support it), **userspace** may **atomically** swap an immutable matcher while **re-syncing** kernel **block** maps (where applicable) so that **observe** behavior and **optional** kernel enforcement **track** the same policy revision where supported.
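
A minimal sketch of such a reload is shown below, assuming Go 1.19 or later for `atomic.Pointer`. The `matcher` and `engine` types and the `syncKernel` callback are hypothetical, and the ordering (sync kernel maps first, swap the matcher only on success) is one possible design choice rather than something the source specifies. Signal handling is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// matcher is an immutable compiled policy (hypothetical shape).
type matcher struct {
	Revision int
	Patterns []string
}

// engine holds the current matcher behind an atomic pointer so
// readers never observe a partially updated policy.
type engine struct {
	cur atomic.Pointer[matcher]
}

var errSyncFailed = errors.New("kernel map sync failed")

// reload pushes block rules to the kernel via syncKernel, then swaps
// the userspace matcher. If the sync fails, the old matcher stays.
func (e *engine) reload(m *matcher, syncKernel func(*matcher) error) error {
	if err := syncKernel(m); err != nil {
		return err
	}
	e.cur.Store(m)
	return nil
}

func main() {
	e := &engine{}
	ok := func(*matcher) error { return nil }
	bad := func(*matcher) error { return errSyncFailed }

	_ = e.reload(&matcher{Revision: 1, Patterns: []string{"TOKEN-"}}, ok)
	err := e.reload(&matcher{Revision: 2}, bad)
	fmt.Println("active revision:", e.cur.Load().Revision, "last error:", err)
}
```

On platforms without a kernel path, `syncKernel` can be a function that always returns nil, which keeps the reload code identical across targets.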
60+
61+
### 6. Shared userspace pipeline across host targets
62+
63+
A **single** userspace module may implement **chunk ingestion** (`HandleChunk` or equivalent), rolling-buffer updates, policy **Scan**, allowlisting, deduplication, alert rate limits, redacted logging, and **enforcement-mode** metadata—whether cleartext chunks originate from a **Linux** ring buffer or from **future** platform capture (e.g., filtered socket path, endpoint-security write events). **Process-liveness** pruning for rolling state may use **OS-appropriate** PID enumeration (e.g., **procfs**-style directories, **process snapshot**, or **skip** when neither is available).
5064

51-
### 6. Edge path with shared policy semantics
65+
### 7. Edge path with shared policy semantics
5266

53-
Where TLS is **terminated** upstream, an **HTTP** service may buffer **request bodies** (subject to a **maximum** size), apply the **same** policy engine and **alert** semantics as the host sensor for **cleartext** bodies, and optionally **reverse-proxy** to an origin. This **reduces** policy **divergence** between **host-observed** TLS writes and **ingress-observed** HTTP content.
67+
Where TLS is **terminated** upstream, an **HTTP** service may buffer **request bodies** (subject to a **maximum** size), apply the **same** policy engine and **alert** semantics as the host sensor for **cleartext** bodies, and optionally **reverse-proxy** to an origin. This **reduces** policy **divergence** between **host-observed** TLS-related cleartext and **ingress-observed** HTTP content.
5468

5569
---
5670

5771
## Need for improvement
5872

59-
Accordingly, there exists a need for **improved** methods and systems that combine **kernel-assisted** capture (and **optional** constrained enforcement) with **userspace** **multi-chunk** policy evaluation; **efficient** attachment to **dynamic** TLS libraries on **Linux**; **observable** handling of **capture backpressure**; and **consistent** policy handling across **host** and **edge** inspection paths.
73+
Accordingly, there exists a need for **improved** methods and systems that combine, where available, **kernel-assisted** capture (and **optional** constrained enforcement) with **userspace** **multi-chunk** policy evaluation; **efficient** attachment to **dynamic** TLS libraries on **Linux**; **observable** handling of **capture backpressure**; **consistent** policy handling across **host** and **edge** inspection paths; and **portable** policy-evaluation building blocks that can be **fed** by **platform-appropriate** capture layers on **non-Linux** endpoints without forcing a **duplicate** rule engine.
6074

6175
---
6276

@@ -65,4 +79,4 @@ Accordingly, there exists a need for **improved** methods and systems that combi
6579
- Replace generic “related art” discussion with **citations** after search.
6680
- Align **Background** language with **claims** so as **not** to disclaim subject matter unintentionally.
6781
- Cross-reference **[`INVENTION_DISCLOSURE_OUTLINE_US.md`](INVENTION_DISCLOSURE_OUTLINE_US.md)** for detailed embodiment lists and **figures**.
68-
- **Repository mapping (non-exhaustive):** `main_linux.go`, `discover_linux.go`, `main_windows.go`, `main_darwin.go`, `cli.go`, `internal/sensorcore/`, `bpf/scrubber.c`, `block_bpf.go`, `bpf_load.go`, `internal/policyengine/`, `internal/rollbuf/`, `internal/policy/`, `cmd/spectral-edge/`.
82+
- **Repository mapping (non-exhaustive):** `main_linux.go`, `main_windows.go`, `main_darwin.go`, `discover_linux.go`, `cli.go`, `spectral_bpf_generate.go`, `bpf_load.go`, `block_bpf.go`, `bpf/scrubber.c`, `internal/sensorcore/`, `internal/policyengine/`, `internal/rollbuf/` (including OS-specific **PID** listing for prune), `internal/policy/`, `internal/allowlist/`, `internal/dedupe/`, `metrics.go`, `metrics_other.go`, `version.go`, `cmd/spectral-edge/`.
