docs/BACKGROUND_OF_INVENTION.md
Lines changed: 28 additions & 14 deletions
@@ -6,57 +6,71 @@
6
6
7
7
## Field
8
8
9
-
The present disclosure relates generally to **data loss prevention**, **monitoring of cleartext associated with outbound TLS operations on an endpoint**, and **policy-driven pattern matching** under resource and safety constraints. Particular embodiments concern **Linux** hosts using **extended Berkeley Packet Filter (eBPF)** facilities, **uprobe**-based attachment to **dynamic-linker–visible TLS library routines** (e.g., OpenSSL **`SSL_write`**), coordination between **kernel-side** capture or enforcement and **userspace** policy evaluation, and **consistent policy semantics** when similar inspection is performed at an **HTTP edge** after TLS termination.
9
+
The present disclosure relates generally to **data loss prevention**, **monitoring of cleartext associated with outbound protected-channel operations on an endpoint**, and **policy-driven pattern matching** under resource and safety constraints.
10
+
11
+
Representative embodiments include:
12
+
13
+
- **Linux** hosts using **extended Berkeley Packet Filter (eBPF)** facilities, **uprobe**-based attachment to **dynamic-linker–visible** TLS library routines (e.g., OpenSSL **`SSL_write`**), coordination between **kernel-side** capture or enforcement and **userspace** policy evaluation, and **optional** in-kernel buffer actions subject to **verifier** constraints.
14
+
15
+
- **Portable userspace policy pipelines** that can execute on **multiple operating-system families** with a **common** rolling-buffer aggregation, **multi-pattern** matching (e.g., **Aho–Corasick**), deduplication, rate limiting, and structured alerting, so that **host** inspection and **HTTP-edge** inspection (after TLS termination elsewhere) can share **semantics** even when **kernel** capture mechanisms differ or are absent.
16
+
17
+
- **Complementary** enforcement strategies where **cross-process memory rewrite** is restricted (e.g., by vendor security policy or platform design), such as **network-path** or **socket-level** interruption in lieu of overwriting another process’s outbound buffer.
10
18
11
19
---
12
20
13
21
## Limitations of conventional approaches
14
22
15
23
**Network-centric inspection.** TLS payloads are typically **opaque** on the wire. Appliances that decrypt traffic often require **traffic to traverse** a designated proxy or key escrow architecture. Such designs may **not observe** flows that bypass the appliance, may **not cover** all host-originated egress, and may introduce **operational dependencies** (availability, latency, key management).
16
24
17
-
**Endpoint inspection without cryptographic boundary fidelity.** Instrumentation that applies only at the **application** layer may miss cleartext produced through **shared libraries** (e.g., `libssl.so`) when those paths are not uniformly wrapped. Approaches that depend on a **single** TLS API may fail to generalize across **dynamically linked** versus **statically linked** stacks.
25
+
**Endpoint inspection without cryptographic boundary fidelity.** Instrumentation that applies only at the **application** layer may miss cleartext produced through **shared libraries** (e.g., `libssl.so` on Linux) when those paths are not uniformly wrapped. Approaches that depend on a **single** TLS API may fail to generalize across **dynamically linked** versus **statically linked** stacks.
18
26
19
-
**Fragmented cleartext.** Cleartext for a single logical message may be emitted through **multiple** write calls into a TLS stack. Inspection that operates **independently on each call** with **no cross-call context** can fail to detect sensitive substrings **split across chunk boundaries**—a situation that can arise with **small write sizes** or **multiplexed protocols**.
27
+
**Fragmented cleartext.** Cleartext for a single logical message may be emitted through **multiple** write calls into a TLS stack. Inspection that operates **independently on each call** with **no cross-call context** can fail to detect sensitive substrings **split across chunk boundaries**—a situation that can arise with **small write sizes** or **multiplexed protocols** (e.g., HTTP/2).
20
28
21
-
**Kernel versus userspace policy richness.** **Kernel** programs are subject to **verifier** and **instruction-budget** constraints that limit arbitrary string matching and rule complexity. **Userspace** can host richer matchers (e.g., multi-pattern automata) but cannot, by itself, observe or modify **user buffers** in another process without a **kernel-assisted** mechanism. A **naive** split (full policy in userspace with **no** aligned kernel path) may complicate **optional** enforcement that must act **before** ciphertext is emitted.
29
+
**Kernel versus userspace policy richness.** On platforms that support **eBPF**, **kernel** programs are subject to **verifier** and **instruction-budget** constraints that limit arbitrary string matching and rule complexity. **Userspace** can host richer matchers but cannot, by itself, observe or modify **user buffers** in another process without a **kernel-assisted** mechanism. A **naive** split (full policy in userspace with **no** aligned kernel path) may complicate **optional** enforcement that must act **before** ciphertext is emitted.
22
30
23
-
**Heterogeneous deployment.** Organizations may terminate TLS at a **load balancer** or **ingress** while still requiring **host-level** assurance. Maintaining **parallel, divergent** policy engines increases **drift** and operational cost.
31
+
**OS-specific enforcement and capture.** **Linux** may expose **uprobes** and **ring buffers** suitable for surgical observation of **`SSL_write`**. Other general-purpose operating systems may **not** offer the same **eBPF** attachment model; **endpoint security frameworks**, **filtering platforms**, or **application-layer** inspection may be required instead, while operators still desire **one policy document** and **one matching semantics** across deployment targets.
32
+
33
+
**Heterogeneous deployment.** Organizations may terminate TLS at a **load balancer** or **ingress** while still requiring **host-level** assurance. They may also run **mixed** server and desktop estates. Maintaining **parallel, divergent** policy engines increases **drift** and operational cost.
24
34
25
35
---
26
36
27
37
## Technical challenges addressed by representative embodiments (code-informed)
28
38
29
39
The following challenges are motivated by **design tensions** reflected in a **host sensor** and **HTTP edge** architecture such as that in this repository. They are stated as **technical problems**, not as admissions of novelty or non-obviousness.
30
40
31
-
### 1. Discovering attach points for dynamically linked TLS stacks
On Linux, processes that map **`libssl.so`** expose library paths via **`/proc/<pid>/maps`**. A **practical** attach strategy may **scan** process listings under a configurable **proc root**, resolve **`SSL_write`** (or equivalent) symbols, and attach **uprobes** **once per shared library inode** to avoid redundant probes as processes are created or replaced. A **periodic** discovery pass may **attach** to newly loaded libraries and **prune** state for processes that no longer exist.
43
+
On Linux, processes that map **`libssl.so`** expose library paths via **`/proc/<pid>/maps`** (or an equivalent **proc root** in containerized layouts). A **practical** attach strategy may **scan** process listings under a configurable **proc root**, resolve **`SSL_write`** (or equivalent) symbols, and attach **uprobes** **once per shared library inode** to avoid redundant probes as processes are created or replaced. A **periodic** discovery pass may **attach** to newly loaded libraries and **prune** rolling-buffer state for processes that no longer exist.
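
The discovery step can be illustrated with a minimal Python sketch. It assumes the standard `/proc/<pid>/maps` column layout (address range, permissions, offset, device, inode, pathname); the function name `libssl_targets` and the `seen` set are hypothetical and are not taken from this repository. The actual uprobe attachment is out of scope here; the sketch only shows how one attach target per `(device, inode)` pair might be selected.

```python
import re

# One /proc/<pid>/maps line: address perms offset dev inode pathname
_MAPS_LINE = re.compile(
    r"^\S+\s+\S+\s+\S+\s+(?P<dev>\S+)\s+(?P<inode>\d+)\s+(?P<path>/.*)$"
)


def libssl_targets(maps_text, seen):
    """Return (dev, inode, path) for libssl mappings not yet recorded in `seen`.

    `seen` is a set of (dev, inode) keys shared across discovery passes, so a
    library mapped by many processes yields a single attach target.
    """
    new_targets = []
    for line in maps_text.splitlines():
        m = _MAPS_LINE.match(line)
        if not m:
            continue  # anonymous mappings, [stack], [heap], and similar
        path = m.group("path")
        if not path.rsplit("/", 1)[-1].startswith("libssl.so"):
            continue
        key = (m.group("dev"), int(m.group("inode")))
        if key in seen:
            continue  # already handled for this library inode
        seen.add(key)
        new_targets.append((key[0], key[1], path))
    return new_targets
```

A periodic pass would call this once per live PID with the same `seen` set, then resolve `SSL_write` in each returned path before attaching.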
34
44
35
45
### 2. Bounded kernel capture with userspace reassembly for full-policy matching
36
46
37
-
Kernel-side tracing may copy a **bounded** number of bytes per call from the **`SSL_write`** user buffer (e.g., on the order of **hundreds** of bytes per event) into an **eBPF ring buffer** for delivery to userspace, while tracking **metadata** such as process identifier and short command name. Because a **single** event may not contain an entire sensitive token, a **userspace** component may maintain a **rolling buffer** of recent cleartext **per process** (e.g., capped at a few **kilobytes**), **append** successive chunks, and run a **multi-pattern** matcher (e.g., **Aho–Corasick**) over the **aggregated** buffer so that patterns **spanning chunk boundaries** remain detectable.
47
+
Kernel-side tracing may copy a **bounded** number of bytes per call from the **`SSL_write`** user buffer (e.g., on the order of **hundreds** of bytes per event) into an **eBPF ring buffer** for delivery to userspace, while tracking **metadata** such as process identifier and short command name. Because a **single** event may not contain an entire sensitive token, a **userspace** component may maintain a **rolling buffer** of recent cleartext **per process** (e.g., capped at a few **kilobytes**), **append** successive chunks, and run a **multi-pattern** matcher over the **aggregated** buffer so that patterns **spanning chunk boundaries** remain detectable.
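
To make the reassembly idea concrete, the following Python sketch pairs a compact Aho–Corasick automaton with a per-process rolling buffer. It is illustrative only: the class names, the `handle_chunk` method, and the 4096-byte cap are assumptions chosen for the example, not values or identifiers from this repository.

```python
from collections import deque


class AhoCorasick:
    """Minimal multi-pattern matcher over bytes."""

    def __init__(self, patterns):
        self.goto = [{}]   # per-node transitions: byte value -> node index
        self.fail = [0]    # failure links
        self.out = [[]]    # patterns that end at each node
        for p in patterns:
            node = 0
            for b in p:
                nxt = self.goto[node].get(b)
                if nxt is None:
                    nxt = len(self.goto)
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.goto[node][b] = nxt
                node = nxt
            self.out[node].append(p)
        # Breadth-first pass so shallower failure links are final first.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for b, child in self.goto[node].items():
                f = self.fail[node]
                while f and b not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(b, 0)
                self.out[child] = self.out[child] + self.out[self.fail[child]]
                queue.append(child)

    def scan(self, data):
        """Return (start_offset, pattern) for every match in data."""
        hits, node = [], 0
        for i, b in enumerate(data):
            while node and b not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(b, 0)
            for p in self.out[node]:
                hits.append((i - len(p) + 1, p))
        return hits


class RollingBuffers:
    """Per-PID buffer of recent cleartext, capped at `cap` bytes."""

    def __init__(self, matcher, cap=4096):
        self.matcher = matcher
        self.cap = cap
        self.buffers = {}

    def handle_chunk(self, pid, chunk):
        buf = self.buffers.setdefault(pid, bytearray())
        buf += chunk
        if len(buf) > self.cap:
            del buf[: len(buf) - self.cap]  # keep only the newest bytes
        return [p for _, p in self.matcher.scan(buf)]
```

Because the whole aggregated buffer is rescanned on each chunk, the same match can be reported more than once; a later deduplication stage is assumed to absorb the repeats.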
38
48
39
49
### 3. Ring-buffer backpressure and observability
40
50
41
51
When the ring buffer cannot reserve space for an event, the system may increment a **per-CPU** or aggregate **drop** counter and expose the condition via **metrics** so operators can **size** buffers or reduce load. This addresses **lossy** capture under burst traffic.
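
The drop-accounting behavior can be modeled in userspace with a short Python sketch. This is an analogy under stated assumptions, not eBPF code: the class `RingBufferModel`, its methods, and the metric names are hypothetical, and a real kernel program would increment a per-CPU map entry when the reserve call fails.

```python
from collections import deque


class RingBufferModel:
    """Userspace model of a byte-bounded ring buffer with per-CPU drop counters."""

    def __init__(self, capacity_bytes, num_cpus):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.events = deque()
        self.drops_per_cpu = [0] * num_cpus

    def try_submit(self, cpu, payload):
        """Queue payload if space remains; otherwise count a drop for this CPU."""
        if self.used_bytes + len(payload) > self.capacity_bytes:
            self.drops_per_cpu[cpu] += 1
            return False
        self.events.append(payload)
        self.used_bytes += len(payload)
        return True

    def consume(self):
        """Remove the oldest event and release its space."""
        payload = self.events.popleft()
        self.used_bytes -= len(payload)
        return payload

    def metrics(self):
        """Aggregate view that an exporter could publish to operators."""
        return {
            "ringbuf_drops_total": sum(self.drops_per_cpu),
            "ringbuf_used_bytes": self.used_bytes,
        }
```

A rising `ringbuf_drops_total` under burst traffic is the signal that the buffer is undersized or the consumer is too slow.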
42
52
43
-
### 4. Dual-path enforcement: full policy in userspace, constrained matcher in-kernel for optional blocking
53
+
### 4. Dual-path enforcement: full policy in userspace, constrained matcher in-kernel for optional blocking (Linux)
44
54
45
55
**Full** policy documents may include **many** rules with **variable** pattern lengths and severities. A **kernel** program may support only a **subset** of **block**-eligible rules—e.g., a **small** maximum number of patterns and a **short** maximum pattern length—so that **substring search** and **buffer modification** remain within **verifier** limits. **Userspace** may evaluate **complete** policy (including **observe** rules) on the **rolling buffer**, while **kernel** maps may be **synchronized** from a selected subset of **block** rules when **kernel-side blocking** is enabled. **Optional** enforcement may overwrite bytes in the **user** outbound buffer (e.g., via **`bpf_probe_write_user`**) when a **short** in-kernel pattern match succeeds, with **iteration** bounded to satisfy the verifier.
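
As a rough illustration of the dual-path split, the Python sketch below selects the block-eligible subset that a constrained kernel matcher might accept and then applies a bounded search-and-overwrite to a buffer. The limits (8 patterns, 16 bytes per pattern, 256 bytes scanned), the `Rule` type, and both function names are assumptions made for the example; the in-kernel version would be written against verifier constraints rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: bytes
    action: str  # "observe" or "block"


def select_kernel_block_rules(rules, max_patterns=8, max_pattern_len=16):
    """Pick block rules short enough, and few enough, for the kernel maps."""
    eligible = [
        r for r in rules
        if r.action == "block" and 0 < len(r.pattern) <= max_pattern_len
    ]
    return eligible[:max_patterns]


def mask_matches(buf, kernel_rules, max_scan=256, fill=ord("*")):
    """Overwrite matches that lie within the first max_scan bytes of a bytearray.

    Both loops have fixed upper bounds, mirroring the bounded iteration a
    verifier would require. Returns the number of matches overwritten.
    """
    masked = 0
    limit = min(len(buf), max_scan)
    for rule in kernel_rules:
        n = len(rule.pattern)
        for start in range(0, limit - n + 1):
            if buf[start:start + n] == rule.pattern:
                buf[start:start + n] = bytes([fill]) * n
                masked += 1
    return masked
```

Rules that fail the size limits are not lost: userspace is still assumed to evaluate the complete policy on the rolling buffer and raise alerts for them.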
46
56
47
57
### 5. Policy reload and synchronization
48
58
49
-
When policy is **reloaded** at runtime (e.g., via **signal**), **userspace** may **atomically** swap an immutable matcher while **re-syncing** kernel **block** maps so that **observe** behavior and **optional** kernel enforcement **track** the same policy revision where applicable.
59
+
When policy is **reloaded** at runtime (e.g., via **signal** on platforms that support it), **userspace** may **atomically** swap an immutable matcher while **re-syncing** kernel **block** maps (where applicable) so that **observe** behavior and **optional** kernel enforcement **track** the same policy revision where supported.
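
One way to picture the reload step is an immutable snapshot that readers obtain through a single reference, with writers serialized by a lock. The Python sketch below assumes a hypothetical `sync_kernel` callback standing in for the kernel map update; `PolicySnapshot`, `PolicyHolder`, and the revision counter are illustrative names, not identifiers from this repository.

```python
import threading
from typing import Callable, NamedTuple, Tuple


class PolicySnapshot(NamedTuple):
    revision: int
    patterns: Tuple[bytes, ...]        # full policy, evaluated in userspace
    block_patterns: Tuple[bytes, ...]  # subset mirrored into kernel maps


class PolicyHolder:
    def __init__(self, initial: PolicySnapshot,
                 sync_kernel: Callable[[PolicySnapshot], None]):
        self._lock = threading.Lock()  # serializes reloads; readers do not lock
        self._sync_kernel = sync_kernel
        self._current = initial
        sync_kernel(initial)

    def current(self) -> PolicySnapshot:
        """Readers take one reference and use it for the whole scan."""
        return self._current

    def reload(self, patterns, block_patterns) -> PolicySnapshot:
        with self._lock:
            new = PolicySnapshot(
                self._current.revision + 1,
                tuple(patterns),
                tuple(block_patterns),
            )
            self._sync_kernel(new)  # push the block subset first
            self._current = new     # then publish with one reference swap
            return new
```

Whether the kernel maps are updated before or after the userspace swap is a design choice; this sketch updates them first so that a published revision never lacks its kernel counterpart.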
60
+
61
+
### 6. Shared userspace pipeline across host targets
62
+
63
+
A **single** userspace module may implement **chunk ingestion** (`HandleChunk` or equivalent), rolling-buffer updates, policy **Scan**, allowlisting, deduplication, alert rate limits, redacted logging, and **enforcement-mode** metadata—whether cleartext chunks originate from a **Linux** ring buffer or from **future** platform capture (e.g., filtered socket path, endpoint-security write events). **Process-liveness** pruning for rolling state may use **OS-appropriate** PID enumeration (e.g., **procfs**-style directories, **process snapshot**, or **skip** when neither is available).
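
The deduplication, rate-limiting, and redaction stages of such a pipeline might look like the following Python sketch. The class `AlertGate`, the `redact` helper, the 30-second deduplication window, and the fixed 60-second rate window are all assumptions for illustration; the clock is passed in explicitly so the logic is independent of how chunks were captured.

```python
class AlertGate:
    """Suppress repeated alerts per (pid, rule) and cap alerts per minute."""

    def __init__(self, dedup_seconds=30.0, max_per_minute=10):
        self.dedup_seconds = dedup_seconds
        self.max_per_minute = max_per_minute
        self._last_seen = {}       # (pid, rule) -> time of last emitted alert
        self._window_start = None  # start of the current 60-second window
        self._window_count = 0

    def allow(self, pid, rule, now):
        """Return True if an alert for (pid, rule) should be emitted at `now`."""
        key = (pid, rule)
        last = self._last_seen.get(key)
        if last is not None and now - last < self.dedup_seconds:
            return False  # duplicate within the deduplication window
        if self._window_start is None or now - self._window_start >= 60.0:
            self._window_start = now
            self._window_count = 0
        if self._window_count >= self.max_per_minute:
            return False  # rate limit reached for this window
        self._window_count += 1
        self._last_seen[key] = now
        return True


def redact(value, keep=2):
    """Keep the first `keep` bytes of a match and mask the rest for logging."""
    head = value[:keep].decode("ascii", "replace")
    return head + "*" * max(len(value) - keep, 0)
```

Because the gate takes only a PID, a rule name, and a timestamp, the same instance could sit behind a Linux ring-buffer reader or any other capture layer.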
50
64
51
-
### 6. Edge path with shared policy semantics
65
+
### 7. Edge path with shared policy semantics
52
66
53
-
Where TLS is **terminated** upstream, an **HTTP** service may buffer **request bodies** (subject to a **maximum** size), apply the **same** policy engine and **alert** semantics as the host sensor for **cleartext** bodies, and optionally **reverse-proxy** to an origin. This **reduces** policy **divergence** between **host-observed** TLS writes and **ingress-observed** HTTP content.
67
+
Where TLS is **terminated** upstream, an **HTTP** service may buffer **request bodies** (subject to a **maximum** size), apply the **same** policy engine and **alert** semantics as the host sensor for **cleartext** bodies, and optionally **reverse-proxy** to an origin. This **reduces** policy **divergence** between **host-observed** TLS-related cleartext and **ingress-observed** HTTP content.
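
The edge path can be sketched as a body reader with a size cap that hands the bytes to the same scan function the host sensor would use. In this Python sketch, `inspect_request_body`, the `scan` callable, and the 1 MiB default are hypothetical; reverse-proxying and HTTP framing are omitted.

```python
def inspect_request_body(stream, scan, max_body=1 << 20):
    """Read at most max_body bytes from a binary stream and scan them.

    Returns (matches, truncated). One extra byte is requested so that an
    oversized body can be detected without buffering all of it.
    """
    body = stream.read(max_body + 1)
    truncated = len(body) > max_body
    return scan(body[:max_body]), truncated
```

Passing the matcher in as `scan` is what keeps the semantics shared: the edge service and the host sensor would call the same policy engine on their respective cleartext.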
54
68
55
69
---
56
70
57
71
## Need for improvement
58
72
59
-
Accordingly, there exists a need for **improved** methods and systems that combine **kernel-assisted** capture (and **optional** constrained enforcement) with **userspace** **multi-chunk** policy evaluation; **efficient** attachment to **dynamic** TLS libraries on **Linux**; **observable** handling of **capture backpressure**; and **consistent** policy handling across **host** and **edge** inspection paths.
73
+
Accordingly, there exists a need for **improved** methods and systems that combine, where available, **kernel-assisted** capture (and **optional** constrained enforcement) with **userspace** **multi-chunk** policy evaluation; **efficient** attachment to **dynamic** TLS libraries on **Linux**; **observable** handling of **capture backpressure**; **consistent** policy handling across **host** and **edge** inspection paths; and **portable** policy-evaluation building blocks that can be **fed** by **platform-appropriate** capture layers on **non-Linux** endpoints without forcing a **duplicate** rule engine.
60
74
61
75
---
62
76
@@ -65,4 +79,4 @@ Accordingly, there exists a need for **improved** methods and systems that combi
65
79
- Replace generic “related art” discussion with **citations** after search.
66
80
- Align **Background** language with **claims** so as **not** to disclaim subject matter unintentionally.
67
81
- Cross-reference **[`INVENTION_DISCLOSURE_OUTLINE_US.md`](INVENTION_DISCLOSURE_OUTLINE_US.md)** for detailed embodiment lists and **figures**.