Digital Twin Proxy

Turn web browsing into personal memory for AI agents.

Digital Twin Proxy logs web traffic and uses a local or remote large language model (LLM) to generate an analysis of your browsing patterns. It's designed for developers, researchers, and anyone interested in understanding their online activity through the lens of AI.

Features

  • HTTP/S Traffic Logging: Captures all web requests made through the proxy.
  • Agentic, Content-Aware Analysis: Uses an LLM to not only analyze traffic patterns but also to decide which pages to fetch and analyze in more depth.
  • Flexible Operation Modes: Run in the background, log traffic continuously, or perform one-off analysis.
  • Customizable: Easily change the AI model, analysis interval, and other settings.
  • OpenAI-Compatible: Works with any OpenAI-compatible API, including local providers like Ollama and LM Studio, as well as remote services such as OpenAI and Groq.
  • Privacy-Focused: By using a local LLM, you can ensure that your browsing history remains private and is not sent to any third-party service.

A Note on Privacy

This application is designed to work with your personal browsing history. As such, we strongly recommend using a local LLM provider like Ollama, LM Studio, vLLM, or TGI. By running the LLM on your own machine, you ensure that your browsing data remains private and is never sent to a third-party service.

While you can use any OpenAI-compatible API, please be aware of the privacy implications of sending your browsing data to a remote service.

Context for Agentic Applications

The primary output of Digital Twin Proxy is a structured log of your web traffic, along with AI-generated analysis. This data can serve as a powerful source of real-time context for other agentic applications.

By providing an analysis of recent browsing history, you can engineer a more informed context window for other AI agents, enabling them to:

  • Personalize responses: An agent can tailor its behavior based on your current tasks and interests.
  • Anticipate needs: An agent can proactively offer assistance based on the websites you are visiting.
  • Improve tool usage: An agent can better understand the context of your work and select the right tools for the job.

This process of "context engineering" allows you to create a more powerful and personalized AI experience.
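
As a concrete sketch, the hosts recorded in the proxy's NDJSON log (written to ~/.local/share/ai-proxy/log.ndjson, per the Verify section below) can be distilled into a one-line context snippet for another agent. The "host" field name and the sample entries below are illustrative assumptions, not the app's guaranteed schema; inspect your own log for the real field names.

```shell
# Sketch: distill recent browsing into a short context snippet for an agent.
# The "host" field and sample entries are assumptions for illustration only.
LOG=./log.ndjson
printf '%s\n' \
  '{"host":"docs.rs","url":"https://docs.rs/tokio"}' \
  '{"host":"github.com","url":"https://github.com/kstonekuan/digital-twin-proxy"}' \
  > "$LOG"   # sample data standing in for the real log
# Pull out unique hosts (jq -r .host would be cleaner if jq is installed)
hosts=$(sed -n 's/.*"host":"\([^"]*\)".*/\1/p' "$LOG" | sort -u | tr '\n' ' ')
echo "Recently visited: $hosts"
```

A snippet like this can be prepended to another agent's system prompt to give it the "current tasks and interests" signal described above.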

Planned Features

MCP Server

We plan to expose the context from your digital twin through an MCP (Model Context Protocol) server so that AI agents can consume it directly.

In-Browser Context Injection

To create a more interactive and personalized web experience, we are developing a feature that injects real-time context directly into your browser, so that any agentic AI app (ChatGPT, Perplexity, etc.) can access your digital twin.

How It Works

The proxy operates by routing your browser's traffic through a local Squid instance. The application then uses an AI agent to analyze the traffic and decide which pages to fetch and analyze further.

Browser → Squid Proxy (port 8888) → Internet
              ↓
         Access Logs
              ↓
Digital Twin Proxy App → OpenAI-compatible API → Decides to fetch content → Fetches Page Content → OpenAI-compatible API → Analysis

  1. Traffic Interception: Your browser is configured to send all HTTP and HTTPS requests to the Digital Twin Proxy listener on port 8888.
  2. Logging: The proxy, powered by Squid, logs every request's URL and host.
  3. Agentic Analysis: The digital-twin-proxy application sends the list of visited URLs to an LLM via an OpenAI-compatible API. The LLM then acts as an agent, deciding which URLs are interesting enough to warrant a deeper look.
  4. Content Fetching: If the agent decides to investigate a URL, it uses a tool to fetch the content of that page.
  5. In-Depth Analysis: The agent then analyzes the content of the fetched page to generate a more in-depth and meaningful summary of your browsing patterns.
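
Because Squid does the logging in step 2, the raw material for the analysis looks like a Squid access log entry. The sample line below is illustrative (your exact format depends on Squid's logformat configuration); note that for HTTPS, Squid only sees a CONNECT to host:port, not the full URL.

```shell
# An access.log entry in Squid's native format looks roughly like this:
line='1700000000.123    234 127.0.0.1 TCP_TUNNEL/200 3943 CONNECT example.com:443 - HIER_DIRECT/93.184.216.34 -'
# Field 7 is the request target; strip the port to recover the host.
host=$(printf '%s\n' "$line" | awk '{print $7}' | cut -d: -f1)
echo "$host"   # → example.com
```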

Getting Started

Prerequisites

  • A Rust toolchain (rustc and cargo) to build the project.
  • Squid, which the proxy drives under the hood (on Debian/Ubuntu: sudo apt install squid).

Installation

Clone the repository and build the project:

git clone https://github.com/kstonekuan/digital-twin-proxy.git
cd digital-twin-proxy
cargo build --release

The binary will be located at target/release/digital-twin-proxy.

Configuration

1. Configure Your Browser

Set your browser's HTTP and HTTPS proxy to 127.0.0.1:8888.
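
Browsers are not the only clients you can point at the proxy: most command-line tools honor the conventional proxy environment variables. A minimal sketch (the curl call is commented out because it needs the proxy running):

```shell
# Route CLI tools through the proxy via the standard env vars.
export http_proxy=http://127.0.0.1:8888
export https_proxy=http://127.0.0.1:8888
# With the proxy running, this request would now appear in the access log:
# curl -sI https://example.com
echo "$http_proxy"
```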

2. Configure the API Endpoint

You can configure the application using three methods (in order of priority):

  1. Command-line flags (highest priority)
  2. Environment variables
  3. .env file (lowest priority)

Available Configuration Options

| Option             | Environment Variable | CLI Flag    | Default     | Description                             |
| ------------------ | -------------------- | ----------- | ----------- | --------------------------------------- |
| API Base URL       | API_BASE             | --api-base  | (required)  | OpenAI-compatible API endpoint          |
| API Key            | API_KEY              | --api-key   | (optional)  | API key for the service                 |
| Model              | MODEL                | --model     | gpt-oss:20b | LLM model to use                        |
| Ambient Interval   | AMBIENT_INTERVAL     | --interval  | 30          | Seconds between analyses (ambient mode) |
| Max Analysis Items | MAX_ANALYSIS_ITEMS   | --max-items | 500         | Maximum URLs to analyze per batch       |

Configuration Methods

Method 1: Using a .env file:

Copy the example configuration and edit it:

cp .env.example .env
# Edit .env with your values

Example .env file:

API_BASE=http://localhost:11434/v1
API_KEY=your-api-key-if-needed
MODEL=gpt-oss:20b
AMBIENT_INTERVAL=60
MAX_ANALYSIS_ITEMS=1000
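
The application reads the .env file itself, but if you also want the same values in your interactive shell (e.g. to mix with Method 2), a portable way to load them is set -a before sourcing. This sketch writes a throwaway sample .env in the current directory for illustration:

```shell
# Write a sample .env, then load it into the current shell.
cat > .env <<'EOF'
API_BASE=http://localhost:11434/v1
MODEL=gpt-oss:20b
EOF
set -a        # auto-export every variable assigned while sourcing
. ./.env
set +a
echo "$API_BASE $MODEL"
```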

Method 2: Using environment variables:

export API_BASE=http://localhost:11434/v1
export MODEL=gpt-oss:20b
./digital-twin-proxy analyze --since 1h

Method 3: Using command-line flags:

./digital-twin-proxy analyze \
  --since 1h \
  --api-base http://localhost:11434/v1 \
  --model gpt-oss:20b \
  --max-items 1000

3. Verify

Start the proxy in logging mode and visit a website.

# Terminal 1: Start the proxy
./target/release/digital-twin-proxy log

# Terminal 2: Tail the logs
tail -f ~/.local/share/ai-proxy/log.ndjson

You should see JSON objects representing your web traffic.

Usage

Digital Twin Proxy has three main commands:

  • log: Start the proxy and only log traffic.
  • analyze: Perform a one-shot, content-aware analysis of traffic logged over a given look-back window (e.g. --since 1h).
  • ambient: Run the proxy and periodically perform content-aware analysis of traffic in the background.

Examples:

# Log traffic without analysis
./digital-twin-proxy log

# Analyze traffic from the last hour with a local Ollama model
./digital-twin-proxy analyze --since 1h --model gpt-oss:20b --api-base http://localhost:11434/v1

# Run in ambient mode, analyzing every 5 minutes with the OpenAI API
./digital-twin-proxy ambient --interval 300 --model gpt-5 --api-base https://api.openai.com/v1 --api-key $OPENAI_API_KEY

WSL (Windows Subsystem for Linux) Setup

If you're using WSL, there are additional networking considerations:

  1. Install Squid in WSL:

    sudo apt update
    sudo apt install squid
  2. Configure WSL networking:

    • The proxy will run on 127.0.0.1:8888 within WSL
    • From Windows, you'll need to access it via the WSL IP address
    • Find your WSL IP: ip addr show eth0 | grep inet
  3. Configure Windows browser:

    • Set proxy to <WSL_IP>:8888 (e.g., 172.20.240.2:8888)
    • Or use 127.0.0.1:8888 if you set up port forwarding
  4. Optional - Set up port forwarding (Windows PowerShell as Administrator):

    netsh interface portproxy add v4tov4 listenport=8888 listenaddress=127.0.0.1 connectport=8888 connectaddress=<WSL_IP>
  5. Configure WSL firewall (if needed):

    sudo ufw allow 8888
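
The ip addr show eth0 | grep inet command in step 2 prints the whole line; to capture just the address for the <WSL_IP> placeholder, you can let awk strip the CIDR suffix. The sample line below stands in for real output so the parsing is reproducible outside WSL:

```shell
# Sample line from `ip addr show eth0` (run the real command inside WSL):
sample='    inet 172.20.240.2/20 brd 172.20.255.255 scope global eth0'
# Real usage would pipe the command output into the same awk program.
wsl_ip=$(printf '%s\n' "$sample" | awk '/inet /{sub(/\/.*/, "", $2); print $2; exit}')
echo "$wsl_ip"   # → 172.20.240.2
```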

Development

This project uses rustfmt for formatting and clippy for linting.

# Format code
cargo fmt

# Run linter
cargo clippy --all-targets --all-features

# Build and run tests
cargo build
cargo test

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue.

License

This project is licensed under the MIT License. See the LICENSE file for details.
