[feature] Network Security Controls for Scrapling MCP

### Have you searched if there is an existing feature request for this?

- [x] I have searched the existing requests

### Feature description

<hr>Title: Add Network Security Controls (allowed-origins, blocked-origins, allowed-hosts, isolated mode) to Scrapling MCP Server<hr>Description: Scrapling MCP is a powerful web scraping tool with built-in prompt-injection protection, ad blocking, and Cloudflare bypass. However, it currently lacks network security controls that are essential for production deployments, especially when integrated with AI platforms such as Open WebUI, Claude Desktop, or other MCP-compatible clients. Without these controls, the browser instance can access any URL, including internal networks, cloud metadata endpoints, and sensitive infrastructure. This creates significant risks in enterprise environments, multi-tenant deployments, and scenarios where the MCP server is exposed over HTTP.<hr><h2>Security Risks Without Network Controls</h2><ol><li>Internal Network Access: An attacker could use prompt injection to instruct the scraper to access internal services at <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">10.x.x.x</code>, <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">192.168.x.x</code>, or <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">172.16-31.x.x</code></li><li>Cloud Metadata Theft: Access to <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">169.254.169.254</code> can leak AWS/GCP/Azure credentials and IAM roles</li><li>SSRF Attacks: Server-Side Request Forgery against internal dashboards, APIs, or configuration endpoints</li><li>Unauthorized Service Discovery: Scanning and scraping internal management interfaces</li></ol><hr><h2>Proposed Features</h2><h3>1. 
<code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">--allowed-origins</code> (Whitelist Mode)</h3>Restrict the browser to only access explicitly permitted domains or patterns.<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code class="language-bash" style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">scrapling mcp --allowed-origins "https://example.com,https://docs.example.com"
</code></pre>Behavior:<ul><li>Only requests to matching origins are allowed</li><li>All other requests are aborted automatically</li><li>Supports glob patterns: <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">https://*.example.com</code></li></ul>Reference Implementation: Microsoft Playwright MCP already implements this via its <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">--allowed-origins</code> flag.<hr><h3>2. <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">--blocked-origins</code> (Blacklist Mode)</h3>Block specific origins while allowing all others.<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code class="language-bash" style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">scrapling mcp --blocked-origins "10.*,192.168.*,172.16.*,169.254.*,*.internal,*.local"
</code></pre>Behavior:<ul><li>Requests to matching origins are blocked</li><li>Useful when you need broad access but want to exclude dangerous ranges</li></ul>Common default blocked patterns should include:<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">10.0.0.0/8
172.16.0.0/12
192.168.0.0/16
169.254.0.0/16 (cloud metadata)
127.0.0.0/8
localhost
*.internal
*.local
*.lan
</code></pre><hr><h3>3. <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">--allowed-hosts</code> (Server-Level Host Verification)</h3>Restrict which clients can connect to the MCP server (especially important for HTTP mode).<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code class="language-bash" style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">scrapling mcp --http --host 0.0.0.0 --port 8000 \
 --allowed-hosts "localhost,127.0.0.1,10.0.0.0/8"
</code></pre>Behavior:<ul><li>Validates the <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">Host</code> header or client IP against the whitelist</li><li>Prevents unauthorized remote connections to the MCP server</li><li>Protects against DNS rebinding</li></ul>Currently, HTTP mode binds to <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">0.0.0.0</code> by default with no client verification.<hr><h3>4. <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">--isolated</code> Mode</h3>Run the browser with a fresh, ephemeral profile that is discarded on exit.<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code class="language-bash" style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">scrapling mcp --isolated
</code></pre>Behavior:<ul><li>No persistent cookies, cache, or storage between sessions</li><li>Prevents data leakage between different user sessions</li><li>Mitigates cookie/session hijacking if the MCP server is compromised</li></ul><hr><h3>5. Optional: <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">--proxy-server</code> and <code style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">--proxy-bypass</code></h3>Support routing browser traffic through an upstream proxy for centralized filtering.<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code class="language-bash" style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">scrapling mcp --proxy-server "http://squid:3128" \
 --proxy-bypass "localhost,&lt;local&gt;"
</code></pre>This allows organizations to enforce network policies at the proxy layer rather than relying solely on application-level controls.<hr><h2>Suggested Configuration File Support</h2>In addition to CLI flags, support a configuration file for easier deployment:<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code class="language-json" style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">{
  "network": {
    "allowedOrigins": [
      "https://example.com",
      "https://www.example.com"
    ],
    "blockedOrigins": [
      "10.*",
      "192.168.*",
      "172.16.*",
      "169.254.*",
      "*.internal",
      "*.local"
    ]
  },
  "server": {
    "allowedHosts": ["localhost", "127.0.0.1"]
  },
  "browser": {
    "isolated": true,
    "proxy": {
      "server": "http://proxy:3128",
      "bypass": "&lt;local&gt;"
    }
  }
}
</code></pre>Used via:<pre style="background-color: rgb(246, 248, 250); border-radius: 6px; padding: 16px; overflow: auto;"><code class="language-bash" style="font-family: SFMono-Regular, Consolas, &quot;Liberation Mono&quot;, Menlo, monospace; font-size: 14px;">scrapling mcp --config mcp-security.json
</code></pre><hr><h2>Why This Matters</h2>
| Scenario | Risk Level | Impact |
| --- | --- | --- |
| Prompt injection attack | High | Attacker redirects scraper to internal services |
| Multi-tenant AI platform | High | One tenant accesses another's internal network |
| Cloud metadata exposure | Critical | IAM credentials leaked, full account compromise |
| Enterprise deployment | Medium | Policy violations, data exfiltration |

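To make the origin rules in sections 1 and 2 concrete, here is a minimal sketch of the check a request interceptor could run before letting the browser fetch a URL. It is not Scrapling's API: `is_blocked_host` and `is_request_allowed` are hypothetical names, and the sketch assumes that `--allowed-origins` patterns are matched as globs against `scheme://host[:port]` and that the default block list is always applied first. It only inspects the URL itself, so a public hostname that resolves to a private address, or an IPv6 literal, would still need separate handling.

```python
import ipaddress
from fnmatch import fnmatchcase
from typing import Optional, Sequence
from urllib.parse import urlsplit

# CIDR ranges and name globs copied from the default block list proposed above.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
                 "169.254.0.0/16", "127.0.0.0/8")
]
BLOCKED_NAME_GLOBS = ("localhost", "*.internal", "*.local", "*.lan")


def is_blocked_host(host: str) -> bool:
    """True if host is an IP literal in a blocked range or matches a blocked name glob."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so compare the name against the glob list.
        return any(fnmatchcase(host.lower(), glob) for glob in BLOCKED_NAME_GLOBS)
    return any(addr in net for net in BLOCKED_NETWORKS)


def is_request_allowed(url: str, allowed_origins: Optional[Sequence[str]] = None) -> bool:
    """Apply the block list first, then the optional allow list (hypothetical policy order)."""
    try:
        parts = urlsplit(url)
        port = parts.port
    except ValueError:
        return False  # malformed URL or port: refuse rather than guess
    host = parts.hostname
    if not host or is_blocked_host(host):
        return False
    if allowed_origins is None:
        return True  # blacklist-only mode
    origin = f"{parts.scheme}://{host}" + (f":{port}" if port else "")
    return any(fnmatchcase(origin, pattern.lower()) for pattern in allowed_origins)
```

Matching CIDR ranges with `ipaddress` rather than string globs avoids the gap where a pattern such as `172.16.*` covers only part of `172.16.0.0/12`.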
<hr>Happy to discuss implementation approaches or contribute to a PR if helpful. This would significantly improve the security posture of Scrapling MCP for production and enterprise deployments.</div>
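As a starting point for that discussion, the `--allowed-hosts` check from section 3 could look roughly like the sketch below. All function names are hypothetical, and it assumes the flag value is a comma-separated mix of hostnames, single IPs, and CIDR ranges, as in the example command. The `Host` header check and the client IP check are kept separate so the server can decide how to combine them.

```python
import ipaddress
from typing import List, Set, Tuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_allowed_hosts(spec: str) -> Tuple[Set[str], List[Network]]:
    """Split a comma-separated flag value into hostnames and IP networks."""
    names: Set[str] = set()
    networks: List[Network] = []
    for item in (part.strip() for part in spec.split(",")):
        if not item:
            continue
        try:
            # Single IPs become /32 (or /128) networks.
            networks.append(ipaddress.ip_network(item, strict=False))
        except ValueError:
            names.add(item.lower())
    return names, networks


def _strip_port(host_header: str) -> str:
    """Return the host part of a Host header value, without the port."""
    host = host_header.strip().lower()
    if host.startswith("["):  # bracketed IPv6 literal such as [::1]:8000
        end = host.find("]")
        return host[1:end] if end != -1 else host
    return host.split(":", 1)[0]


def host_header_allowed(host_header: str, names: Set[str], networks: List[Network]) -> bool:
    """Reject unexpected Host values, which is the core of DNS rebinding protection."""
    host = _strip_port(host_header)
    if host in names:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False
    return any(addr in net for net in networks)


def client_ip_allowed(client_ip: str, networks: List[Network]) -> bool:
    """Check the connecting peer address against the allowed networks."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False
    return any(addr in net for net in networks)
```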
