Have you searched if there is an existing feature request for this?
Feature description
Title: Add Network Security Controls (allowed-origins, blocked-origins, allowed-hosts, isolated mode) to Scrapling MCP Server
Description:
Scrapling MCP is a powerful web scraping tool with built-in prompt-injection protection, ad blocking, and Cloudflare bypass. However, it currently lacks network security controls that are essential for production deployments, especially when integrated with AI platforms like Open WebUI, Claude Desktop, or other MCP-compatible clients.
Without these controls, the browser instance can access any URL, including internal networks, cloud metadata endpoints, and sensitive infrastructure. This creates significant risks in enterprise environments, multi-tenant deployments, or scenarios where the MCP server is exposed over HTTP.
Security Risks Without Network Controls
- Internal Network Access: An attacker via prompt injection could instruct the scraper to access internal services at 10.x.x.x, 192.168.x.x, or 172.16-31.x.x
- Cloud Metadata Theft: Access to 169.254.169.254 can leak AWS/GCP/Azure credentials and IAM roles
- SSRF Attacks: Server-Side Request Forgery by scraping internal dashboards, APIs, or configuration endpoints
- Unauthorized Service Discovery: Scanning and scraping internal management interfaces
Proposed Features
1. --allowed-origins (Whitelist Mode)
Restrict the browser to only access explicitly permitted domains or patterns.
scrapling mcp --allowed-origins "https://example.com,https://docs.example.com"
Behavior:
- Only requests to matching origins are allowed
- All other requests are aborted automatically
- Supports glob patterns: https://*.example.com
Reference Implementation: Microsoft Playwright MCP already implements this via --allowed-origins flag.
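To make the intended matching semantics concrete, here is a minimal sketch of an origin allowlist check. It is not Scrapling's or Playwright MCP's implementation; the helper names `origin_of` and `is_origin_allowed` are hypothetical, and it assumes patterns are compared against the normalized `scheme://host[:port]` origin using shell-style globs.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return the lowercased scheme://host[:port] part of a URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    origin = f"{parts.scheme.lower()}://{host}"
    if parts.port is not None:  # only present when the URL names a port explicitly
        origin += f":{parts.port}"
    return origin


def is_origin_allowed(url: str, allowed_patterns: list[str]) -> bool:
    """True if the URL's origin matches at least one allowed glob pattern."""
    origin = origin_of(url)
    return any(fnmatchcase(origin, pattern.lower()) for pattern in allowed_patterns)


# Hypothetical usage: anything outside the allowlist would be aborted.
allowed = ["https://example.com", "https://*.example.com"]
print(is_origin_allowed("https://docs.example.com/guide", allowed))
print(is_origin_allowed("http://169.254.169.254/latest/meta-data/", allowed))
```

Matching on the parsed origin rather than the raw URL string means a path or query such as `https://evil.test/?u=https://example.com` cannot satisfy a pattern by accident.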
2. --blocked-origins (Blacklist Mode)
Block specific origins while allowing all others.
scrapling mcp --blocked-origins "10.*,192.168.*,172.16.*,169.254.*,*.internal,*.local"
Behavior:
- Requests to matching origins are blocked
- Useful when you need broad access but want to exclude dangerous ranges
Common default blocked patterns should include:
10.0.0.0/8
172.16.0.0/12
192.168.0.0/16
169.254.0.0/16 (cloud metadata)
127.0.0.0/8
localhost
*.internal
*.local
*.lan
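The default list above mixes CIDR ranges and hostname patterns, so a check probably needs two code paths. The following is a rough sketch under that assumption; `is_host_blocked` and the two constants are hypothetical names, not existing Scrapling APIs. IP literals are tested with the standard `ipaddress` module, which covers all of 172.16.0.0/12 (a single glob such as `172.16.*` would only cover 172.16.x.x), and everything else is tested against the hostname globs.

```python
import ipaddress
from fnmatch import fnmatchcase

# Defaults taken from the list in this proposal.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "169.254.0.0/16",  # link-local, includes the cloud metadata address
        "127.0.0.0/8",
    )
]
BLOCKED_HOST_PATTERNS = ("localhost", "*.internal", "*.local", "*.lan")


def is_host_blocked(host: str) -> bool:
    """True if host is an IP literal in a blocked range or matches a blocked name."""
    host = host.strip().lower().rstrip(".")
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: fall back to hostname glob matching.
        return any(fnmatchcase(host, pattern) for pattern in BLOCKED_HOST_PATTERNS)
    return any(addr in network for network in BLOCKED_NETWORKS)


print(is_host_blocked("172.20.0.5"))
print(is_host_blocked("10.example.com"))
```

Note that this only inspects the name or literal in the URL. A public hostname can still resolve to a private address, so a complete implementation would also need to check the resolved IP at connection time.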
3. --allowed-hosts (Server-Level Host Verification)
Restrict which clients can connect to the MCP server (especially important for HTTP mode).
scrapling mcp --http --host 0.0.0.0 --port 8000 \
--allowed-hosts "localhost,127.0.0.1,10.0.0.0/8"
Behavior:
- Validates the Host header or client IP against the whitelist
- Prevents unauthorized remote connections to the MCP server
- DNS rebinding protection
Currently, HTTP mode binds to 0.0.0.0 by default with no client verification.
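As an illustration of what the server-side check might look like, here is a small sketch that parses an `--allowed-hosts` value and validates a `Host` header and a client IP separately. All function names are hypothetical and the parsing rules (comma-separated entries, each either a hostname, an IP, or a CIDR range) are an assumption based on the example command above.

```python
import ipaddress


def parse_allowed_hosts(spec: str):
    """Split a comma-separated spec into exact hostnames and IP networks."""
    names, networks = set(), []
    for entry in (part.strip().lower() for part in spec.split(",")):
        if not entry:
            continue
        try:
            networks.append(ipaddress.ip_network(entry, strict=False))
        except ValueError:
            names.add(entry)  # not an IP or CIDR, treat as a hostname
    return names, networks


def host_without_port(host_header: str) -> str:
    """Strip an optional port (and IPv6 brackets) from a Host header value."""
    host = host_header.strip().lower()
    if host.startswith("["):  # bracketed IPv6 literal, e.g. [::1]:8000
        return host[1:].split("]", 1)[0]
    return host.rsplit(":", 1)[0] if ":" in host else host


def is_client_ip_allowed(client_ip: str, networks) -> bool:
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in networks)


def is_host_header_allowed(host_header: str, names, networks) -> bool:
    host = host_without_port(host_header)
    if host in names:
        return True
    try:
        return is_client_ip_allowed(host, networks)  # Host may be an IP literal
    except ValueError:
        return False


names, networks = parse_allowed_hosts("localhost,127.0.0.1,10.0.0.0/8")
print(is_host_header_allowed("localhost:8000", names, networks))
print(is_host_header_allowed("attacker.example:8000", names, networks))
```

Rejecting unknown `Host` values is what provides the DNS rebinding protection: a rebinding page reaches the server under the attacker's own hostname, which would not be in the list.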
4. --isolated Mode
Run the browser with a fresh, ephemeral profile that is discarded on exit.
Behavior:
- No persistent cookies, cache, or storage between sessions
- Prevents data leakage between different user sessions
- Mitigates cookie/session hijacking if the MCP server is compromised
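One possible way to get this behavior is to give each session a throwaway profile directory that is deleted when the session ends. The sketch below uses only the standard library; `ephemeral_profile` is a hypothetical helper, and how the directory would be handed to the browser inside Scrapling is left open.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_profile(prefix: str = "scrapling-mcp-"):
    """Yield a fresh profile directory and remove it, with its contents, on exit."""
    with tempfile.TemporaryDirectory(prefix=prefix) as tmp:
        yield Path(tmp)


# Hypothetical usage: the browser would be launched with this directory as its
# user data dir, so cookies, cache, and storage vanish with the directory.
with ephemeral_profile() as profile_dir:
    (profile_dir / "Cookies").write_text("session data")
    print(profile_dir.is_dir())
print(profile_dir.exists())
```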
5. Optional: --proxy-server and --proxy-bypass
Support routing browser traffic through an upstream proxy for centralized filtering.
scrapling mcp --proxy-server "http://squid:3128" \
--proxy-bypass "localhost,<local>"
This allows organizations to enforce network policies at the proxy layer rather than relying solely on application-level controls.
Suggested Configuration File Support
In addition to CLI flags, support a configuration file for easier deployment:
{
  "network": {
    "allowedOrigins": [
      "https://example.com",
      "https://www.example.com"
    ],
    "blockedOrigins": [
      "10.*",
      "192.168.*",
      "172.16.*",
      "169.254.*",
      "*.internal",
      "*.local"
    ]
  },
  "server": {
    "allowedHosts": ["localhost", "127.0.0.1"]
  },
  "browser": {
    "isolated": true,
    "proxy": {
      "server": "http://proxy:3128",
      "bypass": "<local>"
    }
  }
}
Used via:
scrapling mcp --config mcp-security.json
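For discussion, a loader for this file could be as simple as the sketch below. The `SecurityConfig` class and `parse_config` function are hypothetical; the key names are taken from the JSON example above, and the choice to default every missing section to "no restriction" is an assumption that may not be the right default for a security feature.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SecurityConfig:
    allowed_origins: list = field(default_factory=list)
    blocked_origins: list = field(default_factory=list)
    allowed_hosts: list = field(default_factory=list)
    isolated: bool = False
    proxy_server: Optional[str] = None
    proxy_bypass: Optional[str] = None


def parse_config(text: str) -> SecurityConfig:
    """Map the proposed JSON layout onto a flat settings object."""
    raw = json.loads(text)
    network = raw.get("network", {})
    server = raw.get("server", {})
    browser = raw.get("browser", {})
    proxy = browser.get("proxy", {})
    return SecurityConfig(
        allowed_origins=list(network.get("allowedOrigins", [])),
        blocked_origins=list(network.get("blockedOrigins", [])),
        allowed_hosts=list(server.get("allowedHosts", [])),
        isolated=bool(browser.get("isolated", False)),
        proxy_server=proxy.get("server"),
        proxy_bypass=proxy.get("bypass"),
    )


sample = '{"server": {"allowedHosts": ["localhost"]}, "browser": {"isolated": true}}'
print(parse_config(sample))
```

CLI flags could then override individual fields of the returned object, so both configuration paths share one code path for enforcement.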
Why This Matters
Scenario | Risk Level | Impact
-- | -- | --
Prompt injection attack | High | Attacker redirects scraper to internal services
Multi-tenant AI platform | High | One tenant accesses another's internal network
Cloud metadata exposure | Critical | IAM credentials leaked, full account compromise
Enterprise deployment | Medium | Policy violations, data exfiltration
Happy to discuss implementation approaches or contribute to a PR if helpful. This would significantly improve the security posture of Scrapling MCP for production and enterprise deployments.