Bug Report : init_script + user_data_dir causes ERR_NAME_NOT_RESOLVED on all navigations #294

@gembleman

Description

Have you searched if there is an existing issue for this?

  • I have searched the existing issues

Python version (python --version)

3.13

Scrapling version (scrapling.version)

0.4.8

Dependencies version (pip3 freeze)

Scrapling==0.4.8

What's your operating system?

Windows 10

Are you using a separate virtual environment?

Yes

Expected behavior

200

Actual behavior

net::ERR_NAME_NOT_RESOLVED

Steps To Reproduce

I was too lazy to write the report, so I just had the AI do it.

Affects: Scrapling 0.4.8 with patchright <= 1.59.1
Fixed by: upgrading patchright to 1.60.0


Summary

When AsyncStealthySession (or StealthySession) is created with both init_script and
user_data_dir, every page.goto() call fails with:

net::ERR_NAME_NOT_RESOLVED

The failure is 100% reproducible and affects all domains — including https://www.google.com.
The script content does not matter; even a single console.log("x") triggers the error.
Coincidentally, I use both the init_script and user_data_dir options! ;)


Minimal Reproduction

import asyncio
from scrapling.fetchers import AsyncStealthySession

async def main():
    async with AsyncStealthySession(
        headless=True,
        real_chrome=True,
        user_data_dir="test_profile",
        init_script="console.log('hello');",   # ← any script triggers the bug
    ) as session:
        response = await session.fetch("https://www.google.com")
        print(response.status)

asyncio.run(main())

Expected: 200

Actual:

net::ERR_NAME_NOT_RESOLVED at https://www.google.com/

Isolation matrix

init_script   user_data_dir   Result
not set       not set         ✅ 200
set           not set         ✅ 200
not set       set             ✅ 200
set           set             ERR_NAME_NOT_RESOLVED

The bug only manifests when both options are used together.
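
The four combinations can be exercised with a short loop. This is a sketch, assuming the same constructor arguments as the minimal reproduction above; `option_matrix` and `run_matrix` are hypothetical helper names, "test_profile" is a placeholder directory, and navigation failures are assumed to surface as exceptions.

```python
from itertools import product


def option_matrix(script="console.log('hello');", profile="test_profile"):
    """Return four keyword-argument sets: neither option, each alone, and both."""
    combos = []
    for use_script, use_profile in product([False, True], repeat=2):
        kwargs = {}
        if use_script:
            kwargs["init_script"] = script
        if use_profile:
            kwargs["user_data_dir"] = profile
        combos.append(kwargs)
    return combos


async def run_matrix(url="https://www.google.com"):
    # Imported lazily so option_matrix() works without Scrapling installed.
    from scrapling.fetchers import AsyncStealthySession

    for kwargs in option_matrix():
        try:
            async with AsyncStealthySession(
                headless=True, real_chrome=True, **kwargs
            ) as session:
                response = await session.fetch(url)
                outcome = response.status
        except Exception as exc:  # assumption: the failure is raised, not returned
            outcome = f"{type(exc).__name__}: {exc}"
        print(sorted(kwargs), "->", outcome)
```

Running it with `asyncio.run(run_matrix())` prints one line per combination; on an affected install only the combination with both options is expected to report the error.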


Root Cause

How patchright implements add_init_script

Standard Playwright's Page.addScriptToEvaluateOnNewDocument CDP command does not work reliably
under launch_persistent_context. To work around this, patchright implements init script injection
by intercepting every document response and injecting a <script> tag into the HTML body
(crNetworkManager.js → fulfill()).

This requires a route handler to be registered on the context (install_inject_route /
installInjectRoute), which intercepts all requests and calls route.fallback() with a special
flag so the driver knows to intercept the response.
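
To make the idea of response-side injection concrete, here is a minimal illustration in plain Python. It is not patchright's implementation (that lives in the JS driver); `inject_init_script` is a hypothetical helper, and placing the tag right after the opening <body> tag is an assumption made for the example.

```python
import re


def inject_init_script(html: str, script: str) -> str:
    """Insert a <script> tag right after the opening <body> tag.

    If the document has no <body> tag, the script tag is prepended instead.
    """
    tag = f"<script>{script}</script>"
    match = re.search(r"<body\b[^>]*>", html, flags=re.IGNORECASE)
    if match is None:
        return tag + html
    return html[: match.end()] + tag + html[match.end():]


page = "<html><body><p>hi</p></body></html>"
print(inject_init_script(page, "console.log('x')"))
# <html><body><script>console.log('x')</script><p>hi</p></body></html>
```

Because the script travels inside the response body, this style of injection only works if the route handler lets the original request reach the real server first.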

The mismatch between Python _impl and JS driver (patchright ≤ 1.59.1)

The JS driver (client/browserContext.js) was already updated to use a patchrightInitScript
flag:

// patchright/driver/package/lib/client/browserContext.js  (correct)
async installInjectRoute() {
    await this.route("**/*", async (route) => {
        if (route.request().resourceType() === "document" && route.request().url().startsWith("http")) {
            await route.fallback({ patchrightInitScript: true });  // ← flag only, no URL change
        } else {
            await route.fallback();
        }
    });
}

But the Python _impl layer (patchright/_impl/_browser_context.py) was not updated and
still used the old approach of redirecting to a fake .internal domain:

# patchright/_impl/_browser_context.py  (buggy, patchright <= 1.59.1)
async def install_inject_route(self) -> None:
    async def route_handler(route: Route) -> None:
        if route.request.resource_type == "document" and route.request.url.startswith("http"):
            protocol = route.request.url.split(":")[0]
            await route.fallback(
                url=f"{protocol}://patchright-init-script-inject.internal/"  # ← causes real DNS lookup
            )
        ...

Why this causes ERR_NAME_NOT_RESOLVED

When route.fallback(url="https://patchright-init-script-inject.internal/") is called,
Chromium treats it as a real navigation target and performs an actual DNS lookup for
patchright-init-script-inject.internal. Since this domain does not exist, every navigation
fails with ERR_NAME_NOT_RESOLVED before the page even loads.

session.fetch("https://www.google.com")
  → add_init_script() registers install_inject_route on context
  → page.goto("https://www.google.com")
      → route_handler intercepts (resource_type == "document")
      → route.fallback(url="https://patchright-init-script-inject.internal/")
          → Chromium performs DNS lookup for patchright-init-script-inject.internal
          → DNS resolution fails
  → ERR_NAME_NOT_RESOLVED ❌

This only occurs with user_data_dir because Scrapling uses launch_persistent_context in that
path (triggering patchright's init_script workaround), while the non-persistent path uses standard
launch() + new_context() where addInitScript works normally via CDP.
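
The URL rewrite itself can be checked without a browser. The sketch below copies the string handling from the buggy handler quoted above and compares it with the flag-only behaviour; `buggy_fallback_url` and `fixed_fallback_url` are hypothetical names used only for this illustration.

```python
from urllib.parse import urlsplit

FAKE_HOST = "patchright-init-script-inject.internal"


def buggy_fallback_url(request_url: str) -> str:
    """Mirror the rewrite in the buggy handler: keep the scheme, swap the host."""
    protocol = request_url.split(":")[0]
    return f"{protocol}://{FAKE_HOST}/"


def fixed_fallback_url(request_url: str) -> str:
    """The flag-only handler leaves the request URL unchanged."""
    return request_url


target = "https://www.google.com"
print(urlsplit(buggy_fallback_url(target)).hostname)  # patchright-init-script-inject.internal
print(urlsplit(fixed_fallback_url(target)).hostname)  # www.google.com
```

Whatever the original target, the buggy path always hands Chromium the same unresolvable host, which matches the report that every domain fails.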


Fix

patchright 1.60.0 fixes this by updating the Python _impl to match the JS driver:

# patchright/_impl/_browser_context.py  (fixed, patchright 1.60.0)
async def install_inject_route(self) -> None:
    async def route_handler(route: Route) -> None:
        if route.request.resource_type == "document" and route.request.url.startswith("http"):
            await route.fallback(patchrightInitScript=True)  # ← no fake URL, flag only
        ...

Update the pinned version in Scrapling/pyproject.toml:

- "patchright==1.59.1",
+ "patchright==1.60.0",
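
Until the pin is updated, a project can check at startup whether the installed patchright predates the fix. This is a sketch: `parse_version`, `is_affected`, and `installed_patchright_is_affected` are hypothetical helpers, and treating every release below 1.60.0 as affected is an assumption based on the "patchright <= 1.59.1" line above.

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED_VERSION = (1, 60, 0)


def parse_version(text: str) -> tuple:
    """Turn '1.59.1' into (1, 59, 1), keeping only the leading digits of each part."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def is_affected(installed: str) -> bool:
    """True if the version string sorts below the fixed release."""
    return parse_version(installed) < FIXED_VERSION


def installed_patchright_is_affected():
    """Return True/False for the installed package, or None if it is missing."""
    try:
        return is_affected(version("patchright"))
    except PackageNotFoundError:
        return None
```

A caller could, for example, avoid combining init_script with user_data_dir when `installed_patchright_is_affected()` returns True.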

Environment

Item                 Version
Scrapling            0.4.8
patchright (buggy)   1.56.0 – 1.59.1
patchright (fixed)   1.60.0
Python               3.13
OS                   Windows 10
Chrome               real_chrome=True

References

  • Buggy code: patchright/_impl/_browser_context.py → install_inject_route()
  • Fix location: same file, same method — remove fake URL redirect, use patchrightInitScript=True flag
  • Driver-side injection logic: patchright/driver/package/lib/server/chromium/crNetworkManager.js → fulfill()
