Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 26 additions & 3 deletions src/notebooklm/_core.py
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,8 @@ async def open(self) -> None:
"""Open the HTTP client connection.

Called automatically by NotebookLMClient.__aenter__.
Uses httpx.Cookies jar to properly handle cross-domain redirects
(e.g., to accounts.google.com for auth token refresh).
"""
if self._http_client is None:
# Use granular timeouts: shorter connect timeout helps detect network issues
Expand All @@ -138,11 +140,13 @@ async def open(self) -> None:
write=self._timeout,
pool=self._timeout,
)
# Build cookies jar for cross-domain redirect support
cookies = self._build_cookies_jar()
self._http_client = httpx.AsyncClient(
headers={
"Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
"Cookie": self.auth.cookie_header,
},
cookies=cookies,
timeout=timeout,
)

Expand All @@ -161,17 +165,36 @@ def is_open(self) -> bool:
return self._http_client is not None

def update_auth_headers(self) -> None:
"""Update HTTP client headers with current auth tokens.
"""Update HTTP client cookies with current auth tokens.

Call this after modifying auth tokens (e.g., after refresh_auth())
to ensure the HTTP client uses the updated credentials.
Uses httpx.Cookies jar to properly handle cross-domain redirects.

Raises:
RuntimeError: If client is not initialized.
"""
if not self._http_client:
raise RuntimeError("Client not initialized. Use 'async with' context.")
self._http_client.headers["Cookie"] = self.auth.cookie_header
# Replace cookies jar with updated auth tokens
self._http_client.cookies = self._build_cookies_jar()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Re-assigning the cookie jar here will overwrite any cookies that the server might have updated or set during the session (e.g., via Set-Cookie headers in previous requests). Since self.auth.cookies is a static dictionary initialized at startup and never updated, this call effectively reverts the client's session state to its initial values. Given that csrf_token and session_id are handled separately and are not stored in this cookie jar, this reset may be unnecessary and potentially harmful to session persistence.

Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update_auth_headers() replaces the entire AsyncClient.cookies jar with a new jar rebuilt from self.auth.cookies. Since self.auth.cookies is not updated by refresh_auth() (it only updates CSRF/session_id), this can discard any cookies that were refreshed via Set-Cookie during redirect flows (e.g., the short-lived __Secure-*PSIDRTS cookies), undermining the goal of using a persistent cookie jar. Prefer keeping the existing jar and only mutating it when the cookie values actually change (or explicitly syncing refreshed cookies back into self.auth.cookies before rebuilding).

Copilot uses AI. Check for mistakes.

def _build_cookies_jar(self) -> httpx.Cookies:
"""Build an httpx.Cookies jar from auth tokens.

Uses .google.com as the domain to ensure cookies are sent
across all Google subdomains including accounts.google.com
for cross-domain auth refresh redirects.

Returns:
httpx.Cookies jar populated with auth cookies.
"""
cookies = httpx.Cookies()
for name, value in self.auth.cookies.items():
# Use .google.com domain to cover all subdomains including
# accounts.google.com (used for token refresh redirects)
cookies.set(name, value, domain=".google.com")
Comment on lines +200 to +203
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Forcing the ".google.com" domain on all cookies is a problematic side effect of using a flat dictionary (self.auth.cookies) to populate the cookie jar. While this ensures cookies are sent to accounts.google.com during redirects, it also means that cookies originally intended for other domains—such as .googleusercontent.com (which is used for media downloads and is explicitly allowed in auth.py)—will now be sent to .google.com and not to their intended destination. This will likely break authenticated downloads from Google's content servers. Additionally, ensure that when making requests to external URLs using these cookies, the domain is validated against an allowlist of trusted domains to prevent credential leakage.

References
  1. When making requests to external URLs using authentication cookies, always validate that the URL's domain is on an allowlist of trusted domains to prevent credential leakage.

return cookies

def _build_url(self, rpc_method: RPCMethod, source_path: str = "/") -> str:
"""Build the batchexecute URL for an RPC call.
Expand Down
13 changes: 9 additions & 4 deletions src/notebooklm/auth.py
Original file line number Diff line number Diff line change
Expand Up @@ -646,7 +646,8 @@ async def fetch_tokens(cookies: dict[str, str]) -> tuple[str, str]:
"""Fetch CSRF token and session ID from NotebookLM homepage.

Makes an authenticated request to NotebookLM and extracts the required
tokens from the page HTML.
tokens from the page HTML. Uses httpx.Cookies() jar to properly handle
cross-domain redirects (e.g., to accounts.google.com for token refresh).

Args:
cookies: Dict of Google auth cookies
Expand All @@ -659,12 +660,16 @@ async def fetch_tokens(cookies: dict[str, str]) -> tuple[str, str]:
ValueError: If tokens cannot be extracted from response
"""
logger.debug("Fetching CSRF and session tokens from NotebookLM")
cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())

async with httpx.AsyncClient() as client:
# Build httpx.Cookies jar instead of raw header to properly handle
# cross-domain redirects (e.g., to accounts.google.com for refresh)
cookie_jar = httpx.Cookies()
for name, value in cookies.items():
cookie_jar.set(name, value, domain=".google.com")
Comment on lines +743 to +745
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This logic for building a cookie jar is duplicated in ClientCore._build_cookies_jar. It should be centralized (e.g., as a method on AuthTokens) to ensure consistent behavior. Additionally, as noted in the core client review, forcing the .google.com domain for all cookies may break functionality for other domains like googleusercontent.com which are used for media downloads.

Comment on lines +741 to +745
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fetch_tokens() builds a cookie jar from the filtered cookies dict, but extract_cookies_from_storage() currently excludes cookies whose domain is accounts.google.com (because _is_allowed_auth_domain() only allows ALLOWED_COOKIE_DOMAINS and regional .google.*). If the cross-host refresh flow depends on accounts-scoped cookies (as described in #273), they will never make it into this jar, so redirects to accounts.google.com can still fail. Consider expanding the auth-domain allowlist to include accounts.google.com (and possibly .accounts.google.com) or switching token fetching to use a domain-preserving httpx.Cookies source (similar to load_httpx_cookies()).

Suggested change
# Build httpx.Cookies jar instead of raw header to properly handle
# cross-domain redirects (e.g., to accounts.google.com for refresh)
cookie_jar = httpx.Cookies()
for name, value in cookies.items():
cookie_jar.set(name, value, domain=".google.com")
# Build an httpx.Cookies jar instead of a raw Cookie header so cookies
# participate in redirect handling. The input is a flattened name/value
# mapping, so domain specificity has already been lost by this point.
# To preserve current behavior while allowing redirects through the Google
# Accounts host, register each cookie for both the general Google domain
# and the concrete accounts.google.com host.
cookie_jar = httpx.Cookies()
for name, value in cookies.items():
cookie_jar.set(name, value, domain=".google.com")
cookie_jar.set(name, value, domain="accounts.google.com")

Copilot uses AI. Check for mistakes.

async with httpx.AsyncClient(cookies=cookie_jar) as client:
response = await client.get(
"https://notebooklm.google.com/",
headers={"Cookie": cookie_header},
follow_redirects=True,
timeout=30.0,
)
Comment on lines +747 to 752
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new redirect-handling behavior in fetch_tokens() is not directly asserted in tests. There are existing pytest_httpx patterns in this repo to inspect outgoing requests; adding an assertion that the redirected request to accounts.google.com still includes a Cookie header (and/or that the request chain completes when cookies are valid) would prevent regressions of #273.

Copilot uses AI. Check for mistakes.
Expand Down
Loading