fix(auth): replace raw Cookie header with httpx cookie jar to fix #273#276
fix(auth): replace raw Cookie header with httpx cookie jar to fix #273#276AmanSuryavanshi-1 wants to merge 2 commits intoteng-lin:mainfrom
Conversation
…g-lin#273 Cross-domain redirects (e.g., to accounts.google.com for token refresh) now properly retain cookies by using httpx.Cookies() jar instead of raw Cookie header. This fixes auth failures after ~10 minutes when Google's short-lived cookies expire and trigger a 302 redirect. Changes: - auth.py: fetch_tokens() now uses httpx.Cookies() jar - _core.py: ClientCore.open() uses cookies jar for HTTP client - _core.py: update_auth_headers() replaces cookies jar on token refresh - _core.py: add _build_cookies_jar() helper for cookie jar creation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe PR replaces manual "Cookie" header handling with httpx.Cookies jars across the client and auth flows. Client initialization and auth token fetching now use cookie jars (built with a Changes
Sequence Diagram(s)sequenceDiagram
participant ClientCore as ClientCore (builds jar)
participant Auth as Auth (extract/build cookies)
participant HTTPX as httpx.AsyncClient
participant Google as Google (accounts/*.google.com)
ClientCore->>Auth: request auth cookies / build jar
Auth-->>ClientCore: httpx.Cookies jar (.google.com)
ClientCore->>HTTPX: create AsyncClient(cookies=jar)
ClientCore->>HTTPX: request NotebookLM homepage
HTTPX->>Google: GET / (initial domain)
Google-->>HTTPX: 302 redirect -> accounts.google.com
HTTPX->>Google: GET / (accounts.google.com) (sends cookies from jar)
Google-->>HTTPX: 200 final HTML (with tokens)
HTTPX-->>ClientCore: response HTML
ClientCore->>ClientCore: extract CSRF/session tokens
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request transitions the HTTP client from using raw 'Cookie' headers to 'httpx.Cookies' jars to support cross-domain redirects during authentication. Feedback indicates that forcing the '.google.com' domain on all cookies may break functionality for other domains like '.googleusercontent.com' and poses a security risk regarding credential leakage. Additionally, the current implementation of updating auth headers may inadvertently overwrite session state provided by the server, and the logic for building the cookie jar is duplicated and should be centralized.
| for name, value in self.auth.cookies.items(): | ||
| # Use .google.com domain to cover all subdomains including | ||
| # accounts.google.com (used for token refresh redirects) | ||
| cookies.set(name, value, domain=".google.com") |
There was a problem hiding this comment.
Forcing the ".google.com" domain on all cookies is a problematic side effect of using a flat dictionary (self.auth.cookies) to populate the cookie jar. While this ensures cookies are sent to accounts.google.com during redirects, it also means that cookies originally intended for other domains—such as .googleusercontent.com (which is used for media downloads and is explicitly allowed in auth.py)—will now be sent to .google.com and not to their intended destination. This will likely break authenticated downloads from Google's content servers. Additionally, ensure that when making requests to external URLs using these cookies, the domain is validated against an allowlist of trusted domains to prevent credential leakage.
References
- When making requests to external URLs using authentication cookies, always validate that the URL's domain is on an allowlist of trusted domains to prevent credential leakage.
src/notebooklm/_core.py
Outdated
| raise RuntimeError("Client not initialized. Use 'async with' context.") | ||
| self._http_client.headers["Cookie"] = self.auth.cookie_header | ||
| # Replace cookies jar with updated auth tokens | ||
| self._http_client.cookies = self._build_cookies_jar() |
There was a problem hiding this comment.
Re-assigning the cookie jar here will overwrite any cookies that the server might have updated or set during the session (e.g., via Set-Cookie headers in previous requests). Since self.auth.cookies is a static dictionary initialized at startup and never updated, this call effectively reverts the client's session state to its initial values. Given that csrf_token and session_id are handled separately and are not stored in this cookie jar, this reset may be unnecessary and potentially harmful to session persistence.
| cookie_jar = httpx.Cookies() | ||
| for name, value in cookies.items(): | ||
| cookie_jar.set(name, value, domain=".google.com") |
There was a problem hiding this comment.
This logic for building a cookie jar is duplicated in ClientCore._build_cookies_jar. It should be centralized (e.g., as a method on AuthTokens) to ensure consistent behavior. Additionally, as noted in the core client review, forcing the .google.com domain for all cookies may break functionality for other domains like googleusercontent.com which are used for media downloads.
There was a problem hiding this comment.
Pull request overview
Updates NotebookLM authentication HTTP handling to use httpx.Cookies instead of a raw Cookie header, aiming to preserve session cookies across Google redirect flows (notebooklm → accounts → notebooklm) described in Issue #273.
Changes:
- Switch
auth.fetch_tokens()to build and pass anhttpx.Cookiesjar (vs. a rawCookie:header). - Switch
ClientCore.open()to configure the sharedhttpx.AsyncClientwith a cookies jar (vs.headers["Cookie"]). - Update
ClientCore.update_auth_headers()to refresh the client’s cookies via a new_build_cookies_jar()helper.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/notebooklm/auth.py |
Builds an httpx.Cookies jar for token-fetching requests that may redirect across Google hosts. |
src/notebooklm/_core.py |
Configures the core httpx.AsyncClient to use a cookies jar and adds a helper to construct it from AuthTokens. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Build httpx.Cookies jar instead of raw header to properly handle | ||
| # cross-domain redirects (e.g., to accounts.google.com for refresh) | ||
| cookie_jar = httpx.Cookies() | ||
| for name, value in cookies.items(): | ||
| cookie_jar.set(name, value, domain=".google.com") |
There was a problem hiding this comment.
fetch_tokens() builds a cookie jar from the filtered cookies dict, but extract_cookies_from_storage() currently excludes cookies whose domain is accounts.google.com (because _is_allowed_auth_domain() only allows ALLOWED_COOKIE_DOMAINS and regional .google.*). If the cross-host refresh flow depends on accounts-scoped cookies (as described in #273), they will never make it into this jar, so redirects to accounts.google.com can still fail. Consider expanding the auth-domain allowlist to include accounts.google.com (and possibly .accounts.google.com) or switching token fetching to use a domain-preserving httpx.Cookies source (similar to load_httpx_cookies()).
| # Build httpx.Cookies jar instead of raw header to properly handle | |
| # cross-domain redirects (e.g., to accounts.google.com for refresh) | |
| cookie_jar = httpx.Cookies() | |
| for name, value in cookies.items(): | |
| cookie_jar.set(name, value, domain=".google.com") | |
| # Build an httpx.Cookies jar instead of a raw Cookie header so cookies | |
| # participate in redirect handling. The input is a flattened name/value | |
| # mapping, so domain specificity has already been lost by this point. | |
| # To preserve current behavior while allowing redirects through the Google | |
| # Accounts host, register each cookie for both the general Google domain | |
| # and the concrete accounts.google.com host. | |
| cookie_jar = httpx.Cookies() | |
| for name, value in cookies.items(): | |
| cookie_jar.set(name, value, domain=".google.com") | |
| cookie_jar.set(name, value, domain="accounts.google.com") |
src/notebooklm/_core.py
Outdated
| if not self._http_client: | ||
| raise RuntimeError("Client not initialized. Use 'async with' context.") | ||
| self._http_client.headers["Cookie"] = self.auth.cookie_header | ||
| # Replace cookies jar with updated auth tokens | ||
| self._http_client.cookies = self._build_cookies_jar() |
There was a problem hiding this comment.
update_auth_headers() replaces the entire AsyncClient.cookies jar with a new jar rebuilt from self.auth.cookies. Since self.auth.cookies is not updated by refresh_auth() (it only updates CSRF/session_id), this can discard any cookies that were refreshed via Set-Cookie during redirect flows (e.g., the short-lived __Secure-*PSIDRTS cookies), undermining the goal of using a persistent cookie jar. Prefer keeping the existing jar and only mutating it when the cookie values actually change (or explicitly syncing refreshed cookies back into self.auth.cookies before rebuilding).
| async with httpx.AsyncClient(cookies=cookie_jar) as client: | ||
| response = await client.get( | ||
| "https://notebooklm.google.com/", | ||
| headers={"Cookie": cookie_header}, | ||
| follow_redirects=True, | ||
| timeout=30.0, | ||
| ) |
There was a problem hiding this comment.
The new redirect-handling behavior in fetch_tokens() is not directly asserted in tests. There are existing pytest_httpx patterns in this repo to inspect outgoing requests; adding an assertion that the redirected request to accounts.google.com still includes a Cookie header (and/or that the request chain completes when cookies are valid) would prevent regressions of #273.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/notebooklm/_core.py`:
- Around line 179-180: The current code replaces the httpx cookie jar object
(self._http_client.cookies = self._build_cookies_jar()), which drops any cookies
httpx added during requests or redirects; instead, call _build_cookies_jar() to
get the new cookies and merge them into the existing jar
(self._http_client.cookies) — e.g., iterate the cookies from
_build_cookies_jar() and set/update them into self._http_client.cookies
(preserving domain/path/attributes) rather than assigning a new httpx.Cookies
object.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7051df54-2ea6-4006-9bb5-cab3ef3ee544
📒 Files selected for processing (2)
src/notebooklm/_core.pysrc/notebooklm/auth.py
…coping and lifetimes - Add accounts.google.com to ALLOWED_COOKIE_DOMAINS for token refresh - Fix update_auth_headers() to merge cookies instead of replacing (preserves live cookies received from redirects like accounts.google.com) - Add extract_cookies_with_domains() to preserve original cookie domains - Add build_httpx_cookies_from_storage() for proper domain-aware cookie jars - Add fetch_tokens_with_domains() variant that uses original domains - Add TestCrossDomainCookiePreservation tests for redirect cookie handling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/test_core.py`:
- Around line 462-490: The test currently only checks cookie presence; to prove
merge (not replace) change test_update_auth_headers_merges_not_replaces to set a
unique sentinel value (e.g., "redirect-only-sentinel") on http_client.cookies
for "SID" (domain=".google.com") before calling core.update_auth_headers(), then
after calling core.update_auth_headers() assert that
http_client.cookies.get("SID", domain=".google.com") == "redirect-only-sentinel"
(and optionally assert other expected tokens still exist). Locate this logic
around NotebookLMClient usage and
core.update_auth_headers()/http_client.cookies.set/get to implement the stronger
equality assertion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9e59a70d-9893-4a16-bffd-abefc58686b8
📒 Files selected for processing (3)
src/notebooklm/_core.pysrc/notebooklm/auth.pytests/integration/test_core.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/notebooklm/_core.py
- src/notebooklm/auth.py
| http_client.cookies.set("SID", "original_sid", domain=".google.com") | ||
| http_client.cookies.set("HSID", "original_hsid", domain=".google.com") | ||
|
|
||
| # Simulate what happens during a redirect: update_auth_headers merges new cookies | ||
| # without wiping existing ones (like refreshed SID from accounts.google.com) | ||
| core.update_auth_headers() | ||
|
|
||
| # Verify original cookies are still present (not wiped) | ||
| # httpx.Cookies.get() returns None if cookie not found | ||
| assert http_client.cookies.get("SID", domain=".google.com") is not None | ||
| assert http_client.cookies.get("HSID", domain=".google.com") is not None | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_update_auth_headers_merges_not_replaces(self, auth_tokens): | ||
| """Verify update_auth_headers merges new cookies, preserving live redirect cookies.""" | ||
| async with NotebookLMClient(auth_tokens) as client: | ||
| core = client._core | ||
| http_client = core._http_client | ||
|
|
||
| # Simulate a live cookie received from accounts.google.com redirect | ||
| http_client.cookies.set("SID", "refreshed_sid_from_redirect", domain=".google.com") | ||
|
|
||
| # Now update auth headers (simulating a token refresh) | ||
| core.update_auth_headers() | ||
|
|
||
| # The "refreshed" cookie should still be there (merged, not replaced) | ||
| sid_cookie = http_client.cookies.get("SID", domain=".google.com") | ||
| # The value should be either the original or the refreshed one (merged) | ||
| assert sid_cookie is not None |
There was a problem hiding this comment.
Strengthen assertions to prove merge behavior (not just presence).
Current checks (is not None) are too weak and can still pass if cookies are replaced/overwritten.
Use a redirect-only sentinel cookie and assert exact value/domain survives after update_auth_headers().
Suggested test tightening
@@
- http_client.cookies.set("SID", "original_sid", domain=".google.com")
- http_client.cookies.set("HSID", "original_hsid", domain=".google.com")
+ http_client.cookies.set("SID", "original_sid", domain=".google.com")
+ http_client.cookies.set(
+ "REDIRECT_ONLY", "live_cookie", domain="accounts.google.com"
+ )
@@
- assert http_client.cookies.get("SID", domain=".google.com") is not None
- assert http_client.cookies.get("HSID", domain=".google.com") is not None
+ assert http_client.cookies.get("SID", domain=".google.com") is not None
+ assert (
+ http_client.cookies.get("REDIRECT_ONLY", domain="accounts.google.com")
+ == "live_cookie"
+ )
@@
- http_client.cookies.set("SID", "refreshed_sid_from_redirect", domain=".google.com")
+ http_client.cookies.set(
+ "SID", "refreshed_sid_from_redirect", domain="accounts.google.com"
+ )
@@
- sid_cookie = http_client.cookies.get("SID", domain=".google.com")
- # The value should be either the original or the refreshed one (merged)
- assert sid_cookie is not None
+ assert (
+ http_client.cookies.get("SID", domain="accounts.google.com")
+ == "refreshed_sid_from_redirect"
+ )
+ assert http_client.cookies.get("SID", domain=".google.com") is not None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integration/test_core.py` around lines 462 - 490, The test currently
only checks cookie presence; to prove merge (not replace) change
test_update_auth_headers_merges_not_replaces to set a unique sentinel value
(e.g., "redirect-only-sentinel") on http_client.cookies for "SID"
(domain=".google.com") before calling core.update_auth_headers(), then after
calling core.update_auth_headers() assert that http_client.cookies.get("SID",
domain=".google.com") == "redirect-only-sentinel" (and optionally assert other
expected tokens still exist). Locate this logic around NotebookLMClient usage
and core.update_auth_headers()/http_client.cookies.set/get to implement the
stronger equality assertion.
Fixes Issue #273 where short-lived 10 minute Google cookies triggering a 302 redirect to accounts.google.com would drop the raw Cookie headers and kill the session. Migrated the auth fetchers and the RPC ClientCore to use a persistent
httpx.Cookiesjar.Summary
Problem: Auth fails after ~10 minutes due to raw Cookie header being dropped on cross-domain redirects (Google's cookies expire every 10-15 minutes and trigger a 302 to
accounts.google.comto refresh them).Solution: Replace raw
Cookie:header withhttpx.Cookies()jar which properly handles cross-domain redirects.Changes Made
src/notebooklm/auth.py-fetch_tokens(): Useshttpx.Cookies()jar with domain=".google.com"src/notebooklm/_core.py-ClientCore.open(): Uses cookies jar for HTTP clientsrc/notebooklm/_core.py-update_auth_headers(): Replaces cookies jar on token refresh_build_cookies_jar()helper methodTest Plan
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Chores