Skip to content

fix(auth): replace raw Cookie header with httpx cookie jar to fix #273#276

Open
AmanSuryavanshi-1 wants to merge 2 commits intoteng-lin:mainfrom
AmanSuryavanshi-1:fix/issue-273-cross-domain-cookies
Open

fix(auth): replace raw Cookie header with httpx cookie jar to fix #273#276
AmanSuryavanshi-1 wants to merge 2 commits intoteng-lin:mainfrom
AmanSuryavanshi-1:fix/issue-273-cross-domain-cookies

Conversation

@AmanSuryavanshi-1
Copy link
Copy Markdown

@AmanSuryavanshi-1 AmanSuryavanshi-1 commented Apr 13, 2026

Fixes Issue #273 where short-lived 10 minute Google cookies triggering a 302 redirect to accounts.google.com would drop the raw Cookie headers and kill the session. Migrated the auth fetchers and the RPC ClientCore to use a persistent httpx.Cookies jar.

Summary

Problem: Auth fails after ~10 minutes due to raw Cookie header being dropped on cross-domain redirects (Google's cookies expire every 10-15 minutes and trigger a 302 to accounts.google.com to refresh them).

Solution: Replace raw Cookie: header with httpx.Cookies() jar which properly handles cross-domain redirects.

Changes Made

  1. src/notebooklm/auth.py - fetch_tokens(): Uses httpx.Cookies() jar with domain=".google.com"
  2. src/notebooklm/_core.py - ClientCore.open(): Uses cookies jar for HTTP client
  3. src/notebooklm/_core.py - update_auth_headers(): Replaces cookies jar on token refresh
  4. Added _build_cookies_jar() helper method

Test Plan

  • All unit tests pass (179 tests)
  • All integration tests pass
  • Pre-commit checks pass (ruff, mypy)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Improved authentication to preserve cookies across Google subdomains and redirects, reducing sign-in/token refresh failures and improving session reliability.
  • Chores

    • Updated HTTP client to use a centralized cookie jar approach for consistent cookie handling and easier maintenance.

…g-lin#273

Cross-domain redirects (e.g., to accounts.google.com for token refresh)
now properly retain cookies by using httpx.Cookies() jar instead of
raw Cookie header. This fixes auth failures after ~10 minutes when
Google's short-lived cookies expire and trigger a 302 redirect.

Changes:
- auth.py: fetch_tokens() now uses httpx.Cookies() jar
- _core.py: ClientCore.open() uses cookies jar for HTTP client
- _core.py: update_auth_headers() replaces cookies jar on token refresh
- _core.py: add _build_cookies_jar() helper for cookie jar creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 13, 2026 08:01
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 13, 2026

📝 Walkthrough

Walkthrough

The PR replaces manual "Cookie" header handling with httpx.Cookies jars across the client and auth flows. Client initialization and auth token fetching now use cookie jars (built with a .google.com domain), and update_auth_headers() merges cookies into the client's jar. Tests for cross-domain cookie preservation were added.

Changes

Cohort / File(s) Summary
HTTP Client Core
src/notebooklm/_core.py
Adds ClientCore._build_cookies_jar() and initializes httpx.AsyncClient with a cookies jar; update_auth_headers() now merges/sets cookies into the client's http_client.cookies instead of overwriting a "Cookie" header.
Authentication & Cookie helpers
src/notebooklm/auth.py
Adds extract_cookies_with_domains(), build_httpx_cookies_from_storage(), and fetch_tokens_with_domains(); updates fetch_tokens() to accept/use an httpx.Cookies jar (sets cookies on .google.com) and removes manual Cookie header construction. Expands domain allowlist to include accounts.google.com.
Integration tests
tests/integration/test_core.py
Adds TestCrossDomainCookiePreservation with tests verifying cookies are preserved/merged on cross-domain redirects and that update_auth_headers() merges rather than replaces cookies.

Sequence Diagram(s)

sequenceDiagram
  participant ClientCore as ClientCore (builds jar)
  participant Auth as Auth (extract/build cookies)
  participant HTTPX as httpx.AsyncClient
  participant Google as Google (accounts/*.google.com)

  ClientCore->>Auth: request auth cookies / build jar
  Auth-->>ClientCore: httpx.Cookies jar (.google.com)
  ClientCore->>HTTPX: create AsyncClient(cookies=jar)
  ClientCore->>HTTPX: request NotebookLM homepage
  HTTPX->>Google: GET / (initial domain)
  Google-->>HTTPX: 302 redirect -> accounts.google.com
  HTTPX->>Google: GET / (accounts.google.com) (sends cookies from jar)
  Google-->>HTTPX: 200 final HTML (with tokens)
  HTTPX-->>ClientCore: response HTML
  ClientCore->>ClientCore: extract CSRF/session tokens
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

  • Issue #273: Implements the change requested—switching from raw "Cookie" header to httpx.Cookies jar and preserving cross-domain Google auth cookies (including accounts.google.com).

Possibly related PRs

  • PR #245: Related changes to auth cookie handling and Playwright/httpx cookie jar construction overlap with this PR's work.

Poem

🐰 I built a jar both snug and warm,
No brittle headers in the storm.
Across google lands the cookies roam,
They find their way and make a home. 🍪✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: replacing raw Cookie headers with httpx cookie jar to fix issue #273, which is the core purpose of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 92.31% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request transitions the HTTP client from using raw 'Cookie' headers to 'httpx.Cookies' jars to support cross-domain redirects during authentication. Feedback indicates that forcing the '.google.com' domain on all cookies may break functionality for other domains like '.googleusercontent.com' and poses a security risk regarding credential leakage. Additionally, the current implementation of updating auth headers may inadvertently overwrite session state provided by the server, and the logic for building the cookie jar is duplicated and should be centralized.

Comment on lines +193 to +196
for name, value in self.auth.cookies.items():
# Use .google.com domain to cover all subdomains including
# accounts.google.com (used for token refresh redirects)
cookies.set(name, value, domain=".google.com")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Forcing the ".google.com" domain on all cookies is a problematic side effect of using a flat dictionary (self.auth.cookies) to populate the cookie jar. While this ensures cookies are sent to accounts.google.com during redirects, it also means that cookies originally intended for other domains—such as .googleusercontent.com (which is used for media downloads and is explicitly allowed in auth.py)—will now be sent to .google.com and not to their intended destination. This will likely break authenticated downloads from Google's content servers. Additionally, ensure that when making requests to external URLs using these cookies, the domain is validated against an allowlist of trusted domains to prevent credential leakage.

References
  1. When making requests to external URLs using authentication cookies, always validate that the URL's domain is on an allowlist of trusted domains to prevent credential leakage.

raise RuntimeError("Client not initialized. Use 'async with' context.")
self._http_client.headers["Cookie"] = self.auth.cookie_header
# Replace cookies jar with updated auth tokens
self._http_client.cookies = self._build_cookies_jar()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Re-assigning the cookie jar here will overwrite any cookies that the server might have updated or set during the session (e.g., via Set-Cookie headers in previous requests). Since self.auth.cookies is a static dictionary initialized at startup and never updated, this call effectively reverts the client's session state to its initial values. Given that csrf_token and session_id are handled separately and are not stored in this cookie jar, this reset may be unnecessary and potentially harmful to session persistence.

Comment on lines +666 to +668
cookie_jar = httpx.Cookies()
for name, value in cookies.items():
cookie_jar.set(name, value, domain=".google.com")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This logic for building a cookie jar is duplicated in ClientCore._build_cookies_jar. It should be centralized (e.g., as a method on AuthTokens) to ensure consistent behavior. Additionally, as noted in the core client review, forcing the .google.com domain for all cookies may break functionality for other domains like googleusercontent.com which are used for media downloads.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates NotebookLM authentication HTTP handling to use httpx.Cookies instead of a raw Cookie header, aiming to preserve session cookies across Google redirect flows (notebooklm → accounts → notebooklm) described in Issue #273.

Changes:

  • Switch auth.fetch_tokens() to build and pass an httpx.Cookies jar (vs. a raw Cookie: header).
  • Switch ClientCore.open() to configure the shared httpx.AsyncClient with a cookies jar (vs. headers["Cookie"]).
  • Update ClientCore.update_auth_headers() to refresh the client’s cookies via a new _build_cookies_jar() helper.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/notebooklm/auth.py Builds an httpx.Cookies jar for token-fetching requests that may redirect across Google hosts.
src/notebooklm/_core.py Configures the core httpx.AsyncClient to use a cookies jar and adds a helper to construct it from AuthTokens.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +664 to +668
# Build httpx.Cookies jar instead of raw header to properly handle
# cross-domain redirects (e.g., to accounts.google.com for refresh)
cookie_jar = httpx.Cookies()
for name, value in cookies.items():
cookie_jar.set(name, value, domain=".google.com")
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fetch_tokens() builds a cookie jar from the filtered cookies dict, but extract_cookies_from_storage() currently excludes cookies whose domain is accounts.google.com (because _is_allowed_auth_domain() only allows ALLOWED_COOKIE_DOMAINS and regional .google.*). If the cross-host refresh flow depends on accounts-scoped cookies (as described in #273), they will never make it into this jar, so redirects to accounts.google.com can still fail. Consider expanding the auth-domain allowlist to include accounts.google.com (and possibly .accounts.google.com) or switching token fetching to use a domain-preserving httpx.Cookies source (similar to load_httpx_cookies()).

Suggested change
# Build httpx.Cookies jar instead of raw header to properly handle
# cross-domain redirects (e.g., to accounts.google.com for refresh)
cookie_jar = httpx.Cookies()
for name, value in cookies.items():
cookie_jar.set(name, value, domain=".google.com")
# Build an httpx.Cookies jar instead of a raw Cookie header so cookies
# participate in redirect handling. The input is a flattened name/value
# mapping, so domain specificity has already been lost by this point.
# To preserve current behavior while allowing redirects through the Google
# Accounts host, register each cookie for both the general Google domain
# and the concrete accounts.google.com host.
cookie_jar = httpx.Cookies()
for name, value in cookies.items():
cookie_jar.set(name, value, domain=".google.com")
cookie_jar.set(name, value, domain="accounts.google.com")

Copilot uses AI. Check for mistakes.
Comment on lines +177 to +180
if not self._http_client:
raise RuntimeError("Client not initialized. Use 'async with' context.")
self._http_client.headers["Cookie"] = self.auth.cookie_header
# Replace cookies jar with updated auth tokens
self._http_client.cookies = self._build_cookies_jar()
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update_auth_headers() replaces the entire AsyncClient.cookies jar with a new jar rebuilt from self.auth.cookies. Since self.auth.cookies is not updated by refresh_auth() (it only updates CSRF/session_id), this can discard any cookies that were refreshed via Set-Cookie during redirect flows (e.g., the short-lived __Secure-*PSIDRTS cookies), undermining the goal of using a persistent cookie jar. Prefer keeping the existing jar and only mutating it when the cookie values actually change (or explicitly syncing refreshed cookies back into self.auth.cookies before rebuilding).

Copilot uses AI. Check for mistakes.
Comment on lines +670 to 675
async with httpx.AsyncClient(cookies=cookie_jar) as client:
response = await client.get(
"https://notebooklm.google.com/",
headers={"Cookie": cookie_header},
follow_redirects=True,
timeout=30.0,
)
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new redirect-handling behavior in fetch_tokens() is not directly asserted in tests. There are existing pytest_httpx patterns in this repo to inspect outgoing requests; adding an assertion that the redirected request to accounts.google.com still includes a Cookie header (and/or that the request chain completes when cookies are valid) would prevent regressions of #273.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/notebooklm/_core.py`:
- Around line 179-180: The current code replaces the httpx cookie jar object
(self._http_client.cookies = self._build_cookies_jar()), which drops any cookies
httpx added during requests or redirects; instead, call _build_cookies_jar() to
get the new cookies and merge them into the existing jar
(self._http_client.cookies) — e.g., iterate the cookies from
_build_cookies_jar() and set/update them into self._http_client.cookies
(preserving domain/path/attributes) rather than assigning a new httpx.Cookies
object.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7051df54-2ea6-4006-9bb5-cab3ef3ee544

📥 Commits

Reviewing files that changed from the base of the PR and between a997718 and 2990ba2.

📒 Files selected for processing (2)
  • src/notebooklm/_core.py
  • src/notebooklm/auth.py

…coping and lifetimes

- Add accounts.google.com to ALLOWED_COOKIE_DOMAINS for token refresh
- Fix update_auth_headers() to merge cookies instead of replacing (preserves
  live cookies received from redirects like accounts.google.com)
- Add extract_cookies_with_domains() to preserve original cookie domains
- Add build_httpx_cookies_from_storage() for proper domain-aware cookie jars
- Add fetch_tokens_with_domains() variant that uses original domains
- Add TestCrossDomainCookiePreservation tests for redirect cookie handling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration/test_core.py`:
- Around line 462-490: The test currently only checks cookie presence; to prove
merge (not replace) change test_update_auth_headers_merges_not_replaces to set a
unique sentinel value (e.g., "redirect-only-sentinel") on http_client.cookies
for "SID" (domain=".google.com") before calling core.update_auth_headers(), then
after calling core.update_auth_headers() assert that
http_client.cookies.get("SID", domain=".google.com") == "redirect-only-sentinel"
(and optionally assert other expected tokens still exist). Locate this logic
around NotebookLMClient usage and
core.update_auth_headers()/http_client.cookies.set/get to implement the stronger
equality assertion.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9e59a70d-9893-4a16-bffd-abefc58686b8

📥 Commits

Reviewing files that changed from the base of the PR and between 2990ba2 and 3d90567.

📒 Files selected for processing (3)
  • src/notebooklm/_core.py
  • src/notebooklm/auth.py
  • tests/integration/test_core.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/notebooklm/_core.py
  • src/notebooklm/auth.py

Comment on lines +462 to +490
http_client.cookies.set("SID", "original_sid", domain=".google.com")
http_client.cookies.set("HSID", "original_hsid", domain=".google.com")

# Simulate what happens during a redirect: update_auth_headers merges new cookies
# without wiping existing ones (like refreshed SID from accounts.google.com)
core.update_auth_headers()

# Verify original cookies are still present (not wiped)
# httpx.Cookies.get() returns None if cookie not found
assert http_client.cookies.get("SID", domain=".google.com") is not None
assert http_client.cookies.get("HSID", domain=".google.com") is not None

@pytest.mark.asyncio
async def test_update_auth_headers_merges_not_replaces(self, auth_tokens):
"""Verify update_auth_headers merges new cookies, preserving live redirect cookies."""
async with NotebookLMClient(auth_tokens) as client:
core = client._core
http_client = core._http_client

# Simulate a live cookie received from accounts.google.com redirect
http_client.cookies.set("SID", "refreshed_sid_from_redirect", domain=".google.com")

# Now update auth headers (simulating a token refresh)
core.update_auth_headers()

# The "refreshed" cookie should still be there (merged, not replaced)
sid_cookie = http_client.cookies.get("SID", domain=".google.com")
# The value should be either the original or the refreshed one (merged)
assert sid_cookie is not None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Strengthen assertions to prove merge behavior (not just presence).

Current checks (is not None) are too weak and can still pass if cookies are replaced/overwritten.
Use a redirect-only sentinel cookie and assert exact value/domain survives after update_auth_headers().

Suggested test tightening
@@
-            http_client.cookies.set("SID", "original_sid", domain=".google.com")
-            http_client.cookies.set("HSID", "original_hsid", domain=".google.com")
+            http_client.cookies.set("SID", "original_sid", domain=".google.com")
+            http_client.cookies.set(
+                "REDIRECT_ONLY", "live_cookie", domain="accounts.google.com"
+            )
@@
-            assert http_client.cookies.get("SID", domain=".google.com") is not None
-            assert http_client.cookies.get("HSID", domain=".google.com") is not None
+            assert http_client.cookies.get("SID", domain=".google.com") is not None
+            assert (
+                http_client.cookies.get("REDIRECT_ONLY", domain="accounts.google.com")
+                == "live_cookie"
+            )
@@
-            http_client.cookies.set("SID", "refreshed_sid_from_redirect", domain=".google.com")
+            http_client.cookies.set(
+                "SID", "refreshed_sid_from_redirect", domain="accounts.google.com"
+            )
@@
-            sid_cookie = http_client.cookies.get("SID", domain=".google.com")
-            # The value should be either the original or the refreshed one (merged)
-            assert sid_cookie is not None
+            assert (
+                http_client.cookies.get("SID", domain="accounts.google.com")
+                == "refreshed_sid_from_redirect"
+            )
+            assert http_client.cookies.get("SID", domain=".google.com") is not None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_core.py` around lines 462 - 490, The test currently
only checks cookie presence; to prove merge (not replace) change
test_update_auth_headers_merges_not_replaces to set a unique sentinel value
(e.g., "redirect-only-sentinel") on http_client.cookies for "SID"
(domain=".google.com") before calling core.update_auth_headers(), then after
calling core.update_auth_headers() assert that http_client.cookies.get("SID",
domain=".google.com") == "redirect-only-sentinel" (and optionally assert other
expected tokens still exist). Locate this logic around NotebookLMClient usage
and core.update_auth_headers()/http_client.cookies.set/get to implement the
stronger equality assertion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants