Add User-Agent header for URL requests #159
…r redirection of RGD host
📝 Walkthrough
A single change updates url_exists in util/lib.py to include a User-Agent header ("OBO Dashboard") in the requests.head call. Control flow and return behavior remain unchanged: True on HTTP 200, False on any other status or on an exception, with the error logged.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
util/lib.py (1)
572-579: Add timeouts, broaden success criteria, and fallback to GET when HEAD is unsupported.
Prevents hangs (addresses Ruff S113), avoids false negatives on non-200 successes, and handles servers that return 405/501 for HEAD.
Apply:
     try:
-        with requests.head(url, allow_redirects=True, headers={"User-Agent": "OBO Dashboard"}) as res:
-            return (res.status_code == 200)
-    except Exception as e:
-        # Any errors with connection will be considered
-        # as the URL not existing
-        logging.error(e, exc_info=True)
+        headers = {"User-Agent": "OBO Dashboard"}
+        with requests.head(url, allow_redirects=True, headers=headers, timeout=(5, 10)) as res:
+            # Fallback to GET when HEAD not allowed/implemented
+            if res.status_code in (405, 501):
+                with requests.get(url, allow_redirects=True, headers=headers, timeout=(5, 10), stream=True) as res2:
+                    return 200 <= res2.status_code < 400
+            return 200 <= res.status_code < 400
+    except requests.RequestException as e:
+        # requests.RequestException needs no extra import beyond `import requests`
+        logging.error("url_exists(%s) failed: %s", url, e, exc_info=True)
         return False
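For reference, the suggested change can be written out as one complete function. This is a minimal sketch rather than the code in util/lib.py: the module-level names HEADERS and TIMEOUT are assumptions introduced here, and the (5, 10) connect/read timeout is simply the value used in the suggestion.

```python
import logging

import requests

# Assumed module-level constants; the names are illustrative, not from util/lib.py.
HEADERS = {"User-Agent": "OBO Dashboard"}
TIMEOUT = (5, 10)  # (connect, read) timeouts in seconds


def url_exists(url: str) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status after redirects."""
    try:
        with requests.head(url, allow_redirects=True, headers=HEADERS, timeout=TIMEOUT) as res:
            # 405/501 mean the server does not allow or implement HEAD
            if res.status_code not in (405, 501):
                return 200 <= res.status_code < 400
        # Retry with GET; stream=True avoids downloading the body
        with requests.get(url, allow_redirects=True, headers=HEADERS, timeout=TIMEOUT, stream=True) as res:
            return 200 <= res.status_code < 400
    except requests.RequestException as e:
        # Connection errors and timeouts are treated as "URL does not exist"
        logging.error("url_exists(%s) failed: %s", url, e, exc_info=True)
        return False
```

Catching requests.RequestException instead of Exception keeps programming errors visible while still covering connection failures and timeouts.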
🧹 Nitpick comments (1)
util/lib.py (1)
568-579: Optional: make UA identifiable and reusable.
Consider a module-level constant or session (includes version/contact), then reuse everywhere.
Example:
OBO_USER_AGENT = "OBO-Dashboard (+https://github.com/OBOFoundry/OBO-Dashboard)"
SESSION = requests.Session()
SESSION.headers.update({"User-Agent": OBO_USER_AGENT})
# Then: SESSION.head(...), SESSION.get(...)
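A short sketch of how the shared session behaves, assuming the constant name from the example above; make_session is a hypothetical helper added for illustration. Preparing a request without sending it shows that the session header is merged into every request made through the session.

```python
import requests

OBO_USER_AGENT = "OBO-Dashboard (+https://github.com/OBOFoundry/OBO-Dashboard)"


def make_session() -> requests.Session:
    """Build a session whose requests all carry the dashboard User-Agent."""
    session = requests.Session()
    session.headers.update({"User-Agent": OBO_USER_AGENT})
    return session


SESSION = make_session()

# Prepare (but do not send) a request to inspect the merged headers.
prepared = SESSION.prepare_request(requests.Request("HEAD", "https://example.org/"))
print(prepared.headers["User-Agent"])
```

A requests session does not carry a default timeout, so timeout=... still has to be passed on each SESSION.head(...) or SESSION.get(...) call.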
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
util/lib.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
util/lib.py
573-573: Probable use of requests call without timeout
(S113)
🔇 Additional comments (2)
util/lib.py (2)
573-575: User-Agent addition is correct and aligned with the PR goal.
This should resolve the RGD redirect/blank-UA issue while retaining redirects.
270-287: Propagate User-Agent header and timeouts to HTTP calls
Add a User-Agent header and connect/read timeouts to requests.head in base_url_if_exists; apply the same pattern to urllib.request.urlopen wrappers (read_txt_from_url_as_lines, open_yaml_from_url).
-    ret = requests.head(ourl, allow_redirects=True)
+    ret = requests.head(ourl, allow_redirects=True, headers={"User-Agent": "OBO Dashboard"}, timeout=(5, 10))
Repo-wide search was inconclusive (ripgrep reported "No files were searched"); verify other requests.(head|get) occurrences and add the same headers/timeouts.
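For the urllib.request.urlopen wrappers, the same pattern could look roughly like the sketch below. The helpers build_request and read_lines_from_url are hypothetical and are not the repository's read_txt_from_url_as_lines or open_yaml_from_url; the 10-second timeout and UTF-8 decoding are likewise assumptions.

```python
import urllib.request

USER_AGENT = "OBO Dashboard"


def build_request(url: str) -> urllib.request.Request:
    """Wrap a URL in a Request that carries the dashboard User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def read_lines_from_url(url: str, timeout: float = 10.0) -> list:
    """Fetch a text resource and return its lines (hypothetical helper)."""
    # urlopen accepts a Request object in place of a bare URL string
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode("utf-8").splitlines()
```

Passing a Request object rather than a bare URL is what lets urlopen send a custom header; the timeout argument bounds how long the call can block.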
Include a User-Agent header in the url_exists function to support proper redirection for RGD hosts.
Fixes #158
Summary by CodeRabbit