Support HTTP Proxy on BookWorm #10310

Merged
mekarpeles merged 2 commits into internetarchive:master from scottbarnes:add-bookworm-proxy-support
Jan 10, 2025
@scottbarnes
Collaborator

@scottbarnes scottbarnes commented Jan 10, 2025

Closes #10230

This commit allows BookWorm to use the HTTP proxy when fetching metadata from Amazon.

BookWorm relies on an Amazon API client that is itself built on urllib3, which explicitly does not, and will not, support the HTTP_PROXY environment variable.

urllib3 does support HTTP proxies, but it needs a specific proxy object. The underlying code of the API client supports all of this, but the library provides no way to pass the proxy value through to urllib3.

The workaround here is to create the Configuration object from the Amazon API client, set the proxy on it after init, use it to create a REST client, and then replace the original api_client object with this one, so that there is proxy support.
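
The pattern can be sketched as follows. This is only an illustration of the "build, then swap in a proxy-aware client" idea: `Configuration`, `RESTClient`, `ApiClient`, `Api`, and `build_api` below are hypothetical stand-ins written for this example, not the real Amazon API client classes, and which attribute is actually replaced in the real client is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Configuration:
    """Hypothetical stand-in for the client library's configuration object."""

    proxy: Optional[str] = None


class RESTClient:
    """Hypothetical stand-in for the low-level HTTP client."""

    def __init__(self, configuration: Configuration) -> None:
        # A real client would build a proxy-aware connection pool here
        # whenever configuration.proxy is set.
        self.proxy = configuration.proxy


class ApiClient:
    """Hypothetical stand-in: always builds its REST client without a proxy."""

    def __init__(self) -> None:
        self.rest_client = RESTClient(Configuration())


class Api:
    """Hypothetical stand-in for the high-level API; it takes no proxy argument."""

    def __init__(self, key: str, secret: str) -> None:
        self.key = key
        self.secret = secret
        self.api_client = ApiClient()


def build_api(key: str, secret: str, proxy_url: Optional[str] = None) -> Api:
    """Create the API object, then swap in a proxy-aware REST client if needed."""
    api = Api(key, secret)
    if proxy_url:
        # The constructor offers no proxy parameter, so set it after init.
        configuration = Configuration()
        configuration.proxy = proxy_url
        # Assumption: the object to replace is the REST client held by api_client.
        api.api_client.rest_client = RESTClient(configuration)
    return api


api = build_api("key", "secret", proxy_url="http://proxy.example:3128")
print(api.api_client.rest_client.proxy)
```

When no proxy URL is configured, `build_api` leaves the default client in place, so behavior is unchanged for environments that do not need a proxy.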

Note: when creating a vendors.AmazonAPI object in a REPL, one needs the 'magic incantation' infogami._setup() after the config has been loaded, or the http_proxy_url value will be None.

Testing

This isn't easy to test, but the following works from ol-home0, inside the testing affiliate server container, where this code is currently applied (once amz_key and amz_secret have the relevant values):

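import infogami
from openlibrary.config import load_config

# Load the config first, then run infogami's setup; otherwise http_proxy_url is None.
load_config('/olsystem/etc/openlibrary.yml')
infogami._setup()

# Read the Amazon credentials from the environment.
import os
amz_key = os.getenv("amz_key")
amz_secret = os.getenv("amz_secret")

# Imported after setup, in the same order as the original session.
from openlibrary.core.vendors import AmazonAPI
amazon = AmazonAPI(key=amz_key, secret=amz_secret, tag="internetarchi-20")
amazon.get_product("9351950972")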

Without this PR, the request made by amazon.get_product() simply times out.

NOTE: this does NOT fix the /isbn endpoint. That is still not working, for an unknown reason, and is the next follow-up to this.

Stakeholders

@mekarpeles

@github-actions github-actions bot added the Priority: 2 Important, as time permits. [managed] label Jan 10, 2025
"""
Creates an instance containing your API credentials.

Instantiating this object in a REPL requires the `infogami._setup()`
Member

Thank you for this tip :)

@mekarpeles mekarpeles merged commit 16fb67e into internetarchive:master Jan 10, 2025
4 checks passed
@scottbarnes scottbarnes deleted the add-bookworm-proxy-support branch January 10, 2025 14:45
scottbarnes added a commit to scottbarnes/openlibrary that referenced this pull request Jan 12, 2025
Co-authored-by: Scott Barnes <scott.barnes@archive.org>
Labels

Priority: 2 Important, as time permits. [managed]

Development

Successfully merging this pull request may close these issues.

Permit webservices.amazon.com via 44.215.138.164 to avoid blockage

2 participants