SSL Proxy Support #9

Open
@Chetan11-dev

Hi, I have created a package named botasaurus-proxy-authentication, which enables SSL support for proxies requiring authentication.

For instance, when you use an authenticated proxy with a tool like seleniumwire to scrape a Cloudflare-protected website such as G2.com, the non-SSL connection typically gets you blocked.

To illustrate, run this code:

First, install the required packages:

python -m pip install selenium_wire chromedriver_autoinstaller

Then, execute this Python script:

from seleniumwire import webdriver
from chromedriver_autoinstaller import install

# Define the proxy
proxy_options = {
    'proxy': {
        'http': 'http://username:password@proxy-provider-domain:port', # Replace with your proxy
        'https': 'http://username:password@proxy-provider-domain:port', # Replace with your proxy
    }
}

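# Install chromedriver (chromedriver_autoinstaller also adds it to PATH) and set up the driver.
# The path is not passed positionally because recent Selenium 4 releases no longer accept it.
install()
driver = webdriver.Chrome(seleniumwire_options=proxy_options)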

# Navigate to the desired URL
link = 'https://www.g2.com/products/github/reviews'
driver.get("https://www.google.com/")
driver.execute_script(f'window.location.href = "{link}"')

# Wait for user input
input("Press Enter to exit...")

# Clean up
driver.quit()

You'll likely be blocked by Cloudflare:

blocked
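
To compare runs without eyeballing the browser window, the page title can be checked after navigation. The following is only a rough sketch: `looks_blocked` and `BLOCK_TITLE_MARKERS` are hypothetical names, and the title fragments are assumptions about what Cloudflare's challenge and block pages usually show, so they may need adjusting.

```python
# Heuristic only: these title fragments are assumptions about Cloudflare's
# challenge/block pages and may change over time.
BLOCK_TITLE_MARKERS = ("just a moment", "attention required", "access denied")


def looks_blocked(page_title: str) -> bool:
    """Return True if the title resembles a Cloudflare block or challenge page."""
    title = page_title.strip().lower()
    return any(marker in title for marker in BLOCK_TITLE_MARKERS)


# Usage with the driver from the script above (not executed here):
#     print("blocked" if looks_blocked(driver.title) else "not blocked")
```

A title check is deliberately crude; a page can be blocked without matching any of these fragments, so treat a negative result as a hint rather than proof.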

However, using botasaurus_proxy_authentication with the same proxy circumvents this problem. To see the difference, first install the package:

python -m pip install botasaurus-proxy-authentication

Then run the following code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from chromedriver_autoinstaller import install
from botasaurus_proxy_authentication import add_proxy_options

# Define the proxy settings
proxy = 'http://username:password@proxy-provider-domain:port'  # Replace with your proxy

# Set Chrome options
chrome_options = Options()
add_proxy_options(chrome_options, proxy)

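# Install chromedriver (chromedriver_autoinstaller also adds it to PATH) and set up the driver.
# The path is not passed positionally because recent Selenium 4 releases no longer accept it.
install()
driver = webdriver.Chrome(options=chrome_options)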

# Navigate to the desired URL
link = 'https://www.g2.com/products/github/reviews'
driver.get("https://www.google.com/")
driver.execute_script(f'window.location.href = "{link}"')

# Wait for user input
input("Press Enter to exit...")

# Clean up
driver.quit()

Result:
not blocked
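
Both scripts pass the proxy as a single `http://username:password@host:port` string. If the credentials contain characters such as `@` or `:`, that string becomes ambiguous. One possible way to build it safely is sketched below; `build_proxy_url` is a hypothetical helper, `proxy.example.com` is a placeholder host, and whether a given library decodes percent-encoded credentials is an assumption worth verifying against your proxy provider.

```python
from urllib.parse import quote


def build_proxy_url(username: str, password: str, host: str, port: int) -> str:
    """Build an http:// proxy URL with percent-encoded credentials."""
    # safe="" forces reserved characters such as '@' and ':' to be encoded too
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{port}"


# Placeholder values; replace with your own proxy details
proxy = build_proxy_url("username", "p@ss:word", "proxy.example.com", 8080)
print(proxy)  # http://username:p%40ss%3Aword@proxy.example.com:8080
```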

I suggest using botasaurus_proxy_authentication for its SSL support for authenticated proxies. It improves the success rate of scraping Cloudflare-protected websites, which in turn could increase revenue for Oxylabs.
Also, thanks to Oxylabs for your great work on proxies.
Good luck to the team.
