[GUIDE] Bypass Amazon detection every 5000 tries by changing the user agent #11

Dear guys,

Thanks to the author, philipperemy, for sharing this code. It's helpful for my data science hobby at the moment.
Here is how to bypass Amazon's bot detection.

**In `core_utils.py`:**
**Import `fake_useragent` and switch to a new random user agent whenever a CAPTCHA page comes back:**
```
import logging
from time import sleep

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

# AMAZON_BASE_URL is expected to be available in core_utils.py already.


def get_soup_retry(url):
    ua = UserAgent()
    user_agent = ua.random
    if AMAZON_BASE_URL not in url:
        url = AMAZON_BASE_URL + url
    nap_time_sec = 1
    logging.debug('Script is going to sleep for {} (Amazon throttling). ZZZzzzZZZzz.'.format(nap_time_sec))
    sleep(nap_time_sec)

    logging.debug('-> to Amazon : {}'.format(url))
    while True:
        # Build the header inside the loop; otherwise the new user agent
        # chosen after a CAPTCHA is never actually sent.
        out = requests.get(url, headers={'User-Agent': user_agent})
        assert out.status_code == 200
        soup = BeautifulSoup(out.content, 'lxml')
        if 'captcha' in str(soup):
            # CAPTCHA page: pick a new identity and try again.
            user_agent = ua.random
            print('Bot has been detected... retrying ... use new identity: ', user_agent)
        else:
            print('Bot bypassed')
            return soup


def get_soup(url):
    return get_soup_retry(url)
```

Well, it simply gets through after many tries :)
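Note that the loop in `get_soup_retry` has no upper bound, so it would spin forever if every response were a CAPTCHA page. Here is a minimal sketch of a capped variant with a growing pause between tries. The names `fetch_with_retry`, `fetch`, `pick_user_agent`, `max_tries` and `base_delay` are made up for illustration and are not part of the original project; the two callables are passed in so the logic can be tried without touching the network.

```python
import time


def fetch_with_retry(fetch, pick_user_agent, max_tries=5, base_delay=1.0,
                     sleep=time.sleep):
    """Call fetch(user_agent) until the page has no CAPTCHA or tries run out.

    fetch and pick_user_agent are hypothetical callables supplied by the
    caller: fetch takes a user agent string and returns the page as text,
    pick_user_agent returns a user agent string.
    """
    for attempt in range(max_tries):
        # A new user agent is chosen for every attempt.
        html = fetch(pick_user_agent())
        if 'captcha' not in html.lower():
            return html
        # Pause longer after each detection: base_delay, 2x, 4x, ...
        if attempt < max_tries - 1:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError('Still blocked after {} tries'.format(max_tries))
```

With the code from this guide, `fetch` could be something like `lambda agent: requests.get(url, headers={'User-Agent': agent}).text` and `pick_user_agent` could be `lambda: ua.random`. Unlike the original check, this sketch compares in lower case, which is a choice of mine and not something the original code does.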
Good luck!
