How to set UserAgent in Rust? #271

@KiithNabaal

I saw this issue here: #184

add_chrome_arg no longer seems to be part of the ChromeCapabilities API. Calling it fails with the error "no method named add_chrome_arg found for struct ChromeCapabilities in the current scope", and searching the docs for it turns up nothing.

I have tried other ways of setting the UserAgent, but I can't tell whether they work. Setting the UserAgent through Selenium ChromeDriver behaves very differently from setting it through thirtyfour.

Source code:

use thirtyfour::prelude::*;
use std::time::Duration;
use anyhow::Result;

#[tokio::main]
async fn main() -> Result<()> {
    let mut caps = DesiredCapabilities::chrome();
    caps.set_headless()?;
    caps.set_disable_web_security()?;
    caps.set_ignore_certificate_errors()?;
    caps.set_no_sandbox()?;
    caps.set_disable_gpu()?;
    caps.set_disable_dev_shm_usage()?;
    caps.add_arg("--hide-scrollbars")?;
    caps.add_arg("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0 Safari/537.36")?;

    let server_url = "http://localhost:4444".to_string();
    println!("Configuring selenium: {}", server_url);

    let driver = WebDriver::new(&server_url, caps).await?;
    println!("Connected to Selenium Grid");

    let timeouts = TimeoutConfiguration::new(
        Some(Duration::from_secs(60)),
        Some(Duration::from_secs(60)),
        Some(Duration::from_secs(60)),
    );
    driver.update_timeouts(timeouts).await?;
    println!("Configured selenium");

    let _ = driver.get("https://www.pnc.com/en/personal-banking.html");

    println!("Navigated to PNC");
    let source = driver.source().await?;
    println!("We got the source");
    println!("The result is {source}");

    driver.quit().await?;

    Ok(())
}

It connects to Selenium fine most of the time, but when I navigate to PNC I get this page source:

<html><head></head><body></body></html>

That is just an empty page. When I set the UserAgent in Selenium ChromeDriver instead, I get the actual page source.

Strangely, some sites that I know for a fact block UserAgents containing HeadlessChrome still return their page source when I use thirtyfour, even if I don't set the UserAgent at all.
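
One thing worth ruling out before blaming the UserAgent: in the source code above, the future returned by driver.get(...) is bound with `let _ =` and never awaited. Rust futures are lazy, so a future that is never polled does no work, and that alone could leave the browser on a blank page. The sketch below illustrates this with the standard library only. `navigate`, `block_on_ready`, and `NoopWake` are hypothetical stand-ins invented for this example; they are not part of thirtyfour or tokio.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical, minimal executor: polls the future in a loop until it is ready.
// A stand-in for what `.await` inside #[tokio::main] would do.
fn block_on_ready<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

// Hypothetical stand-in for an async navigation call such as driver.get(url).
// The flag records whether the body of the async fn ever ran.
async fn navigate(visited: &AtomicBool) {
    visited.store(true, Ordering::SeqCst);
}

// Mirrors `let _ = driver.get(...)`: the future is created and dropped unpolled.
fn dropped_future_ran() -> bool {
    let visited = AtomicBool::new(false);
    let _ = navigate(&visited);
    visited.load(Ordering::SeqCst)
}

// Mirrors awaiting the call: the future is driven to completion.
fn awaited_future_ran() -> bool {
    let visited = AtomicBool::new(false);
    block_on_ready(navigate(&visited));
    visited.load(Ordering::SeqCst)
}
```

Here `dropped_future_ran()` returns false and `awaited_future_ran()` returns true. If the same thing is happening in the code above, awaiting the navigation (for example `driver.get("https://www.pnc.com/en/personal-banking.html").await?;`) should be tried first, since otherwise driver.source() may be reading the initial blank page regardless of the UserAgent.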
