Hello and thank you for maintaining this package.
I'm using RSelenium in R to scrape metadata from Google Scholar profiles (e.g., citations, h-index, publication count). My goal is to process a list of ~700 profile URLs, visiting each one to extract this public data.
However, the automation is blocked immediately: the CAPTCHA ("I'm not a robot") challenge appears as soon as the first profile is opened, preventing any further interaction via Selenium.
My questions are:
- Are there any recommended approaches in `RSelenium` to deal with CAPTCHA challenges like this?
- Are techniques like rotating proxies, randomized delays, or custom user-agents compatible with `RSelenium` in R?
- Do you recommend any browser configuration (e.g., Chrome vs. Firefox) or headless mode settings to reduce the risk of blocks when scraping public pages like Scholar profiles?
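For context, this is the kind of configuration I had in mind for the user-agent question; it's only a sketch (the user-agent string is a placeholder, and I haven't verified it helps against Scholar):

```r
# Hypothetical sketch: override Firefox's user-agent via a profile and
# hand it to rsDriver() through extraCapabilities.
library(RSelenium)

fprof <- makeFirefoxProfile(list(
  # Placeholder UA string -- would be swapped for a current, realistic one
  "general.useragent.override" =
    "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0"
))

driver <- rsDriver(
  port = httpuv::randomPort(),
  browser = "firefox",
  extraCapabilities = fprof
)
remote_driver <- driver[["client"]]
```

Is this the recommended way to set a custom user-agent, or is there a better mechanism?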
I’m currently using Firefox with this setup:
```r
# Clear environment
rm(list = ls())

# Load packages
library(pacman)
p_load("tidyverse", "readxl", "RSelenium", "httpuv", "wdman")

# Read list of URLs
investigadores <- read_excel("0- Datos brutos/Investigadores.xlsx")
total_perfiles <- nrow(investigadores)

# Start RSelenium
puerto_libre <- httpuv::randomPort()
driver <- RSelenium::rsDriver(
  port = puerto_libre,
  browser = "firefox",
  version = "latest"
)
remote_driver <- driver[["client"]]
```
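The loop that gets blocked is essentially the following (simplified; I'm assuming here that the Excel file has a column named `url` with the profile URLs):

```r
# Simplified sketch of the scraping loop; the column name `url` and the
# delay bounds are illustrative, not my exact values.
for (i in seq_len(total_perfiles)) {
  remote_driver$navigate(investigadores$url[i])
  Sys.sleep(runif(1, min = 5, max = 15))  # randomized delay between profiles

  # The CAPTCHA already appears on the very first navigate(), so the
  # extraction step below is never reached, e.g.:
  # citas <- remote_driver$findElement(
  #   using = "css selector", value = "td.gsc_rsb_std"
  # )$getElementText()
}
```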
Any suggestions or best practices to avoid immediate CAPTCHA blocks would be highly appreciated. If useful, I can share the full code I'm using.
Thanks in advance!