Jacquline Scraper Solution
A lightweight web scraper built with Kotlin, Spring Boot, and Selenium to extract result details from an HTML page.
✨ Why This Stack?
To select elements, I used CSS selectors, which allow flexible element targeting.
Example:
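A minimal sketch of CSS-selector targeting, assuming Selenium 4's `By.cssSelector`. The helper name `findResultTitles` and the selector `div.result-item .title` are hypothetical and chosen for illustration; they are not taken from the project's code.

```kotlin
import org.openqa.selenium.By
import org.openqa.selenium.WebDriver

// Hypothetical helper: collects the visible text of every element
// matched by an illustrative CSS selector.
fun findResultTitles(driver: WebDriver): List<String> =
    driver.findElements(By.cssSelector("div.result-item .title"))
        .map { it.text.trim() }
        .filter { it.isNotEmpty() }
```

A selector of this kind would need to be adjusted to the actual markup of the page being scraped.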
To wait for page rendering, I used:
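One common way to wait for rendering in Selenium 4 is an explicit wait with `WebDriverWait` and `ExpectedConditions`. The sketch below assumes that approach; the helper name `waitForResults`, the selector, and the 10-second default timeout are hypothetical.

```kotlin
import org.openqa.selenium.By
import org.openqa.selenium.WebDriver
import org.openqa.selenium.WebElement
import org.openqa.selenium.support.ui.ExpectedConditions
import org.openqa.selenium.support.ui.WebDriverWait
import java.time.Duration

// Hypothetical helper: blocks until the first matching element is present
// in the DOM, or throws a TimeoutException once the timeout elapses.
fun waitForResults(driver: WebDriver, timeoutSeconds: Long = 10): WebElement =
    WebDriverWait(driver, Duration.ofSeconds(timeoutSeconds))
        .until(ExpectedConditions.presenceOfElementLocated(By.cssSelector("div.result-item")))
```

An explicit wait like this returns as soon as the element appears, so it is usually preferable to a fixed `Thread.sleep`.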
Headless Mode
I've opted to use headless mode in Selenium because:
It’s ideal for automated testing environments — no GUI required.
It makes Dockerization smooth and lightweight.
It runs faster and consumes fewer resources than full-browser mode.
Perfect for CI/CD pipelines or headless environments (like cloud servers).
You can find the setup in the ChromeOptions block:
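As a rough sketch of what a headless `ChromeOptions` setup can look like in Selenium 4: the function name `buildHeadlessOptions` is hypothetical, and the specific flags are assumptions that may differ from the project's actual configuration.

```kotlin
import org.openqa.selenium.chrome.ChromeOptions

// Hypothetical helper: builds Chrome options for running without a GUI.
fun buildHeadlessOptions(): ChromeOptions =
    ChromeOptions().apply {
        addArguments(
            "--headless=new",          // assumption: a recent Chrome; older versions use "--headless"
            "--no-sandbox",            // assumption: often needed when running as root in containers
            "--disable-dev-shm-usage"  // assumption: avoids small /dev/shm limits in Docker
        )
    }
```

The options would then be passed to the driver, for example `ChromeDriver(buildHeadlessOptions())`.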
How to Run
Prerequisites
Chromedriver is expected at:
/usr/local/bin/chromedriver
This location can be changed in the main class.
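Selenium reads the driver location from the `webdriver.chrome.driver` system property, so one way to set it looks roughly like this. The function name `configureChromedriver` is hypothetical, and the project's main class may do this differently.

```kotlin
// Hypothetical helper: points Selenium at the chromedriver binary.
// Must be called before the ChromeDriver instance is created.
fun configureChromedriver(path: String = "/usr/local/bin/chromedriver") {
    System.setProperty("webdriver.chrome.driver", path)
}
```

Calling `configureChromedriver("/opt/chromedriver")` would override the default with a different, purely illustrative, path.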
Build the project
Run it
This exposes an API on port 8080:
The port can be changed in the properties file:
Kotlin-Project/src/main/resources/application.properties
server.port=8080
project.setup1.mp4
Run with Docker (WIP)
Using the API
Send a GET request to:
You can also test with curl:
curl "http://localhost:8080/api?path=../files/test.html"
The path parameter is the relative file path inside the ../files/ directory. If you don't pass a file path, it defaults to van-gogh-paintings.html:
getResults(@RequestParam(defaultValue = "../files/van-gogh-paintings.html")
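To show how such a default plays out, here is a minimal Spring controller sketch. The `/api` route and the default value come from the examples above; the class name `ResultsController` and the placeholder return value are assumptions, since the real handler returns the scraped results.

```kotlin
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.RequestParam
import org.springframework.web.bind.annotation.RestController

// Hypothetical controller: echoes the resolved path instead of scraping.
@RestController
class ResultsController {

    // When the request has no "path" query parameter, Spring supplies the default.
    @GetMapping("/api")
    fun getResults(
        @RequestParam(defaultValue = "../files/van-gogh-paintings.html") path: String
    ): Map<String, String> =
        mapOf("path" to path) // placeholder response, not the project's real payload
}
```

With this sketch, a request to `/api` without a query string would resolve `path` to the default file, while `/api?path=../files/test.html` would use the supplied one.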
VIDEO DEMO - API USAGE -
API.mp4
✅ Example Response
🧪 Running Tests
The tests assume that supported pages share the structure of the provided example page.
I created a sample page that mimics this structure, test.html, and use it for testing:
./gradlew test
Includes tests for:
VIDEO DEMO - TESTS-
tests.mp4
📁 Project Structure