🚀 Job Scrapper & Cover Letter Generator

An automated data pipeline designed to streamline the job application process. This tool scrapes job listings from LinkedIn, processes the extracted data, and automatically generates tailored cover letters to give you a competitive edge in your job search.

✨ Features

Automated Web Scraping: Extracts job postings, titles, locations, and posting dates directly from LinkedIn.
Data Processing & Analysis: Cleans and analyzes scraped HTML data to identify key job requirements and details.
Dynamic Cover Letter Generation: Automatically crafts personalized cover letters based on the extracted job data.
Interactive Notebooks: Includes a Jupyter Notebook environment for data exploration and testing.
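
As a rough illustration of the cover letter step, the sketch below fills a text template from a single job record. The template wording, the `render_letter` helper, and the field names `title` and `location` are assumptions made for this example; `generate_letters.py` may work quite differently.

```python
from string import Template

# Hypothetical template; the wording and placeholders are illustrative only.
LETTER_TEMPLATE = Template(
    "Dear Hiring Manager,\n\n"
    "I am writing to apply for the $title position in $location.\n"
)


def render_letter(job: dict) -> str:
    """Fill the template from a job record; a missing field raises KeyError."""
    return LETTER_TEMPLATE.substitute(job)


letter = render_letter({"title": "Data Engineer", "location": "Austin, TX"})
```

Using `substitute` rather than `safe_substitute` means an incomplete job record fails immediately instead of producing a letter with an unfilled placeholder.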

⚠️ Disclaimer

Warning

CRITICAL WARNING: Do not run this program on your local machine while signed in to your LinkedIn account in your web browser. Automated scraping can trigger security flags, which may get your personal account and IP address banned. Instead, run it anonymously through the Google Colab notebook.

This tool interacts with LinkedIn's front-end HTML elements (e.g., results-context-header, base-search-card). Web structures change frequently, which may require you to update the HTML tags in the scraping script. Furthermore, please be mindful of LinkedIn's Terms of Service regarding automated scraping and use this tool responsibly.
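
One way to make markup changes easier to handle is to keep the identifiers in a single place and check that they still match something. The sketch below assumes `results-context-header` and `base-search-card` are CSS class names, and it uses the standard library's `html.parser` to stay dependency-free; the `SELECTORS` dictionary, `ClassCounter`, and `count_elements` are hypothetical names, not part of `scrape_data.py`.

```python
from html.parser import HTMLParser

# Identifiers taken from the README; update them here if LinkedIn's markup changes.
SELECTORS = {
    "results_header": "results-context-header",
    "job_card": "base-search-card",
}


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a target class name."""

    def __init__(self, target_class: str):
        super().__init__()
        self.target_class = target_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self.count += 1


def count_elements(html: str, target_class: str) -> int:
    parser = ClassCounter(target_class)
    parser.feed(html)
    return parser.count


sample = '<ul><li class="base-search-card x">A</li><li class="base-search-card">B</li></ul>'
cards = count_elements(sample, SELECTORS["job_card"])
if cards == 0:
    print("No job cards found; the page markup may have changed.")
```

A zero count on a page that should contain listings is a quick signal that a selector needs updating.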

Note

⏱️ Execution Time & Rate Limiting Note: To mimic human behavior and avoid attracting attention from anti-bot protections, this program is intentionally designed to run slowly.

Execution Time: Depending on how many job postings are scraped, a run can take over 5 minutes to complete. You can adjust the speed by changing the wait_seconds variable in the code (set to 2 by default).
Scraping Limits: To reduce the risk of account flagging, the script limits how many postings can be scraped per run. When started, it prompts you for the number of jobs to scrape (the current default limit is 100).
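
The throttling and limit behavior described above can be sketched roughly as follows. Only `wait_seconds = 2` and the limit of 100 come from this README; the helper names, the choice to treat 100 as both the default and the upper cap, and the assumption of one request per job are illustrative guesses rather than the script's actual logic.

```python
import time

DEFAULT_LIMIT = 100  # default limit mentioned in the README
wait_seconds = 2     # pause between requests, as in the README


def parse_limit(raw: str, default: int = DEFAULT_LIMIT) -> int:
    """Turn user input into a job count between 1 and the default cap."""
    try:
        value = int(raw.strip())
    except ValueError:
        return default  # fall back when the input is not a whole number
    return max(1, min(value, default))


def estimated_minimum_seconds(job_count: int, wait: float = wait_seconds) -> float:
    """Lower bound on runtime: pauses only, ignoring network and parsing time."""
    return job_count * wait


def scrape_all(urls, fetch, wait: float = wait_seconds, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(wait)
        results.append(fetch(url))
    return results
```

Under these assumptions, 100 jobs at 2 seconds each spend at least 200 seconds in pauses alone, which fits the run time of several minutes noted above once network time is added.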

📁 Repository Structure

Job_Scrapper/
├── letters/                   # Directory containing generated cover letters
├── processed_data/            # Cleaned and structured data ready for analysis
├── scrapped_data/             # Raw HTML and JSON data scraped from LinkedIn
├── Job_Scapper_Ext.ipynb      # Interactive Jupyter Notebook for extended analysis
├── analyze_data.py            # Scripts for analyzing processed job data
├── extract_data.py            # Extracts targeted information from raw HTML 
├── generate_letters.py        # Logic for drafting tailored cover letters
├── main.py                    # Main execution script to run the full pipeline
├── scrape_data.py             # Web scraping logic utilizing LinkedIn HTML tags
└── README.md                  # Project documentation

🛠️ Installation & Setup

Option 1: Google Colab (Recommended)

To avoid local setup and protect your personal IP and accounts, you can run this scraper entirely in the cloud: Run in Google Colab

Option 2: Local Setup

Clone the repository:

git clone https://github.com/jgarvey928/Job_Scrapper.git

cd Job_Scrapper

Set up a virtual environment (recommended):

python -m venv venv

source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install dependencies (if there is no requirements.txt file, install the necessary scraping and data libraries, such as BeautifulSoup4, pandas, and requests, directly):
```
pip install -r requirements.txt
```
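
To confirm that the libraries are available before running the pipeline, a small check like the one below may help. The module list is an assumption based on the libraries named above (BeautifulSoup4 is imported as `bs4`); the project's real dependencies may differ.

```python
from importlib.util import find_spec

# Import names for the libraries the README mentions; this list is an assumption.
REQUIRED_MODULES = ["bs4", "pandas", "requests"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
```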

🚀 Usage

To run the complete pipeline from scraping to cover letter generation, execute the main script:

python main.py

Step-by-Step Execution: If you prefer to run the modules individually:

Run python scrape_data.py to fetch the latest job postings.
Run python extract_data.py to parse the raw data into the processed_data/ folder.
Run python analyze_data.py to gain insights into the job market.
Run python generate_letters.py to output tailored documents into the letters/ folder.
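
The four steps above can also be chained from a short driver script. This is a minimal sketch that assumes each module runs as a standalone script from the repository root; `build_commands` and `run_pipeline` are hypothetical helpers, and `main.py` may orchestrate the steps differently.

```python
import subprocess
import sys

# Script names from the README, in the order it lists them.
PIPELINE = [
    "scrape_data.py",
    "extract_data.py",
    "analyze_data.py",
    "generate_letters.py",
]


def build_commands(scripts=PIPELINE, python=sys.executable):
    """Build one command per script, using the current Python interpreter."""
    return [[python, script] for script in scripts]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for command in commands:
        runner(command, check=True)


# Usage (from the repository root):
#     run_pipeline(build_commands())
```

Stopping at the first failure keeps later steps from running on missing or partial data.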

You can also open Job_Scapper_Ext.ipynb in Jupyter Notebook or Google Colab for an interactive walk-through of the data.

👨‍💻 Author

John S. Garvey

GitHub: @jgarvey928
LinkedIn: John S. Garvey
Portfolio: My Portfolio

If you find this project helpful, please consider giving it a ⭐!
