HimanshuRanka/SPAWebScrapers

SPAWebScrapers

A web scraping project I undertook to help with my work as a Student Project Assistant in the Digital Initiatives branch of McGill Libraries.

Libraries used

  • BeautifulSoup
pip install beautifulsoup4
  • xlwings
pip install xlwings
  • requests
pip install requests

Sherpa Scraper

file name: SherpaScraper.py

Script for a personal UDF (user-defined function) for Excel.

This scraper collects embargo, ISSN and publication information from the Sherpa Romeo website.

It extracts the information you need and writes it into your Excel sheet, one column per item:

  • ISSN
  • Publisher
  • version[pathway]: embargo information
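
As a rough illustration of the column layout listed above, the sketch below builds one spreadsheet row from values that have already been scraped. It is not taken from SherpaScraper.py: the function name `build_row`, the dictionary keys and the placeholder values are all assumptions made for the example, and the `version[pathway]: embargo` string is just one reading of the format shown in the list.

```python
def build_row(issn, publisher, pathways):
    """Return one row: ISSN, publisher, then one cell per version/pathway.

    `pathways` is a hypothetical list of dicts with the keys
    'version', 'pathway' and 'embargo'.
    """
    cells = [issn, publisher]
    for entry in pathways:
        # One cell per pathway, formatted as "version[pathway]: embargo"
        cells.append(
            f"{entry['version']}[{entry['pathway']}]: {entry['embargo']}"
        )
    return cells


# Placeholder values only, not real journal data
example = build_row(
    "0000-0000",
    "Example Publisher",
    [{"version": "Accepted", "pathway": "a", "embargo": "12 months"}],
)
print(example)
```

Keeping the row-building step separate from the scraping step makes it easy to check the output format without hitting the website.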

Running with xlwings

To run the script in Excel, make sure your workbook is macro-enabled and that xlwings is enabled in your VBA environment.

Every time you add a new script or edit an existing one, you need to click Import UDFs before using it as a function in Excel.

Refer to the documentation for more details.
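
For readers new to xlwings, here is a minimal sketch of what a UDF can look like. The function `format_issn` is hypothetical and is not part of SherpaScraper.py; it only tidies an ISSN typed into a cell. The `xw.func` decorator is the xlwings way of exposing a Python function to Excel, and the fallback decorator is an assumption added so the function can also be tried in plain Python when xlwings is not installed.

```python
try:
    import xlwings as xw
    udf = xw.func  # marks the function as an Excel UDF
except ImportError:
    # Fallback (assumption): lets the function run without xlwings installed
    def udf(func):
        return func


@udf
def format_issn(value):
    """Hypothetical UDF: normalise a cell value to the NNNN-NNNN ISSN form."""
    # Excel passes numeric cells as floats, e.g. 12345678.0
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    # Drop spaces and hyphens, upper-case a trailing 'x' check character,
    # and restore leading zeros lost when the cell was stored as a number
    digits = str(value).strip().replace("-", "").upper().zfill(8)
    return f"{digits[:4]}-{digits[4:]}"
```

After importing the UDFs, a function like this would be called from a cell as `=format_issn(A2)`.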
