**PyScrappy** is a Python package that provides a fast, flexible, and exhaustive way to scrape data from a variety of sources. An easy and intuitive library, it aims to be the fundamental high-level building block for scraping **data** in Python. Additionally, it has the broader goal of becoming **the most powerful and flexible open source data scraping tool available**.

## Main Features
+
Here are just a few of the things that PyScrappy does well:

-- Easy scraping of [**Data**](https://medium.com/analytics-vidhya/web-scraping-in-python-using-the-all-new-pyscrappy-5c136ed6906b) available on the internet
-- Returns a [**DataFrame**](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) for further analysis and research purposes.
-- Automatic [**Data Scraping**](https://medium.com/analytics-vidhya/web-scraping-in-python-using-the-all-new-pyscrappy-5c136ed6906b): Other than a few user input parameters the whole process of scraping the data is automatic.
-- Powerful, flexible
+- Easy scraping of [**Data**](https://medium.com/analytics-vidhya/web-scraping-in-python-using-the-all-new-pyscrappy-5c136ed6906b) available on the internet
+- Returns a [**DataFrame**](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) for further analysis and research purposes.
+- Automatic [**Data Scraping**](https://medium.com/analytics-vidhya/web-scraping-in-python-using-the-all-new-pyscrappy-5c136ed6906b): Other than a few user input parameters the whole process of scraping the data is automatic.
+- Powerful, flexible

## Where to get it
+
The source code is currently hosted on GitHub at:
https://github.com/mldsveda/PyScrappy
@@ -47,13 +48,14 @@ pip install PyScrappy
```
## Dependencies
--[selenium - Selenium is a free (open-source) automated testing framework used to validate web applications across different browsers and platforms.](https://www.selenium.dev/)
--[webdriver-manger - WebDriverManager is an API that allows users to automate the handling of driver executables like chromedriver.exe, geckodriver.exe etc required by Selenium WebDriver API. Now let us see, how can we set path for driver executables for different browsers like Chrome, Firefox etc.](https://github.com/bonigarcia/webdrivermanager)
--[beautifulsoup4 - Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
--[pandas - Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.](https://pandas.pydata.org/)

+-[selenium](https://www.selenium.dev/) - Selenium is a free (open-source) automated testing framework used to validate web applications across different browsers and platforms.
+-[webdriver-manger](https://github.com/bonigarcia/webdrivermanager) - WebDriverManager is an API that allows users to automate the handling of driver executables like chromedriver.exe, geckodriver.exe etc required by Selenium WebDriver API. Now let us see, how can we set path for driver executables for different browsers like Chrome, Firefox etc.
+-[beautifulsoup4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) - Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.
+-[pandas](https://pandas.pydata.org/) - Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
@@ -62,16 +64,19 @@ For usage questions, the best place to go to is [StackOverflow](https://stackove
Further, general questions and discussions can also take place on GitHub in this [repository](https://github.com/mldsveda/PyScrappy).
## Discussion and Development
+
Most development discussions take place on GitHub in this [repository](https://github.com/mldsveda/PyScrappy).
Also visit the official documentation of [PyScrappy](https://pyscrappy.netlify.app/) for more information.
## Contributing to PyScrappy
+
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

-If you are simply looking to start working with the PyScrappy codebase, navigate to the [GitHub "issues" tab](https://github.com/mldsveda/PyScrappy/issues) and start looking through interesting issues.
+If you are simply looking to start working with the PyScrappy codebase, navigate to the GitHub ["issues"](https://github.com/mldsveda/PyScrappy/issues) tab and start looking through interesting issues.

78
## End Notes
-*Learn More about this package on [Medium](https://medium.com/analytics-vidhya/web-scraping-in-python-using-the-all-new-pyscrappy-5c136ed6906b).*

-### ***This package is solely made for educational and research purposes.***
+_Learn More about this package on [Medium](https://medium.com/analytics-vidhya/web-scraping-in-python-using-the-all-new-pyscrappy-5c136ed6906b)._
+
+### **_This package is solely made for educational and research purposes._**