@scrapinghub

Scrapinghub

Turn web content into useful data

Pinned

  1. splash Public

    Lightweight, scriptable browser as a service with an HTTP API

    Python · 4.2k stars · 517 forks

  2. dateparser Public

    Python parser for human-readable dates

    Python · 2.7k stars · 480 forks

  3. python-scrapinghub Public

    A client interface for Scrapinghub's API

    Python · 204 stars · 61 forks

  4. extruct Public

    Extract embedded metadata from HTML markup

    Python · 931 stars · 118 forks

  5. spidermon Public

    Scrapy extension for monitoring spider execution.

    Python · 546 stars · 100 forks

  6. python-crfsuite Public

    A Python binding for CRFsuite

    Python · 772 stars · 222 forks

Repositories

Showing 10 of 183 repositories
  • docker-images Public
    Dockerfile · 33 stars · 8 forks · 0 issues · 4 pull requests · Updated Oct 10, 2025
  • shub-workflow Public
    Python · 15 stars · BSD-3-Clause · 14 forks · 2 issues · 1 pull request · Updated Oct 6, 2025
  • price-parser Public

    Extract price amount and currency symbol from a raw text string

    Python · 340 stars · BSD-3-Clause · 51 forks · 17 issues (4 issues need help) · 9 pull requests · Updated Oct 6, 2025
  • web-poet Public

    Web scraping Page Objects core library

    Python · 101 stars · BSD-3-Clause · 16 forks · 17 issues (1 issue needs help) · 14 pull requests · Updated Oct 3, 2025
  • andi Public

    Library for annotation-based dependency injection

    Python · 23 stars · BSD-3-Clause · 6 forks · 4 issues · 1 pull request · Updated Oct 3, 2025
  • python-scrapinghub Public

    A client interface for Scrapinghub's API

    Python · 204 stars · BSD-3-Clause · 61 forks · 23 issues · 2 pull requests · Updated Oct 3, 2025
  • extruct Public

    Extract embedded metadata from HTML markup

    Python · 931 stars · BSD-3-Clause · 118 forks · 39 issues (1 issue needs help) · 15 pull requests · Updated Oct 1, 2025
  • scrapy-poet Public

    Page Object pattern for Scrapy

    Python · 121 stars · BSD-3-Clause · 28 forks · 14 issues (1 issue needs help) · 6 pull requests · Updated Sep 25, 2025
  • article-extraction-benchmark Public

    Article extraction benchmark: dataset and evaluation scripts

    Python · 332 stars · MIT · 31 forks · 1 issue · 1 pull request · Updated Sep 23, 2025
  • scrapyrt Public

    HTTP API for Scrapy spiders

    Python · 870 stars · BSD-3-Clause · 162 forks · 24 issues (2 issues need help) · 6 pull requests · Updated Sep 22, 2025