Haskell Web Crawler

A minimal web crawler prototype written in Haskell.

Features

  • Fetches web pages via HTTP
  • Extracts links from HTML
  • Tracks visited URLs to avoid duplicates
  • Depth-limited crawling
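
The visited-set and depth-limit features can be illustrated with a small sketch. This is not the code in Main.hs: the names `crawl`, `crawlGraph`, and `fetchLinks` are hypothetical, and page fetching is abstracted behind a function argument so the traversal can be tried without network access. It uses only `base` and `containers`, and it walks the links breadth-first, one depth level at a time.

```haskell
import Data.Functor.Identity (runIdentity)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Url = String

-- | Breadth-first crawl. Returns every URL discovered within maxDepth
-- link hops of the start URL. fetchLinks is a hypothetical stand-in for
-- "download the page and extract its links".
crawl :: Monad m => (Url -> m [Url]) -> Int -> Url -> m (Set.Set Url)
crawl fetchLinks maxDepth start = go 0 (Set.singleton start) [start]
  where
    go depth seen frontier
      | depth >= maxDepth || null frontier = pure seen
      | otherwise = do
          found <- concat <$> mapM fetchLinks frontier
          -- Keep only URLs not seen before, so no page is fetched twice.
          let new = Set.fromList found `Set.difference` seen
          go (depth + 1) (Set.union seen new) (Set.toList new)

-- | Pure variant over an in-memory link graph, handy for experiments.
crawlGraph :: Map.Map Url [Url] -> Int -> Url -> Set.Set Url
crawlGraph graph maxDepth start =
  runIdentity (crawl (pure . neighbours) maxDepth start)
  where
    neighbours url = Map.findWithDefault [] url graph
```

Because `crawl` is polymorphic in the monad, the same function could be run in `IO` with a real fetcher or purely over a `Map`, as `crawlGraph` does.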

Build & Run

cabal build
cabal run

Or with Stack:

stack build
stack run

Usage

When run, the program prompts for a starting URL. Enter any valid HTTP or HTTPS URL, and it crawls links up to depth 2 (the limit is configurable in Main.hs).

Example:

Enter starting URL:
https://example.com
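
One way the prompt described above might be written is sketched here. The names `maxDepth`, `parseStartUrl`, and `promptStartUrl` are assumptions for illustration, as is the behaviour of asking again after invalid input; the actual Main.hs may differ.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, isPrefixOf)
import System.IO (hFlush, stdout)

-- Hypothetical constant mirroring the default depth of 2.
maxDepth :: Int
maxDepth = 2

-- | Trim surrounding whitespace and accept only http:// or https:// URLs.
parseStartUrl :: String -> Maybe String
parseStartUrl raw
  | any (`isPrefixOf` url) ["http://", "https://"] = Just url
  | otherwise = Nothing
  where
    url = dropWhileEnd isSpace (dropWhile isSpace raw)

-- | Prompt repeatedly until the user enters an acceptable URL.
promptStartUrl :: IO String
promptStartUrl = do
  putStrLn "Enter starting URL:"
  hFlush stdout
  line <- getLine
  case parseStartUrl line of
    Just url -> pure url
    Nothing -> do
      putStrLn "Please enter an http:// or https:// URL."
      promptStartUrl
```

The scheme check is deliberately shallow: it rejects obviously wrong input but does not verify that the rest of the string is a well-formed URL.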

Dependencies

  • http-conduit: HTTP client
  • tagsoup: HTML parsing
  • containers: Set data structure
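
As an illustration of how the tagsoup dependency could be used for link extraction, here is a minimal sketch. The function name `extractLinks` is hypothetical and is not taken from the project; the block needs the `tagsoup` package and leaves relative links unresolved.

```haskell
import Text.HTML.TagSoup (Tag (TagOpen), parseTags)

-- | Collect the non-empty href value of every <a> tag, in document order.
extractLinks :: String -> [String]
extractLinks html =
  [ href
  | TagOpen "a" attrs <- parseTags html   -- keep only opening <a> tags
  , Just href <- [lookup "href" attrs]    -- skip anchors without an href
  , not (null href)
  ]
```

A crawler would typically resolve each result against the URL of the page it came from before adding it to the visited set.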