CrowlerGo is a high-performance, concurrent web crawler written in Go. It is designed to efficiently map website structures, discover subpages, and export results in various formats.
- Concurrent Crawling: Uses goroutines for fast, efficient crawling (see the worker-pool sketch after this list).
- Configurable Concurrency: Fine-grained control over the number of concurrent workers.
- Depth Control: Limit how many link levels deep the crawl goes from the starting URL.
- Result Limiting: Stop after a specified number of results.
- Output Formats: Supports CSV (with discovery paths) and TXT formats.
- Resumable/Incremental: Can load previously visited URLs to avoid re-crawling them, or save only newly discovered URLs.
- Path Tracking: Optionally tracks and visually represents the discovery path for each URL.
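
The concurrency model itself is not spelled out in this README. As a rough illustration only (not CrowlerGo's actual code), a goroutine worker pool draining a shared URL channel looks roughly like this:

```go
// Minimal sketch of a goroutine worker pool for crawling. Fetching and link
// extraction are intentionally simplified; this is not CrowlerGo's implementation.
package main

import (
	"fmt"
	"net/http"
	"sync"
)

func main() {
	// Buffered channel acts as the work queue of URLs to visit.
	urls := make(chan string, 100)
	var wg sync.WaitGroup

	const workers = 4 // stands in for the -concurrency setting
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range urls {
				resp, err := http.Get(u)
				if err != nil {
					fmt.Println("error fetching", u, ":", err)
					continue
				}
				resp.Body.Close()
				fmt.Println("visited", u, "->", resp.Status)
				// A real crawler would parse the body here and enqueue newly
				// discovered links, respecting depth and result limits.
			}
		}()
	}

	urls <- "https://example.com"
	close(urls)
	wg.Wait()
}
```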
Run the crawler by providing a starting URL:
```bash
./crowlergo <url> [options]

# Basic crawl
./crowlergo https://example.com

# Crawl with specific limits and output
./crowlergo https://example.com -depth 3 -limit 500 -output my_results.csv

# High concurrency crawl
./crowlergo https://example.com -concurrency 200
```

| Flag | Type | Default | Description |
|---|---|---|---|
| -output | string | results.csv | Path to the output file. |
| -format | string | csv | Output format: txt or csv. |
| -depth | int | 5 | Maximum crawling depth. |
| -limit | int | 1000 | Maximum number of results to collect. |
| -concurrency | int | 100 | Number of concurrent workers. |
| -subpages | bool | false | Include all subpages (full URLs) in results, not just unique hosts. |
| -input | string | "" | File containing already visited URLs to seed the crawler. |
| -new-only | bool | false | If true, only saves newly discovered URLs (does not merge with the input file). |
| -no-path | bool | false | Disable discovery path tracking (saves memory and improves performance). |
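
The flag names, types, and defaults above map naturally onto Go's standard flag package. The following is only a hypothetical sketch of that mapping, not CrowlerGo's actual option handling; it assumes the starting URL is taken as the first positional argument before the flags.

```go
// Hypothetical sketch of declaring the documented options with the standard
// flag package; the real CLI may parse its arguments differently.
package main

import (
	"flag"
	"fmt"
	"os"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: crowlergo <url> [options]")
		os.Exit(1)
	}
	startURL := os.Args[1] // the positional starting URL

	fs := flag.NewFlagSet("crowlergo", flag.ExitOnError)
	output := fs.String("output", "results.csv", "Path to the output file")
	format := fs.String("format", "csv", "Output format: txt or csv")
	depth := fs.Int("depth", 5, "Maximum crawling depth")
	limit := fs.Int("limit", 1000, "Maximum number of results to collect")
	concurrency := fs.Int("concurrency", 100, "Number of concurrent workers")
	subpages := fs.Bool("subpages", false, "Include all subpages, not just unique hosts")
	input := fs.String("input", "", "File containing already visited URLs")
	newOnly := fs.Bool("new-only", false, "Only save newly discovered URLs")
	noPath := fs.Bool("no-path", false, "Disable discovery path tracking")
	_ = fs.Parse(os.Args[2:])

	fmt.Println(startURL, *output, *format, *depth, *limit, *concurrency,
		*subpages, *input, *newOnly, *noPath)
}
```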
To build the project for your current platform:
```bash
./build.sh
```
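
If you prefer to call the Go toolchain directly, a plain go build from the repository root should produce an equivalent binary, assuming the repository root is the module root (the output name below is an assumption):

```bash
# Assumes the repository root is the Go module root; the binary name is an assumption.
go build -o crowlergo .
```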