Maximum Coverage URL Crawler for Bug Bounty & Security Research
A powerful bash script that aggregates URLs from 20+ sources including web archives, API endpoints, crawlers, and intelligence platforms. Perfect for reconnaissance, bug bounty hunting, and security assessments.
- 20+ Data Sources: Wayback Machine, Common Crawl, URLScan.io, AlienVault OTX, VirusTotal, and many more
- Multi-API Integration: Leverages paid and free APIs for comprehensive coverage
- Intelligent Crawling: Combines passive and active reconnaissance techniques
- Historical Data: Fetches complete archive history from Wayback Machine (all years)
- Rate Limit Handling: Built-in retry logic and rate limit management
- Progress Tracking: Real-time status updates with color-coded output
- Bulk Processing: Process multiple domains from a file
- Smart Filtering: Removes duplicates, static files, and social media noise
Install the core dependencies and the crawling tools:

```bash
sudo apt install jq curl

# Waybackurls
go install github.com/tomnomnom/waybackurls@latest
# GAU (GetAllUrls)
go install github.com/lc/gau/v2/cmd/gau@latest
# Hakrawler
go install github.com/hakluke/hakrawler@latest
# Katana
go install github.com/projectdiscovery/katana/cmd/katana@latest
# GoSpider
go install github.com/jaeles-project/gospider@latest
# ParamSpider
git clone https://github.com/devanshbatham/ParamSpider
cd ParamSpider
pip install -r requirements.txt
```

The script supports multiple API providers. Edit the script header and replace the placeholder text with your actual API keys:

```bash
export VIRUSTOTAL_API_KEY="add your virustotal key"
export SECURITYTRAILS_API_KEY="add your security trails key"
export GITHUB_TOKEN="add your github token"
export CHAOS_API_KEY="add your chaos key"
export ALIENVAULT_API_KEY="add your alienvault key"
export URLSCAN_API_KEY="add your urlscan key"
export SHODAN_API_KEY="add your shodan key"
export CENSYS_API_ID="add your censys api id"
export CENSYS_API_SECRET="add your censys api secret"
export GOOGLE_API_KEY="add your google api key"
export GOOGLE_CSE_ID="add your google cse id"
export TRELLO_API_KEY="add your trello api key"
export TRELLO_TOKEN="add your trello token"
export INTELX_API_KEY="add your intelx key"
```

Where to get API keys:

- AlienVault OTX: https://otx.alienvault.com/ (Free)
- URLScan.io: https://urlscan.io/about/api (Free tier available)
- VirusTotal: https://www.virustotal.com/gui/join-us (Free)
- SecurityTrails: https://securitytrails.com/corp/api (Free tier)
- Shodan: https://account.shodan.io/ (Paid)
- GitHub: https://github.com/settings/tokens (Free)
- Censys: https://search.censys.io/account/api (Free tier)
- Intelligence X: https://intelx.io/signup (Free tier)
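Before a full run, it can help to confirm that a key is actually exported and accepted. Here is a minimal sanity check using URLScan.io's public search endpoint as one example (this is not necessarily the exact request the script makes):

```bash
# Fail fast if the key is not set in the current shell
echo "${URLSCAN_API_KEY:?URLSCAN_API_KEY is not set}"

# A tiny authenticated search; a JSON response with a "total" field means the key was accepted
curl -s -H "API-Key: $URLSCAN_API_KEY" \
  "https://urlscan.io/api/v1/search/?q=domain:example.com&size=1" | jq '.total'
```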
To crawl a single domain:

```bash
./rocket-crawl.sh geturls example.com
```

For bulk processing, create a file with one domain per line:

```
# domains.txt
example.com
subdomain1.example.com
subdomain2.example.com
```

Then run:

```bash
./rocket-crawl.sh getsuburls domains.txt
```

The script generates:
- `urls_<domain>.txt` - URLs for individual domains
- `all_urls.txt` - Consolidated results for bulk processing
Example output:

```
==========================================
[✓] CRAWL COMPLETE!
==========================================
Domain: example.com
Total URLs: 45,892
Time taken: 287s
Output file: urls_example.com.txt
==========================================
```
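The output file is plain text with one URL per line, so standard shell tools are enough for quick triage. A few illustrative one-liners (the patterns and filenames below are examples, not part of the script):

```bash
# Count the URLs collected for the domain
wc -l urls_example.com.txt

# Keep only URLs with query parameters, often the most interesting for testing
grep -F '?' urls_example.com.txt | sort -u > urls_with_params.txt

# Flag potentially sensitive file types for a closer look
grep -Ei '\.(json|xml|conf|bak|sql|zip)(\?|$)' urls_example.com.txt | sort -u
```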
Data sources queried:

- Wayback Machine (complete history)
- Archive.today
- Common Crawl (5 latest indexes)
- AlienVault OTX (paginated)
- URLScan.io (full pagination up to 100k results)
- VirusTotal (URLs, subdomains, communicating files)
- SecurityTrails (DNS + history)
- Shodan (DNS + search)
- Intelligence X
- Certificate Transparency (crt.sh, Censys)
- Waybackurls
- GAU (GetAllUrls)
- Hakrawler
- Katana
- GoSpider
- ParamSpider
- Waymore
- GitHub (code + gists)
- DNS Dumpster
- Robots.txt & Sitemaps
- Trello public boards
- Paste sites (Pastebin, etc.)
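For a sense of what one of these passive queries looks like, here is a minimal standalone call to the Wayback Machine CDX API (public endpoint and parameters as documented by archive.org; the script's own requests may differ in pagination and filtering):

```bash
# List every archived URL the Wayback Machine holds for the target, de-duplicated by URL key
curl -s "https://web.archive.org/cdx/search/cdx?url=example.com/*&output=text&fl=original&collapse=urlkey" \
  | sort -u > wayback_urls.txt
```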
All HTTP requests have built-in timeouts:
- Standard requests: 180 seconds
- Archive requests: 300 seconds (due to large datasets)
- Connection timeout: 30 seconds
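These limits map onto standard curl options. A sketch of what such a request wrapper might look like (the function names and retry values here are illustrative, not taken from the script):

```bash
# 30s to establish the connection, 180s for the whole transfer, with basic retry logic
fetch() {
  curl -s --connect-timeout 30 --max-time 180 --retry 3 --retry-delay 5 "$1"
}

# Archive endpoints get a larger overall budget because responses can be very large
fetch_archive() {
  curl -s --connect-timeout 30 --max-time 300 --retry 3 --retry-delay 5 "$1"
}
```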
The script automatically filters out:
- Static files (images, fonts, media)
- Common social media platforms
- Duplicate URLs
- Non-target domain URLs
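A rough idea of what this post-filtering stage looks like in shell (the extension and platform lists are illustrative, and `all_sources_raw.txt` is a hypothetical intermediate file, not a name used by the script):

```bash
# Keep only in-scope URLs, drop static assets and social media links, then de-duplicate
grep -Ei 'https?://([a-z0-9-]+\.)*example\.com' all_sources_raw.txt \
  | grep -Evi '\.(png|jpe?g|gif|svg|ico|css|woff2?|ttf|eot|mp4|mp3)(\?|$)' \
  | grep -Evi '(facebook|twitter|instagram|linkedin|youtube)\.com' \
  | sort -u > urls_example.com.txt
```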
Before contributing or sharing:
- Remove all API keys from the script header
- Never commit API keys to version control
- Use environment variables for sensitive data
- Consider using a `.env` file for local development
```bash
# Create a .env file (add to .gitignore)
cat > .env << EOF
export ALIENVAULT_API_KEY="your_key"
export URLSCAN_API_KEY="your_key"
# ... other keys
EOF
# Load before running
source .env
./rocket-crawl.sh geturls example.com
```

Contributions are welcome! Areas for improvement:
- Additional data sources
- Performance optimizations
- Better error handling
- Output format options (JSON, CSV)
- Proxy support
Troubleshooting:

- Missing dependencies: install jq and curl with `sudo apt install jq curl`
- Rate limiting: wait for the cooldown period or reduce concurrent requests
- API errors: verify your API keys are set correctly and have valid quotas
- Slow or long-running crawls:
  - Reduce crawling depth in active crawlers
  - Disable non-essential data sources
  - Run multiple domains in parallel (separate terminal windows, or see the sketch below)
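One way to get that parallelism without juggling terminal windows is to split the domain list into chunks and background each run (GNU coreutils `split` assumed; the chunk count is arbitrary):

```bash
# Split domains.txt into 3 roughly equal chunks without breaking lines
split -n l/3 domains.txt chunk_

# Launch one crawl per chunk in the background, then wait for all of them to finish
for f in chunk_*; do
  ./rocket-crawl.sh getsuburls "$f" &
done
wait
```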
For issues, questions, or feature requests, please open an issue on GitHub.
Happy Hunting! 🎯