ProxyAtlas

ProxyAtlas is a high-performance Go CLI toolkit for collecting and validating free proxies. It runs a strict-first pipeline and falls back to adaptive checks when the strict pass yields no healthy proxies.

What It Is

ProxyAtlas has two binaries:

  • proxyharvest: collects proxies from curated public sources, normalizes/dedupes them, scores sources, and runs fast prefilter checks.
  • proxycheck: runs strict health/anonymity/stability validation, produces ranked diagnostics, and can auto-run adaptive fallback.

Architecture

  1. Ingestion: raw_text and json_api source adapters.
  2. Normalization: canonical scheme://host:port with protocol validation.
  3. Source reliability: health tracking, fail-threshold gating, cooldown skips.
  4. Prefilter: fast multi-target checks (>=1 pass by default).
  5. Strict check: multi-target + latency + anonymity + stability retries.
  6. Adaptive fallback: triggered when strict healthy count is zero (or forced with --mode adaptive).
  7. Output: JSONL + TXT + report JSON.
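
As an illustration of step 2, here is a minimal sketch of normalization and deduplication. It is not taken from the ProxyAtlas source: the function names normalizeProxy and dedupe, the allowedSchemes set, and the default-scheme behavior for bare host:port entries are assumptions made for this example.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
	"strings"
)

// allowedSchemes mirrors the protocols passed to --protocols in the Quick Start.
var allowedSchemes = map[string]bool{"http": true, "https": true, "socks4": true, "socks5": true}

// normalizeProxy (hypothetical helper) converts a raw entry such as
// "1.2.3.4:8080" or "SOCKS5://Host:1080" into canonical scheme://host:port.
// defaultScheme is applied when the raw entry carries no scheme.
func normalizeProxy(raw, defaultScheme string) (string, error) {
	s := strings.TrimSpace(raw)
	if !strings.Contains(s, "://") {
		s = defaultScheme + "://" + s
	}
	u, err := url.Parse(s)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	scheme := strings.ToLower(u.Scheme)
	if !allowedSchemes[scheme] {
		return "", fmt.Errorf("unsupported scheme %q", scheme)
	}
	host := strings.ToLower(u.Hostname())
	port, err := strconv.Atoi(u.Port())
	if host == "" || err != nil || port < 1 || port > 65535 {
		return "", fmt.Errorf("invalid host or port in %q", raw)
	}
	// Re-rendering the port drops leading zeros, so "08080" and "8080" dedupe.
	return scheme + "://" + net.JoinHostPort(host, strconv.Itoa(port)), nil
}

// dedupe (hypothetical helper) normalizes every entry, drops invalid ones,
// and keeps the first occurrence of each canonical URL.
func dedupe(raws []string, defaultScheme string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, r := range raws {
		p, err := normalizeProxy(r, defaultScheme)
		if err != nil || seen[p] {
			continue
		}
		seen[p] = true
		out = append(out, p)
	}
	return out
}
```

Deduplicating on the canonical form rather than the raw string means the same proxy listed by two sources in different spellings is counted once.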

Install

go build -o bin/proxyharvest ./cmd/proxyharvest
go build -o bin/proxycheck ./cmd/proxycheck

Quick Start

PowerShell

.\bin\proxyharvest.exe `
  --protocols http,https,socks4,socks5 `
  --max-collect 50000 `
  --max-per-protocol 15000 `
  --fetch-workers 64 `
  --source-timeout 12s `
  --source-fail-threshold 3 `
  --source-cooldown 12h `
  --prefilter-profile fast `
  --prefilter-min-pass 1 `
  --prefilter-timeout 1800ms `
  --prefilter-workers 200 `
  --sources-file configs/sources.json `
  --targets-file configs/targets.json `
  --out-jsonl data/harvest/latest.jsonl `
  --out-txt data/harvest/latest.txt

.\bin\proxycheck.exe `
  --mode strict `
  --in-jsonl data/harvest/latest.jsonl `
  --in-txt data/harvest/latest.txt `
  --workers 300 `
  --max-eval 2000 `
  --connect-timeout 1800ms `
  --request-timeout 4500ms `
  --min-pass 2 `
  --max-latency 4500ms `
  --stability-retries 2 `
  --stability-gap 2s `
  --adaptive-min-pass 1 `
  --adaptive-max-latency 12s `
  --adaptive-no-stability `
  --targets-profile resilient `
  --targets-file configs/targets.json `
  --out-jsonl data/check/latest.jsonl `
  --out-txt data/check/healthy.txt `
  --out-report data/check/report.json `
  --adaptive-out-jsonl data/check/latest_adaptive.jsonl `
  --adaptive-out-txt data/check/healthy_adaptive.txt `
  --adaptive-out-report data/check/report_adaptive.json

Linux/macOS

./bin/proxyharvest \
  --protocols http,https,socks4,socks5 \
  --max-collect 50000 \
  --max-per-protocol 15000 \
  --fetch-workers 64 \
  --source-timeout 12s \
  --source-fail-threshold 3 \
  --source-cooldown 12h \
  --prefilter-profile fast \
  --prefilter-min-pass 1 \
  --prefilter-timeout 1800ms \
  --prefilter-workers 200 \
  --sources-file configs/sources.json \
  --targets-file configs/targets.json \
  --out-jsonl data/harvest/latest.jsonl \
  --out-txt data/harvest/latest.txt

./bin/proxycheck \
  --mode strict \
  --in-jsonl data/harvest/latest.jsonl \
  --in-txt data/harvest/latest.txt \
  --workers 300 \
  --max-eval 2000 \
  --connect-timeout 1800ms \
  --request-timeout 4500ms \
  --min-pass 2 \
  --max-latency 4500ms \
  --stability-retries 2 \
  --stability-gap 2s \
  --adaptive-min-pass 1 \
  --adaptive-max-latency 12s \
  --adaptive-no-stability \
  --targets-profile resilient \
  --targets-file configs/targets.json \
  --out-jsonl data/check/latest.jsonl \
  --out-txt data/check/healthy.txt \
  --out-report data/check/report.json \
  --adaptive-out-jsonl data/check/latest_adaptive.jsonl \
  --adaptive-out-txt data/check/healthy_adaptive.txt \
  --adaptive-out-report data/check/report_adaptive.json

Strict vs Adaptive

  • Strict mode validates with stronger requirements (default trust-first profile).
  • If the strict healthy count is zero, ProxyAtlas automatically runs the adaptive fallback; if strict finds at least one healthy proxy, the fallback is skipped unless forced with --mode adaptive.
  • Adaptive outputs are separated to avoid mixing confidence levels.
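
The strict-first decision can be sketched as follows. This is a simplified illustration, not the proxycheck implementation: the Result and Policy types and the selectHealthy function are hypothetical, and the sketch reuses one set of measurements for both passes, whereas the real tool may re-run checks in adaptive mode. The threshold values come from the Quick Start flags.

```go
package main

import "time"

// Result is a hypothetical, trimmed-down outcome for one proxy.
type Result struct {
	ProxyURL  string
	PassCount int
	Latency   time.Duration
	Stable    bool
}

// Policy groups the thresholds; field names echo the CLI flags but are assumptions.
type Policy struct {
	MinPass          int
	MaxLatency       time.Duration
	RequireStability bool
}

// Values taken from the Quick Start: --min-pass 2 / --max-latency 4500ms for
// strict, --adaptive-min-pass 1 / --adaptive-max-latency 12s / --adaptive-no-stability.
var (
	strictPolicy   = Policy{MinPass: 2, MaxLatency: 4500 * time.Millisecond, RequireStability: true}
	adaptivePolicy = Policy{MinPass: 1, MaxLatency: 12 * time.Second, RequireStability: false}
)

// accepts reports whether a result satisfies every threshold in the policy.
func (p Policy) accepts(r Result) bool {
	if r.PassCount < p.MinPass || r.Latency > p.MaxLatency {
		return false
	}
	return !p.RequireStability || r.Stable
}

// filter returns the results accepted by the policy.
func filter(rs []Result, p Policy) []Result {
	var out []Result
	for _, r := range rs {
		if p.accepts(r) {
			out = append(out, r)
		}
	}
	return out
}

// selectHealthy applies the strict policy first and uses the adaptive policy
// only when strict yields nothing. The returned mode lets callers keep the
// two confidence levels in separate outputs.
func selectHealthy(rs []Result, strict, adaptive Policy) ([]Result, string) {
	if h := filter(rs, strict); len(h) > 0 {
		return h, "strict"
	}
	return filter(rs, adaptive), "adaptive"
}
```

Returning the mode alongside the results is what makes it possible to write strict and adaptive outputs to different files instead of merging them.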

Tuning

  • More speed: lower --max-eval, reduce --stability-retries, reduce timeouts.
  • More confidence: raise --min-pass, keep stability retries, lower --max-latency.
  • Better source quality: keep --source-fail-threshold low and --source-cooldown high.
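
To show how --source-fail-threshold and --source-cooldown interact, here is a minimal sketch of fail-threshold gating with a cooldown. The SourceHealth and Gate types and the exact counting rule (consecutive failures, reset on success) are assumptions for illustration; proxyharvest may track source health differently.

```go
package main

import "time"

// SourceHealth is a hypothetical per-source record.
type SourceHealth struct {
	ConsecutiveFails int
	LastFailure      time.Time
}

// Gate holds the two tuning knobs discussed above.
type Gate struct {
	FailThreshold int           // corresponds to --source-fail-threshold
	Cooldown      time.Duration // corresponds to --source-cooldown
}

// ShouldSkip reports whether the source has reached the failure threshold
// and is still inside its cooldown window at time now.
func (g Gate) ShouldSkip(h SourceHealth, now time.Time) bool {
	if h.ConsecutiveFails < g.FailThreshold {
		return false
	}
	return now.Sub(h.LastFailure) < g.Cooldown
}

// Record returns the updated health after one fetch attempt.
// A success clears the failure streak (an assumption of this sketch).
func Record(h SourceHealth, ok bool, now time.Time) SourceHealth {
	if ok {
		return SourceHealth{}
	}
	h.ConsecutiveFails++
	h.LastFailure = now
	return h
}
```

With a threshold of 3 and a cooldown of 12h, as in the Quick Start, a source that fails three times in a row is skipped for the next twelve hours and then retried.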

Output Schema

Harvest JSONL (data/harvest/latest.jsonl)

proxy_url, scheme, host, port, sources[], source_hits, source_score, prefilter_ok, prefilter_pass_count, prefilter_checks_total, prefilter_reason, timestamps.

Check JSONL (data/check/latest.jsonl)

proxy_url, mode, attempt_rounds, pass_count, checks_total, avg_latency_ms, p95_latency_ms, success_targets[], failed_targets[], exit_ip, local_ip, anonymous, header_leaks[], stable, score, tier, status, status_reason, rejection_stage.
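
For consumers of the check output, the sketch below decodes Check JSONL into a Go struct and keeps proxies flagged both anonymous and stable. Only the JSON field names come from the schema above; the Go types (for example float64 for avg_latency_ms) and the helper names parseJSONL and anonymousStable are assumptions.

```go
package main

import (
	"encoding/json"
	"io"
	"strings"
)

// CheckRecord maps a subset of the Check JSONL fields.
// Field types are assumed, not confirmed by the project.
type CheckRecord struct {
	ProxyURL     string   `json:"proxy_url"`
	Mode         string   `json:"mode"`
	PassCount    int      `json:"pass_count"`
	ChecksTotal  int      `json:"checks_total"`
	AvgLatencyMs float64  `json:"avg_latency_ms"`
	Anonymous    bool     `json:"anonymous"`
	Stable       bool     `json:"stable"`
	HeaderLeaks  []string `json:"header_leaks"`
	StatusReason string   `json:"status_reason"`
}

// parseJSONL decodes a stream of JSON objects, one per line.
// json.Decoder skips the whitespace between objects, so blank lines are fine.
func parseJSONL(data string) ([]CheckRecord, error) {
	dec := json.NewDecoder(strings.NewReader(data))
	var out []CheckRecord
	for {
		var rec CheckRecord
		err := dec.Decode(&rec)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, rec)
	}
}

// anonymousStable returns the proxy URLs of records flagged anonymous and stable.
func anonymousStable(recs []CheckRecord) []string {
	var urls []string
	for _, r := range recs {
		if r.Anonymous && r.Stable {
			urls = append(urls, r.ProxyURL)
		}
	}
	return urls
}
```

Fields not listed in the struct are ignored by encoding/json, so the sketch keeps working if the schema gains fields.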

Troubleshooting

| Symptom | Meaning | Action |
| --- | --- | --- |
| healthy_count = 0 in strict report | Free proxies are mostly dead or unstable under strict rules | Check the adaptive report and tune strict thresholds |
| status_reason = insufficient_pass dominates | Proxies pass too few targets | Increase source refresh; lower the strict threshold for fallback |
| status_reason = target_errors dominates | Connectivity problems or timeouts to targets | Increase request/connect timeouts and verify the network path |
| Many sources skipped by cooldown | Source health gating is working | Wait for the cooldown to expire or adjust --source-cooldown |
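
To see which status_reason dominates, the values in the Check JSONL can be tallied. The following is a rough sketch under the assumption that status_reason is a plain string field; tallyReasons and dominant are hypothetical helpers, not part of the ProxyAtlas CLI.

```go
package main

import (
	"encoding/json"
	"io"
	"strings"
)

// tallyReasons counts status_reason values across JSONL records.
// Records without the field are counted under the empty string.
func tallyReasons(jsonl string) (map[string]int, error) {
	counts := make(map[string]int)
	dec := json.NewDecoder(strings.NewReader(jsonl))
	for {
		var rec struct {
			StatusReason string `json:"status_reason"`
		}
		err := dec.Decode(&rec)
		if err == io.EOF {
			return counts, nil
		}
		if err != nil {
			return nil, err
		}
		counts[rec.StatusReason]++
	}
}

// dominant returns the most frequent reason and its share of all records.
// Ties are broken alphabetically so the result is deterministic.
func dominant(counts map[string]int) (reason string, share float64) {
	total, best := 0, -1
	for r, n := range counts {
		total += n
		if n > best || (n == best && r < reason) {
			reason, best = r, n
		}
	}
	if total == 0 {
		return "", 0
	}
	return reason, float64(best) / float64(total)
}
```

Feeding the contents of data/check/latest.jsonl to tallyReasons and then calling dominant gives the reason to look up in the troubleshooting table.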

Benchmarks

Observed in local runs (Windows, Go 1.25):

  • Harvest: ~15k+ lines, ~20-60s depending on source/API responsiveness.
  • Strict check: 200 proxies typically complete in ~15-30s with bounded workers.

Limitations Of Free Proxies

  • Public free proxies are volatile and frequently dead.
  • Regional routing and target rate limits can heavily affect pass rates.
  • Finding zero strict-healthy proxies is a normal outcome in many time windows.

Ethical / Legal Notice

Use this project only for authorized testing and lawful automation. You are responsible for compliance with local law, provider terms, and target system policies.

Release Process

See docs/release-checklist.md.

Release artifacts are published as proxyatlas-<os>-<arch>.tar.gz with matching proxyatlas-<os>-<arch>.sha256 files, plus an aggregate checksums.txt. Validate downloaded release files before use (sha256sum -c checksums.txt).
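
Where sha256sum is not available, the same check can be done in Go. This is a minimal sketch that assumes checksums.txt uses the standard sha256sum line format (hex digest, whitespace, filename), which the sha256sum -c command above implies; parseChecksums and matches are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// parseChecksums reads sha256sum-style lines of the form
// "<hex digest>  <filename>" and returns a filename-to-digest map.
// Lines that do not have exactly two fields are ignored.
func parseChecksums(text string) map[string]string {
	sums := make(map[string]string)
	for _, line := range strings.Split(text, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		// sha256sum prefixes the name with "*" in binary mode.
		name := strings.TrimPrefix(fields[1], "*")
		sums[name] = strings.ToLower(fields[0])
	}
	return sums
}

// matches reports whether data hashes to the expected hex digest.
func matches(data []byte, wantHex string) bool {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]) == strings.ToLower(wantHex)
}
```

In practice, read checksums.txt and the downloaded archive with os.ReadFile, look up the archive name in the map, and refuse to install when matches returns false.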

License

MIT
