POOPAK | TOR Hidden Service Crawler
An experimental application for crawling, scanning, and gathering data from TOR hidden services.
- Multi-level in-depth crawling using cURL
- Link extraction and email/BTC/ETH/XMR address detection
- EXIF metadata extraction
- Screenshot capture (using Splash)
- Subject detection (using spaCy)
- Port scanning
- Report generation (CSV/PDF)
- Full-text search with Elasticsearch
- Language detection
- Docker-based deployment with web UI
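The email/BTC/ETH/XMR address detection listed above can be illustrated with a few regular expressions. This is a minimal sketch, not the project's actual detector: the pattern set and the `detect_addresses` function name are assumptions, and the patterns only check the shape of an address (prefix, alphabet, length), not its checksum, so false positives are possible.

```python
import re

# Simplified, shape-only patterns (assumptions, not the project's real rules).
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    # Legacy Base58 addresses (start with 1 or 3) or lowercase bech32 (bc1...).
    "btc": re.compile(
        r"\b(?:[13][1-9A-HJ-NP-Za-km-z]{25,34}|bc1[02-9ac-hj-np-z]{11,71})\b"
    ),
    # 0x followed by 40 hex digits.
    "eth": re.compile(r"\b0x[0-9a-fA-F]{40}\b"),
    # Standard Monero addresses: 95 Base58 characters starting with 4.
    "xmr": re.compile(r"\b4[0-9AB][1-9A-HJ-NP-Za-km-z]{93}\b"),
}


def detect_addresses(text: str) -> dict[str, list[str]]:
    """Return de-duplicated matches per address type, in order of appearance."""
    found = {}
    for kind, pattern in PATTERNS.items():
        # dict.fromkeys keeps first-seen order while dropping repeats.
        found[kind] = list(dict.fromkeys(pattern.findall(text)))
    return found
```

Calling `detect_addresses(page_text)` on the extracted text of a crawled page returns a dictionary with one list per address type; a stricter detector would additionally verify checksums before storing a match.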
# Development
make dev-up # Start all services
make dev-logs # View logs
Access at: http://localhost
# Production
cp .env.example .env # Configure passwords
make prod-up # Start all services
make health # Check status
- nginx: Web server & reverse proxy
- web-app: Flask application
- mongodb: Database
- redis: Message queue
- elasticsearch: Search engine
- torpool: Tor proxy pool
- splash: Screenshot service
- spacy: NLP service
- workers: Background processing (crawler, detector, app, panel)
- Separate dev/prod Docker configurations
- Health checks for all services
- Network isolation in production
- Automatic service dependencies
- Hot-reload in development
# Development
make dev-up # Start
make dev-logs # View logs
make dev-shell # Open shell
make dev-down # Stop
# Production
make prod-up # Start
make prod-logs # View logs
make health # Check health
make backup # Backup databases
make prod-down # Stop
# Maintenance
make reindex # Reindex Elasticsearch
make stats # Resource usage
Create .env file:
# Database
MONGO_ROOT_USER=admin
MONGO_ROOT_PASSWORD=your-secure-password
# Redis
REDIS_PASSWORD=your-secure-password
# Application
SECRET_KEY=your-random-secret-key
FLASK_ENV=production
# Elasticsearch
ELASTICSEARCH_ENABLED=true
ELASTICSEARCH_HOSTS=http://elasticsearch:9200
# Error Tracking (Optional)
ERROR_TRACKING_ENABLED=true
SENTRY_DSN=your-sentry-dsn
- ✅ Separate dev/prod configurations
- ✅ Health checks for all services
- ✅ Multi-stage Dockerfile
- ✅ Network isolation (production)
- ✅ Non-root containers
- ✅ Optimized caching
- ✅ Dependency injection for web views
- ✅ Repository pattern for data access
- ✅ Service layer for business logic
- ✅ Comprehensive error handling
- ✅ Production error tracking (Sentry)
- ✅ Elasticsearch integration
- ✅ Type hints throughout
- ✅ Structured logging
- ✅ Custom exception hierarchy
- ✅ Consistent error responses
- ✅ User-friendly error pages
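The "custom exception hierarchy" and "consistent error responses" items above can be sketched as follows. The class names (`AppError`, `NotFoundError`, `ValidationError`), the `to_error_response` helper, and the response shape are all hypothetical and chosen for illustration; the project's real classes may differ.

```python
from typing import Optional


class AppError(Exception):
    """Base class for application errors (hypothetical name)."""

    status_code = 500
    error_code = "internal_error"

    def __init__(self, message: str, details: Optional[dict] = None) -> None:
        super().__init__(message)
        self.message = message
        self.details = details or {}


class NotFoundError(AppError):
    status_code = 404
    error_code = "not_found"


class ValidationError(AppError):
    status_code = 400
    error_code = "validation_error"


def to_error_response(exc: Exception) -> tuple[dict, int]:
    """Map any exception to one JSON-serializable shape plus an HTTP status."""
    if isinstance(exc, AppError):
        body = {
            "error": {
                "code": exc.error_code,
                "message": exc.message,
                "details": exc.details,
            }
        }
        return body, exc.status_code
    # Unknown exceptions: return a generic message so internals are not leaked.
    body = {
        "error": {
            "code": "internal_error",
            "message": "Internal server error",
            "details": {},
        }
    }
    return body, 500
```

In a Flask application, a function like this could be wired in through `app.register_error_handler`, so that every view returns errors in the same shape.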
- Docker 20.10+
- Docker Compose 2.0+
- 4GB RAM minimum
- 10GB disk space
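The version minimums above ("20.10+", "2.0+") need a numeric comparison, since comparing version strings as text would rank "20.9" above "20.10". A minimal sketch, assuming the installed version is available as text such as the output of `docker --version`; the helper names `parse_version` and `meets_minimum` are made up for this example.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted number in text, e.g. 'v2.24.5' -> (2, 24, 5)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions component by component, padding the shorter with zeros."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

For example, `meets_minimum("Docker version 24.0.7, build afdd53b", "20.10")` evaluates to `True`, while `meets_minimum("20.9.1", "20.10")` evaluates to `False`.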
# Clone repository
git clone <repository-url>
cd poopak
# Start development environment
make dev-up
# Verify services
make health
# Access application
open http://localhost
Check service health:
make health
Or visit: http://localhost:8000/health
Response:
{
"status": "healthy",
"service": "onion-crawler",
"components": {
"mongodb": "healthy",
"redis": "healthy",
"elasticsearch": "healthy"
}
}
make dev-logs # Check logs
make health # Check health
lsof -i :80 # Find what's using port 80
make dev-down
make dev-up
docker exec onion-mongodb-dev mongosh --eval "db.adminCommand('ping')"
application/
├── config/ # Configuration
├── crawler/ # Crawler components
├── infrastructure/ # Database, queue, logging
├── models/ # Data models
├── repositories/ # Data access layer
├── services/ # Business logic
├── utils/ # Utilities
└── web/ # Flask application
    ├── auth/ # Authentication
    ├── dashboard/ # Dashboard views
    ├── scanner/ # Scanner views
    ├── search/ # Search views
    └── templates/ # HTML templates
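The `repositories/` and `services/` directories in this layout reflect a split between data access and business logic. The sketch below shows one way such a split can look, with the repository injected into the service. Every name here (`PageRepository`, `InMemoryPageRepository`, `PageService`, `record_crawl`) is hypothetical, and an in-memory store stands in for MongoDB so the example runs on its own.

```python
from typing import Optional, Protocol


class PageRepository(Protocol):
    """Data-access interface the service depends on (hypothetical)."""

    def get_by_url(self, url: str) -> Optional[dict]: ...

    def save(self, page: dict) -> None: ...


class InMemoryPageRepository:
    """Stand-in implementation; a real one would wrap a database client."""

    def __init__(self) -> None:
        self._pages: dict[str, dict] = {}

    def get_by_url(self, url: str) -> Optional[dict]:
        return self._pages.get(url)

    def save(self, page: dict) -> None:
        self._pages[page["url"]] = page


class PageService:
    """Business logic; receives its repository instead of creating it."""

    def __init__(self, repo: PageRepository) -> None:
        self._repo = repo

    def record_crawl(self, url: str, title: str) -> dict:
        # Load the existing record or start a new one, then update and persist.
        page = self._repo.get_by_url(url) or {"url": url, "crawl_count": 0}
        page["title"] = title
        page["crawl_count"] += 1
        self._repo.save(page)
        return page
```

Because `PageService` only relies on the `PageRepository` interface, tests can pass in the in-memory version while production code passes in a database-backed one.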
# Run tests
docker exec onion-web-app-dev python -m pytest
# Check syntax
python -m py_compile application/**/*.py
Development:
- No passwords required
- All ports exposed
- Debug mode enabled
Production:
- Passwords required (set in .env)
- Minimal port exposure (only nginx)
- Network isolation
- Non-root containers
- Structured logging
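Since production relies on passwords set in `.env`, a small pre-flight check can catch placeholder values before `make prod-up`. This is a sketch under assumptions: the variable names come from the sample `.env` in this README, but treating these three as required and flagging any value that starts with `your-` as a placeholder are choices made for this example, and the function names are hypothetical.

```python
# Assumption: these three secrets must be set for a production deployment.
REQUIRED = ("MONGO_ROOT_PASSWORD", "REDIS_PASSWORD", "SECRET_KEY")
# Assumption: sample values such as "your-secure-password" start with this prefix.
PLACEHOLDER_PREFIX = "your-"


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(env: dict[str, str]) -> list[str]:
    """Return one human-readable message per missing or placeholder secret."""
    problems = []
    for key in REQUIRED:
        value = env.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value.startswith(PLACEHOLDER_PREFIX):
            problems.append(f"{key} still uses a placeholder value")
    return problems
```

Reading the file with `pathlib.Path(".env").read_text()` and passing the result through `parse_env` and `find_problems` yields an empty list when all three secrets have been changed from their sample values.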
# Backup
make backup
# Restore
make restore BACKUP=backups/mongodb-20240101-120000
# Reindex documents
make reindex
# Check cluster health
curl http://localhost:9200/_cluster/health
make stats
This software is made available under the GPL v.3 license. If you run a modified program on a server and let other users communicate with it, your server must also allow them to download the source code.
Looking for open-source developers to work together on PoopakV2. Interested? Contact: yolato@protonmail.com
- Check logs: make dev-logs
- Check health: make health
- Verify setup: ./validate-docker-setup.sh
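When checking health from a script, the JSON returned by the `/health` endpoint documented earlier can be reduced to a list of failing components. A minimal sketch: the source only shows the all-healthy response, so treating any component value other than `"healthy"` as a failure is an assumption, and `unhealthy_components` is a hypothetical helper name.

```python
import json


def unhealthy_components(payload: str) -> list[str]:
    """Return the sorted names of components not reported as 'healthy'."""
    data = json.loads(payload)
    components = data.get("components", {})
    # Assumption: any state other than "healthy" counts as a failure.
    return sorted(name for name, state in components.items() if state != "healthy")
```

The payload can be fetched with `urllib.request.urlopen("http://localhost:8000/health").read().decode()`; an empty result means every listed component reported healthy.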
Ready to start? Run make dev-up and visit http://localhost