A comprehensive web application that scrapes affordable housing information from Japanese real estate websites, translates the content into English, and provides a user-friendly interface for searching and viewing properties.
japanese-real-estate-scraper/
├── backend/                     # Node.js/Express API server
│   ├── src/
│   │   ├── config/              # Configuration management
│   │   ├── database/            # Database models, repositories, migrations
│   │   ├── models/              # Data models and types
│   │   ├── services/            # Business logic services
│   │   │   ├── scraper/         # Web scraping services
│   │   │   ├── translation/     # Translation services
│   │   │   ├── cache/           # Caching services
│   │   │   ├── scheduler/       # Task scheduling
│   │   │   └── monitoring/      # System monitoring
│   │   ├── routes/              # API routes
│   │   ├── middleware/          # Express middleware
│   │   └── utils/               # Utility functions
│   ├── logs/                    # Application logs
│   └── __tests__/               # Test files
├── frontend/                    # React web application
│   ├── src/
│   │   ├── components/          # React components
│   │   ├── pages/               # Page components
│   │   ├── services/            # API service layer
│   │   ├── hooks/               # Custom React hooks
│   │   ├── utils/               # Utility functions
│   │   └── __tests__/           # Test files
│   ├── public/                  # Static assets
│   └── scripts/                 # Build optimization scripts
├── shared/                      # Shared utilities and types
│   └── src/
│       ├── types/               # TypeScript type definitions
│       └── utils/               # Shared utility functions
├── database/                    # Database schema and migrations
├── scripts/                     # Deployment and utility scripts
├── nginx/                       # Nginx configuration
├── docker-compose.yml           # Docker services configuration
└── package.json                 # Root workspace configuration
- Web Scraping: Automated scraping of Japanese real estate websites (Suumo, Homes, etc.)
- Translation: Japanese-to-English translation with Google Translate API
- Database Storage: Structured storage with PostgreSQL and Redis caching
- Advanced Search: Full-text search with filters (price, location, size, type)
- Responsive UI: Mobile-friendly React interface with Material-UI
- Scheduled Updates: Automated scraping with cron jobs
- Monitoring: Comprehensive logging, metrics, and health checks
- Error Handling: Graceful degradation and retry mechanisms
- Performance: Optimized with caching, lazy loading, and pagination
- Testing: Comprehensive unit, integration, and E2E tests
- Security: Input validation, rate limiting, and CORS protection
- Scalability: Microservices architecture with Docker containers
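The retry mechanisms listed under Error Handling could look roughly like the following. This is a minimal sketch of retry with exponential backoff, not the project's actual implementation: the names `backoffDelayMs` and `withRetry`, and the default delays, are assumptions made for illustration.

```typescript
// Hypothetical helpers; names and defaults are not taken from the project source.

/** Delay before retry number `attempt` (1-based): baseMs * 2^(attempt - 1), capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));
}

/** Resolves after `ms` milliseconds. */
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Runs `task` and retries on any thrown error, up to `maxAttempts` attempts in total.
 * Rethrows the last error if every attempt fails.
 */
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < maxAttempts) {
        await sleep(backoffDelayMs(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

A scraper or translation call would be wrapped as `withRetry(() => fetchListingPage(url))`, where `fetchListingPage` stands in for whatever function performs the request.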
- Node.js 18+ with Express.js for API services
- TypeScript for type safety and better development experience
- PostgreSQL 15+ for primary data storage
- Redis 7+ for caching and session management
- Puppeteer for web scraping with stealth plugins
- Google Translate API for Japanese-English translation
- Winston for structured logging
- Jest for testing with Supertest for API testing
- React 18 with TypeScript for type-safe UI development
- Material-UI (MUI) for consistent component library
- React Query for efficient data fetching and caching
- React Router for client-side navigation
- Axios for HTTP requests with interceptors
- Jest & React Testing Library for component testing
- Docker for containerization
- Docker Compose for local development orchestration
- Nginx for reverse proxy and static file serving
- ESLint & Prettier for code quality and formatting
- GitHub Actions for CI/CD (when configured)
- Node.js 18+ and npm 8+
- Docker and Docker Compose (recommended)
- PostgreSQL 15+ and Redis 7+ (if not using Docker)
- Google Translate API key (for translation features)
# Clone the repository
git clone <repository-url>
cd japanese-real-estate-scraper
# Install all dependencies
npm run install:all
# Backend environment
cp backend/.env.example backend/.env.development
# Edit backend/.env.development with your configuration
# Frontend environment
cp frontend/.env.example frontend/.env.development
# Edit frontend/.env.development with your configuration
Required Environment Variables:
# Backend (.env.development)
DATABASE_URL=postgresql://username:password@localhost:5432/japanese_real_estate_scraper
REDIS_URL=redis://localhost:6379
GOOGLE_TRANSLATE_API_KEY=your_google_translate_api_key
PORT=3001
# Frontend (.env.development)
REACT_APP_API_BASE_URL=http://localhost:3001
Option A: Using Docker (Recommended)
# Start database and cache services
docker-compose up -d postgres redis
# Run database migrations
npm run migrate:backend
Option B: Local Services
# Make sure PostgreSQL and Redis are running locally
# Create database
createdb japanese_real_estate_scraper
# Run migrations
npm run migrate:backend
# Start both frontend and backend in development mode
npm run dev
# Or start individually:
npm run dev:backend # Backend only (http://localhost:3001)
npm run dev:frontend # Frontend only (http://localhost:3000)
- Frontend: http://localhost:3000
- Backend API: http://localhost:3001
- Health Check: http://localhost:3001/health
- API Metrics: http://localhost:3001/api/metrics
# Development
npm run dev # Start both frontend and backend
npm run dev:backend # Start backend only
npm run dev:frontend # Start frontend only
# Installation
npm run install:all # Install all dependencies
npm run install:backend # Install backend dependencies
npm run install:frontend # Install frontend dependencies
# Building
npm run build # Build all projects
npm run build:backend # Build backend only
npm run build:frontend # Build frontend only
# Testing
npm run test # Run all tests
npm run test:backend # Run backend tests
npm run test:frontend # Run frontend tests
# Code Quality
npm run lint # Lint all projects
npm run lint:fix # Fix linting issues
npm run clean # Clean build artifacts
cd backend
npm run dev # Start development server
npm run build # Build TypeScript
npm run start # Start production server
npm run test # Run tests
npm run test:watch # Run tests in watch mode
npm run migrate:run # Run database migrations
npm run migrate:status # Check migration status
npm run db:init # Initialize database
cd frontend
npm start # Start development server
npm run build # Build for production
npm run build:optimized # Build with optimizations
npm run test # Run tests
npm run analyze # Analyze bundle size
# Start all services (database, cache, backend, frontend)
docker-compose up
# Start only infrastructure services
docker-compose up postgres redis
# Build and start with fresh containers
docker-compose up --build
# Run in background
docker-compose up -d
# View logs
docker-compose logs -f
# Stop all services
docker-compose down
# Backend only
docker-compose up backend
# Frontend only
docker-compose up frontend
# Database and cache only
docker-compose up postgres redis
# Run all tests
npm run test
# Run specific test suites
npm run test:backend # Backend unit & integration tests
npm run test:frontend # Frontend component tests
# Run with coverage
cd backend && npm run test -- --coverage
cd frontend && npm run test -- --coverage --watchAll=false
- Unit Tests: Individual component/function testing
- Integration Tests: API endpoint and service integration
- E2E Tests: Complete user workflow testing
- Performance Tests: Load and stress testing
# Performance testing
node scripts/performance-test.js
# System integration validation
node scripts/validate-system.js
# Integration testing
node scripts/integration-test.js
- Backend Health: http://localhost:3001/health
- System Metrics: http://localhost:3001/api/metrics
- Database Status: Included in health check response
- Backend Logs: backend/logs/
- Error Logs: backend/logs/error.log
- Access Logs: Console output in development
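The health check listed above is described as including database status in its response. The README does not give the response shape, so the following is only a sketch of how per-dependency checks might be rolled up into one report. The `HealthReport` fields, the `buildHealthReport` name, and the rule that a failed database check makes the whole service unhealthy are all assumptions.

```typescript
// Hypothetical response shape; the real /health payload may differ.
type CheckStatus = "up" | "down";

interface HealthReport {
  status: "healthy" | "degraded" | "unhealthy";
  timestamp: string;
  checks: Record<string, CheckStatus>;
}

/**
 * Combines individual dependency checks into an overall status:
 * - "unhealthy" if any critical dependency is down,
 * - "degraded" if only non-critical dependencies are down,
 * - "healthy" otherwise.
 */
function buildHealthReport(
  checks: Record<string, CheckStatus>,
  critical: string[] = ["database"],
): HealthReport {
  const down = Object.keys(checks).filter((name) => checks[name] === "down");
  let status: HealthReport["status"] = "healthy";
  if (down.some((name) => critical.indexOf(name) !== -1)) {
    status = "unhealthy";
  } else if (down.length > 0) {
    status = "degraded";
  }
  return { status, timestamp: new Date().toISOString(), checks };
}
```

Treating Redis as non-critical here mirrors the "graceful degradation" idea: the API can still answer from PostgreSQL when the cache is unavailable.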
# Run performance tests
node scripts/performance-test.js
# Analyze frontend bundle
cd frontend && npm run analyze
# Monitor backend metrics
curl http://localhost:3001/api/metrics
- Development: .env.development
- Production: .env.production
- Testing: .env.test
- Examples: .env.example
Backend Configuration:
- Database connection settings
- Redis cache configuration
- Translation API keys
- Scraping parameters (delays, concurrency)
- Logging levels and file paths
- Scheduler settings (cron expressions)
- Security settings (CORS, rate limiting)
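One way to turn these settings into a typed object at startup is sketched below. The variable names `DATABASE_URL`, `REDIS_URL`, `GOOGLE_TRANSLATE_API_KEY`, and `PORT` come from the environment section of this README. The `BackendConfig` shape, the `loadBackendConfig` name, and the fallback to port 3001 are assumptions for illustration.

```typescript
// Hypothetical config shape; only the environment variable names come from the README.
interface BackendConfig {
  databaseUrl: string;
  redisUrl: string;
  googleTranslateApiKey: string;
  port: number;
}

/** Validates the environment and fails fast with a message naming every missing variable. */
function loadBackendConfig(env: Record<string, string | undefined>): BackendConfig {
  const required = ["DATABASE_URL", "REDIS_URL", "GOOGLE_TRANSLATE_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // Assumption: fall back to 3001, the port used throughout this README.
  const port = Number(env.PORT ?? "3001");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  return {
    databaseUrl: env.DATABASE_URL as string,
    redisUrl: env.REDIS_URL as string,
    googleTranslateApiKey: env.GOOGLE_TRANSLATE_API_KEY as string,
    port,
  };
}
```

In a Node.js entry point this would be called once as `loadBackendConfig(process.env)`, so a misconfigured environment stops the server before it accepts requests.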
Frontend Configuration:
- API base URL
- Feature flags
- UI settings (pagination, search limits)
- External service keys (Google Maps)
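As an illustration of how the API base URL and pagination settings might be used from the frontend service layer, here is a sketch of a search URL builder. The filters match the ones named in the feature list (price, location, size, type). The `/api/properties/search` path, the query parameter names, and `buildSearchUrl` itself are hypothetical and not confirmed by this README.

```typescript
// Hypothetical filter and parameter names; the real API contract may differ.
interface PropertySearchFilters {
  location?: string;
  type?: string;
  minPrice?: number;
  maxPrice?: number;
  minSize?: number;
  page?: number;
  limit?: number;
}

/** Builds a search URL, skipping unset filters and percent-encoding every value. */
function buildSearchUrl(apiBaseUrl: string, filters: PropertySearchFilters): string {
  const query = Object.entries(filters)
    .filter(([, value]) => value !== undefined && value !== "")
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join("&");
  // Drop trailing slashes so the base URL and path join with exactly one "/".
  const path = `${apiBaseUrl.replace(/\/+$/, "")}/api/properties/search`;
  return query ? `${path}?${query}` : path;
}
```

With Create React App, the base URL would typically be read from `process.env.REACT_APP_API_BASE_URL`. Encoding matters here because location filters may contain Japanese text.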
# Build all projects for production
npm run build
# Build optimized frontend
cd frontend && npm run build:optimized
# Build production images
docker-compose -f docker-compose.prod.yml build
# Start production stack
docker-compose -f docker-compose.prod.yml up -d
# Copy production environment template
cp .env.production.template .env.production
# Edit with production values
# Run database migrations
NODE_ENV=production npm run migrate:backend
For detailed deployment instructions, see DEPLOYMENT.md.
- Project structure and configuration
- Database schema and migrations
- Basic API endpoints (properties, search, health)
- Frontend components and routing
- Web scraping infrastructure
- Translation services
- Caching layer with Redis
- Error handling and monitoring
- Comprehensive testing suite
- Docker containerization
- Performance optimizations
- Production deployment automation
- Advanced search features
- User authentication (if required)
- Real-time notifications
- Analytics dashboard
Detailed task progress is tracked in .kiro/specs/japanese-real-estate-scraper/tasks.md
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Follow the existing code style and linting rules
- Write tests for new functionality
- Update documentation as needed
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
- TypeScript for type safety
- ESLint for code quality
- Prettier for formatting
- Jest for testing
- Conventional Commits for commit messages
- Unit tests for new functions/components
- Integration tests for API endpoints
- E2E tests for user workflows
- Maintain >80% code coverage
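The coverage target can be enforced automatically rather than checked by hand. The sketch below uses Jest's `coverageThreshold` option. It assumes the workspace reads a `jest.config.ts` file and keeps its sources under `src/`, neither of which is confirmed by this README. Jest treats each threshold as a minimum percentage, so a value of 80 fails the run only when coverage drops below 80%.

```typescript
// Sketch of a jest.config.ts for one workspace (backend or frontend).
// The collectCoverageFrom glob is an assumption about where the sources live.
const config = {
  collectCoverageFrom: ["src/**/*.{ts,tsx}"],
  // Jest fails a --coverage run when any global metric is below these percentages.
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
  },
};

export default config;
```

The thresholds take effect when tests run with coverage enabled, for example `npm run test -- --coverage`.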
This project is licensed under the MIT License - see the LICENSE file for details.
# Automated setup script
npm run setup:dev
# Or manual setup
npm run install:all
cp backend/.env.example backend/.env.development
cp frontend/.env.example frontend/.env.development
# Edit environment files, then:
npm run docker:up postgres redis
npm run migrate:backend
npm run dev
# Check system health
npm run health
# Run integration tests
npm run test:integration
# Validate complete system
npm run test:system
For detailed troubleshooting, see TROUBLESHOOTING.md.
- Port conflicts: Change ports in environment files
- Database connection: Run npm run health to diagnose
- Translation API: Verify Google Translate API key is valid
- Docker issues: Ensure Docker Desktop is running
- Run diagnostics: npm run health
- Check troubleshooting guide: TROUBLESHOOTING.md
- Review logs in backend/logs/ for error details
- Check the Issues page for known problems
- Enable Redis caching for better response times
- Use Docker for consistent development environment
- Monitor metrics at the /api/metrics endpoint
- Run performance tests regularly
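To show why caching helps response times, here is a small cache-aside sketch with a time-to-live. It uses an in-memory `Map` as a stand-in for Redis so that it runs without any services. The `TtlCache` class and its methods are hypothetical and do not reflect the project's cache service.

```typescript
// In-memory stand-in for Redis; class and method names are hypothetical.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  /** `now` is injectable so expiry can be tested without waiting. */
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  /** Returns the cached value, or undefined if it is absent or expired. */
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  /** Stores a value that expires `ttlMs` milliseconds from now. */
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  /** Cache-aside read: return the cached value, or load it, store it, and return it. */
  async getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const cached = this.get(key);
    if (cached !== undefined) return cached;
    const value = await load();
    this.set(key, value);
    return value;
  }
}
```

Translated listing text is a natural fit for this pattern: a repeated request for the same Japanese string can be answered from the cache instead of triggering another translation API call.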
Built with ❤️ for the Japanese real estate market