mylex/neer

Japanese Real Estate Scraper

A comprehensive web application that scrapes affordable housing information from Japanese real estate websites, translates the content into English, and provides a user-friendly interface for searching and viewing properties.

🏗️ Project Architecture

japanese-real-estate-scraper/
├── backend/                    # Node.js/Express API server
│   ├── src/
│   │   ├── config/             # Configuration management
│   │   ├── database/           # Database models, repositories, migrations
│   │   ├── models/             # Data models and types
│   │   ├── services/           # Business logic services
│   │   │   ├── scraper/        # Web scraping services
│   │   │   ├── translation/    # Translation services
│   │   │   ├── cache/          # Caching services
│   │   │   ├── scheduler/      # Task scheduling
│   │   │   └── monitoring/     # System monitoring
│   │   ├── routes/             # API routes
│   │   ├── middleware/         # Express middleware
│   │   └── utils/              # Utility functions
│   ├── logs/                   # Application logs
│   └── __tests__/              # Test files
├── frontend/                   # React web application
│   ├── src/
│   │   ├── components/         # React components
│   │   ├── pages/              # Page components
│   │   ├── services/           # API service layer
│   │   ├── hooks/              # Custom React hooks
│   │   ├── utils/              # Utility functions
│   │   └── __tests__/          # Test files
│   ├── public/                 # Static assets
│   └── scripts/                # Build optimization scripts
├── shared/                     # Shared utilities and types
│   ├── src/
│   │   ├── types/              # TypeScript type definitions
│   │   └── utils/              # Shared utility functions
├── database/                   # Database schema and migrations
├── scripts/                    # Deployment and utility scripts
├── nginx/                      # Nginx configuration
├── docker-compose.yml          # Docker services configuration
└── package.json                # Root workspace configuration

✨ Features

Core Functionality

  • 🕷️ Web Scraping: Automated scraping of Japanese real estate websites (Suumo, Homes, etc.)
  • 🌐 Translation: Japanese-to-English translation with Google Translate API
  • 🗄️ Database Storage: Structured storage with PostgreSQL and Redis caching
  • 🔍 Advanced Search: Full-text search with filters (price, location, size, type)
  • 📱 Responsive UI: Mobile-friendly React interface with Material-UI
  • ⏰ Scheduled Updates: Automated scraping with cron jobs
  • 📊 Monitoring: Comprehensive logging, metrics, and health checks
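
The search filters listed above (price, location, size, type) can be pictured as a predicate over a property record. The sketch below is illustrative only: the `Property` and `SearchFilters` shapes and the `filterProperties` function are hypothetical names, not taken from the repository, and a real implementation would presumably apply these filters in a PostgreSQL query rather than in memory.

```typescript
// Hypothetical shapes for illustration; the project's own types are
// presumably defined under shared/src/types and may look different.
interface Property {
  id: string;
  titleEn: string;
  priceYen: number;
  prefecture: string;
  sizeSqm: number;
  type: "apartment" | "house" | "land";
}

interface SearchFilters {
  maxPriceYen?: number;
  prefecture?: string;
  minSizeSqm?: number;
  type?: Property["type"];
}

// Returns the properties that satisfy every filter that is set.
// Filters left undefined are ignored.
function filterProperties(items: Property[], f: SearchFilters): Property[] {
  return items.filter(
    (p) =>
      (f.maxPriceYen === undefined || p.priceYen <= f.maxPriceYen) &&
      (f.prefecture === undefined || p.prefecture === f.prefecture) &&
      (f.minSizeSqm === undefined || p.sizeSqm >= f.minSizeSqm) &&
      (f.type === undefined || p.type === f.type)
  );
}
```

For example, `filterProperties(listings, { maxPriceYen: 10_000_000, type: "house" })` would keep only houses priced at or below ten million yen.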

Technical Features

  • 🔄 Error Handling: Graceful degradation and retry mechanisms
  • ⚡ Performance: Optimized with caching, lazy loading, and pagination
  • 🧪 Testing: Comprehensive unit, integration, and E2E tests
  • 🔒 Security: Input validation, rate limiting, and CORS protection
  • 📈 Scalability: Microservices architecture with Docker containers
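
The README does not describe how the retry mechanisms are implemented. One common approach for flaky scraping or translation calls is retrying with exponential backoff, sketched below. The `withRetry` name, its parameters, and the default values are assumptions made for illustration, not the project's actual API.

```typescript
// Retries an async operation with exponential backoff.
// `sleep` is injectable so the delay can be stubbed in tests.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Delays grow as base, 2 * base, 4 * base, ...
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A caller would wrap a single unit of work, for instance `await withRetry(() => fetchListingPage(url))`, where `fetchListingPage` is a hypothetical scraper function.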

🛠️ Technology Stack

Backend

  • Node.js 18+ with Express.js for API services
  • TypeScript for type safety and better development experience
  • PostgreSQL 15+ for primary data storage
  • Redis 7+ for caching and session management
  • Puppeteer for web scraping with stealth plugins
  • Google Translate API for Japanese-English translation
  • Winston for structured logging
  • Jest for testing with Supertest for API testing

Frontend

  • React 18 with TypeScript for type-safe UI development
  • Material-UI (MUI) for consistent component library
  • React Query for efficient data fetching and caching
  • React Router for client-side navigation
  • Axios for HTTP requests with interceptors
  • Jest & React Testing Library for component testing

DevOps & Infrastructure

  • Docker for containerization
  • Docker Compose for local development orchestration
  • Nginx for reverse proxy and static file serving
  • ESLint & Prettier for code quality and formatting
  • GitHub Actions for CI/CD (when configured)

🚀 Quick Start

Prerequisites

  • Node.js 18+ and npm 8+
  • Docker and Docker Compose (recommended)
  • PostgreSQL 15+ and Redis 7+ (if not using Docker)
  • Google Translate API key (for translation features)

1. Clone and Install

# Clone the repository
git clone <repository-url> japanese-real-estate-scraper
cd japanese-real-estate-scraper

# Install all dependencies
npm run install:all

2. Environment Setup

# Backend environment
cp backend/.env.example backend/.env.development
# Edit backend/.env.development with your configuration

# Frontend environment  
cp frontend/.env.example frontend/.env.development
# Edit frontend/.env.development with your configuration

Required Environment Variables:

# Backend (.env.development)
DATABASE_URL=postgresql://username:password@localhost:5432/japanese_real_estate_scraper
REDIS_URL=redis://localhost:6379
GOOGLE_TRANSLATE_API_KEY=your_google_translate_api_key
PORT=3001

# Frontend (.env.development)
REACT_APP_API_BASE_URL=http://localhost:3001
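
Failing fast on a missing backend variable tends to be easier to debug than a connection error later on. The following is a minimal sketch of such a startup check. The variable names come from the list above, but the `AppConfig` shape and the `loadConfig` function are hypothetical and may not match what backend/src/config actually does.

```typescript
// Hypothetical config loader; only the environment variable names are from this README.
interface AppConfig {
  databaseUrl: string;
  redisUrl: string;
  googleTranslateApiKey: string;
  port: number;
}

function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const required = ["DATABASE_URL", "REDIS_URL", "GOOGLE_TRANSLATE_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  // PORT falls back to 3001, the backend port used elsewhere in this README.
  const port = Number(env.PORT ?? "3001");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    databaseUrl: env.DATABASE_URL as string,
    redisUrl: env.REDIS_URL as string,
    googleTranslateApiKey: env.GOOGLE_TRANSLATE_API_KEY as string,
    port,
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.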

3. Database Setup

Option A: Using Docker (Recommended)

# Start database and cache services
docker-compose up -d postgres redis

# Run database migrations
npm run migrate:backend

Option B: Local Services

# Make sure PostgreSQL and Redis are running locally
# Create database
createdb japanese_real_estate_scraper

# Run migrations
npm run migrate:backend

4. Start Development Servers

# Start both frontend and backend in development mode
npm run dev

# Or start individually:
npm run dev:backend  # Backend only (http://localhost:3001)
npm run dev:frontend # Frontend only (http://localhost:3000)

5. Access the Application

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:3001

📋 Available Scripts

Root Level Commands

# Development
npm run dev                 # Start both frontend and backend
npm run dev:backend        # Start backend only
npm run dev:frontend       # Start frontend only

# Installation
npm run install:all        # Install all dependencies
npm run install:backend    # Install backend dependencies
npm run install:frontend   # Install frontend dependencies

# Building
npm run build              # Build all projects
npm run build:backend      # Build backend only
npm run build:frontend     # Build frontend only

# Testing
npm run test               # Run all tests
npm run test:backend       # Run backend tests
npm run test:frontend      # Run frontend tests

# Code Quality
npm run lint               # Lint all projects
npm run lint:fix           # Fix linting issues
npm run clean              # Clean build artifacts

Backend Specific Commands

cd backend

npm run dev                # Start development server
npm run build              # Build TypeScript
npm run start              # Start production server
npm run test               # Run tests
npm run test:watch         # Run tests in watch mode
npm run migrate:run        # Run database migrations
npm run migrate:status     # Check migration status
npm run db:init            # Initialize database

Frontend Specific Commands

cd frontend

npm start                  # Start development server
npm run build              # Build for production
npm run build:optimized    # Build with optimizations
npm run test               # Run tests
npm run analyze            # Analyze bundle size

🐳 Docker Development

Full Stack with Docker Compose

# Start all services (database, cache, backend, frontend)
docker-compose up

# Start only infrastructure services
docker-compose up postgres redis

# Build and start with fresh containers
docker-compose up --build

# Run in background
docker-compose up -d

# View logs
docker-compose logs -f

# Stop all services
docker-compose down

Individual Service Management

# Backend only
docker-compose up backend

# Frontend only  
docker-compose up frontend

# Database and cache only
docker-compose up postgres redis

🧪 Testing

Running Tests

# Run all tests
npm run test

# Run specific test suites
npm run test:backend       # Backend unit & integration tests
npm run test:frontend      # Frontend component tests

# Run with coverage
# Each command runs in a subshell, so both can be run from the repository root
(cd backend && npm run test -- --coverage)
(cd frontend && npm run test -- --coverage --watchAll=false)

Test Types

  • Unit Tests: Individual component/function testing
  • Integration Tests: API endpoint and service integration
  • E2E Tests: Complete user workflow testing
  • Performance Tests: Load and stress testing

Custom Test Scripts

# Performance testing
node scripts/performance-test.js

# System integration validation
node scripts/validate-system.js

# Integration testing
node scripts/integration-test.js

📊 Monitoring & Debugging

Health Checks

Run npm run health from the repository root to check overall system health.

Logging

  • Backend Logs: backend/logs/
  • Error Logs: backend/logs/error.log
  • Access Logs: Console output in development

Performance Monitoring

# Run performance tests
node scripts/performance-test.js

# Analyze frontend bundle
cd frontend && npm run analyze

# Monitor backend metrics
curl http://localhost:3001/api/metrics

🔧 Configuration

Environment Files

  • Development: .env.development
  • Production: .env.production
  • Testing: .env.test
  • Examples: .env.example

Key Configuration Options

Backend Configuration:

  • Database connection settings
  • Redis cache configuration
  • Translation API keys
  • Scraping parameters (delays, concurrency)
  • Logging levels and file paths
  • Scheduler settings (cron expressions)
  • Security settings (CORS, rate limiting)
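
To show how the scraping parameters (delays, concurrency) might interact, here is a small sketch of a throttled task runner. It is not the project's scraper: `runThrottled`, `concurrency`, and `delayMs` are hypothetical names chosen for this example.

```typescript
// Runs `worker` over `items` with at most `concurrency` tasks in flight,
// pausing `delayMs` after each task before that lane picks up the next item.
async function runThrottled<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  concurrency: number,
  delayMs: number
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  async function lane(): Promise<void> {
    while (next < items.length) {
      // Claim the next index synchronously, so no two lanes take the same item.
      const index = next++;
      results[index] = await worker(items[index]);
      if (delayMs > 0) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }

  // Always start at least one lane, and never more lanes than items.
  const laneCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Results come back in input order regardless of which lane finishes first. A lower `concurrency` and a higher `delayMs` reduce the load placed on the scraped site.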

Frontend Configuration:

  • API base URL
  • Feature flags
  • UI settings (pagination, search limits)
  • External service keys (Google Maps)

🚀 Deployment

Production Build

# Build all projects for production
npm run build

# Build optimized frontend
cd frontend && npm run build:optimized

Docker Production

# Build production images
docker-compose -f docker-compose.prod.yml build

# Start production stack
docker-compose -f docker-compose.prod.yml up -d

Environment Setup

# Copy production environment template
cp .env.production.template .env.production
# Edit with production values

# Run database migrations
NODE_ENV=production npm run migrate:backend

For detailed deployment instructions, see DEPLOYMENT.md.

📈 Project Status

✅ Completed Features

  • Project structure and configuration
  • Database schema and migrations
  • Basic API endpoints (properties, search, health)
  • Frontend components and routing
  • Web scraping infrastructure
  • Translation services
  • Caching layer with Redis
  • Error handling and monitoring
  • Comprehensive testing suite
  • Docker containerization
  • Performance optimizations

🚧 In Progress

  • Production deployment automation
  • Advanced search features
  • User authentication (if required)
  • Real-time notifications
  • Analytics dashboard

📋 Task Tracking

Detailed task progress is tracked in .kiro/specs/japanese-real-estate-scraper/tasks.md.

🀝 Contributing

Development Workflow

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Follow the existing code style and linting rules
  4. Write tests for new functionality
  5. Update documentation as needed
  6. Commit changes using a Conventional Commits message (git commit -m 'feat: add amazing feature')
  7. Push to branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

Code Standards

  • TypeScript for type safety
  • ESLint for code quality
  • Prettier for formatting
  • Jest for testing
  • Conventional Commits for commit messages
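
As a rough illustration of the Conventional Commits header format (`type(scope): description`), the sketch below checks the first line of a commit message. It is deliberately simplified. The list of accepted types is a widely used convention rather than something this project specifies, and `isConventionalCommit` is a hypothetical helper.

```typescript
// Simplified check for a Conventional Commits header: type(scope)!: description
// The accepted types are an assumption; adjust them to the project's own rules.
const COMMIT_TYPES = [
  "feat", "fix", "docs", "style", "refactor",
  "perf", "test", "build", "ci", "chore", "revert",
];

const HEADER_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([\\w.-]+\\))?!?: \\S.*$`
);

// Only the first line (the header) of the message is validated.
function isConventionalCommit(message: string): boolean {
  const header = message.split("\n")[0];
  return HEADER_PATTERN.test(header);
}
```

Under these assumptions, `feat(scraper): add Suumo parser` passes, while a message without a type prefix does not.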

Testing Requirements

  • Unit tests for new functions/components
  • Integration tests for API endpoints
  • E2E tests for user workflows
  • Maintain >80% code coverage
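
One way to enforce the coverage target automatically is Jest's `coverageThreshold` option, which fails the test run when coverage falls below the configured percentages. The fragment below is a sketch of a `jest.config.ts`. Whether the project already configures this, and which metrics it tracks, is not stated in this README. Jest treats each value as a minimum, so 80 here means "at least 80%".

```typescript
// jest.config.ts (sketch): fail the run when global coverage drops below 80%.
// Applying the same threshold to all four metrics is an assumption.
const jestConfig = {
  collectCoverage: true,
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
  },
};

export default jestConfig;
```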

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

Quick Setup (Recommended)

# Automated setup script
npm run setup:dev

# Or manual setup
npm run install:all
cp backend/.env.example backend/.env.development
cp frontend/.env.example frontend/.env.development
# Edit environment files, then:
npm run docker:up postgres redis
npm run migrate:backend
npm run dev

Health Checks & Diagnostics

# Check system health
npm run health

# Run integration tests
npm run test:integration

# Validate complete system
npm run test:system

Common Issues

For detailed troubleshooting, see TROUBLESHOOTING.md.

  1. Port conflicts: Change ports in environment files
  2. Database connection: Run npm run health to diagnose
  3. Translation API: Verify Google Translate API key is valid
  4. Docker issues: Ensure Docker Desktop is running

Getting Help

  • Run diagnostics: npm run health
  • Check troubleshooting guide: TROUBLESHOOTING.md
  • Review logs in backend/logs/ for error details
  • Check the Issues page for known problems

Performance Optimization

  • Enable Redis caching for better response times
  • Use Docker for consistent development environment
  • Monitor metrics at /api/metrics endpoint
  • Run performance tests regularly
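
The benefit of caching comes from skipping repeated expensive work, such as database queries or translation calls. A minimal cache-aside sketch with a time-to-live follows. An in-memory `Map` stands in for Redis so the example runs on its own, and `TtlCache` and `getOrLoad` are hypothetical names rather than the project's cache service.

```typescript
// Minimal cache-aside helper with a time-to-live (TTL).
// An in-memory Map stands in for Redis in this sketch.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > this.now()) {
      return hit.value; // fresh hit: skip the expensive load
    }
    const value = await load(); // miss or expired entry: load and cache
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

A search endpoint could then wrap its query, for instance `cache.getOrLoad(cacheKey, () => runSearch(filters))`, where `runSearch` is a placeholder for the real lookup.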

Built with ❤️ for the Japanese real estate market
