grey-box/symmetry-project

Project-Symmetry: Cross-Language Wikipedia Article Semantic Analysis Tool

A modern semantic translator tool designed to translate, compare, and evaluate the semantic similarity of Wikipedia content across different languages

🚀 Quick Start

Prerequisites

Installation

Backend

# Navigate to backend
cd symmetry-unified-backend

# Quick start (recommended)
./start.sh

# This will:
# 1. Create virtual environment if needed
# 2. Install dependencies
# 3. Start server at http://127.0.0.1:8000

Access interactive API documentation at: http://127.0.0.1:8000/docs

Frontend

# Navigate to frontend
cd desktop-electron-frontend

# Install dependencies
yarn install

# Run development
yarn start

📖 Project Overview

Project Symmetry uses AI to accelerate Wikipedia's translation efforts in less-represented languages (< 1M articles) by analyzing semantic gaps between articles in different languages and providing targeted translations.

The application helps identify critical information lost or added during translation. This is useful for scenarios without internet access and for content such as medical documents, government communications, and NGO materials.

The project currently focuses on Wikipedia content. Planned expansion covers other internet content and AI-powered translation for underrepresented languages.

📊 Features

  • 🌍 Wikipedia Translation: Translate articles between languages
  • 🔍 Semantic Comparison: Identify gaps and additions in translations using AI models
  • 📊 Gap Analysis: Detect missing/extra information with color-coded results
  • 🎯 Language Support: Focus on underrepresented languages
  • 📝 Structured Articles: Section-by-section content with citations and references
  • 🤖 AI-Powered: Semantic understanding with multilingual embedding models such as LaBSE and XLM-RoBERTa, plus LLM-based comparison
  • 📈 Analytics: Translation quality metrics and structural analysis
  • 🧪 Testing: Backend test suite of 60 tests with a 97% pass rate
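The gap-detection idea behind Semantic Comparison and Gap Analysis can be illustrated with a minimal sketch. It is not the project's implementation: it assumes each sentence has already been turned into an embedding vector (in practice by a model such as LaBSE), and the function names and the 0.75 threshold are hypothetical. A source sentence counts as a gap when no target sentence is similar enough to it.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def find_gaps(
    source_vectors: Sequence[Sequence[float]],
    target_vectors: Sequence[Sequence[float]],
    threshold: float = 0.75,  # hypothetical cut-off, tune per model
) -> List[int]:
    """Return indices of source sentences with no close match in the target."""
    gaps = []
    for index, source in enumerate(source_vectors):
        best = max(
            (cosine_similarity(source, target) for target in target_vectors),
            default=0.0,
        )
        if best < threshold:
            gaps.append(index)
    return gaps


# Toy 2-D "embeddings": the second source sentence has no counterpart.
source = [[1.0, 0.0], [0.0, 1.0]]
target = [[0.9, 0.1]]
print(find_gaps(source, target))  # [1]
```

Swapping the two arguments reports additions instead of gaps, which is one way the color-coded missing/extra view could be derived.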

🏗️ Project Structure

symmetry-project-202512/
├── symmetry-unified-backend/   # FastAPI backend
│   ├── app/
│   │   ├── ai/               # AI and ML components
│   │   │   ├── semantic_comparison.py
│   │   │   ├── llm_comparison.py
│   │   │   └── translations.py
│   │   ├── models/           # Pydantic v2 models
│   │   ├── routers/          # API route handlers
│   │   │   ├── wiki_articles.py
│   │   │   ├── structured_wiki.py
│   │   │   ├── comparison.py
│   │   │   └── structural_analysis.py
│   │   ├── services/         # Business logic
│   │   │   ├── article_parser.py
│   │   │   ├── cache.py
│   │   │   └── wiki_utils.py
│   │   ├── prompts/          # LLM prompts
│   │   └── main.py
│   ├── tests/                # Test suite
│   └── requirements.txt
├── desktop-electron-frontend/ # Electron + React frontend
│   ├── src/
│   │   ├── components/       # React components
│   │   ├── services/         # API services
│   │   ├── models/           # TypeScript interfaces
│   │   ├── constants/        # Application constants
│   │   ├── context/          # React context
│   │   └── pages/            # Page components
│   └── package.json
└── README.md

🔧 Installation Guide

System Requirements

  • Operating System: Windows 10+, Ubuntu 20.04+, or macOS 10.15+
  • Memory: Minimum 4GB RAM (8GB recommended)
  • Storage: Minimum 2GB free space
  • Internet Connection: Required for downloading dependencies

Software Requirements

  • Node.js: Version 18.0 or higher (LTS recommended)
  • Python: Version 3.8 - 3.11 (NLP library requirements prevent 3.12)
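  • npm: Version 8.0 or higher (comes with Node.js)
  • Yarn: Needed for the frontend commands in this README (yarn install, yarn start, yarn test, yarn package)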
  • Git: Latest version
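Because the supported Python range is bounded on both sides, it can help to check the interpreter before creating the virtual environment. The following is a small sketch of such a check; the constant and function names are illustrative and are not part of the repository.

```python
import sys
from typing import Tuple

MIN_PYTHON = (3, 8)   # lowest version listed above
MAX_PYTHON = (3, 11)  # highest version listed above


def python_supported(version: Tuple[int, int]) -> bool:
    """Return True when (major, minor) falls inside the supported range."""
    return MIN_PYTHON <= version <= MAX_PYTHON


current = (sys.version_info.major, sys.version_info.minor)
if not python_supported(current):
    print(f"Python {current[0]}.{current[1]} is outside the supported 3.8 - 3.11 range")
```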

Manual Installation

Backend

cd symmetry-unified-backend

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.template .env
# Edit .env as needed (LOG_LEVEL, FASTAPI_DEBUG)
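How the backend consumes these two variables is not documented here, so the following is only a sketch of one common pattern: reading them from the environment with fallbacks. The default values and the load_settings helper are assumptions, not the project's actual configuration code.

```python
import os
from typing import Dict, Mapping, Union


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Union[str, bool]]:
    """Read LOG_LEVEL and FASTAPI_DEBUG, falling back to assumed defaults."""
    log_level = env.get("LOG_LEVEL", "INFO").upper()  # default is an assumption
    debug_raw = env.get("FASTAPI_DEBUG", "false")  # default is an assumption
    debug = debug_raw.strip().lower() in {"1", "true", "yes", "on"}
    return {"log_level": log_level, "debug": debug}


print(load_settings({"LOG_LEVEL": "debug", "FASTAPI_DEBUG": "True"}))
# {'log_level': 'DEBUG', 'debug': True}
```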

Frontend

cd desktop-electron-frontend

# Install dependencies
yarn install

# Verify installation
yarn list --depth=0

Running the Application

Development Mode

# Backend (with hot reload)
cd symmetry-unified-backend
source venv/bin/activate
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

# Frontend (in a second terminal, from the repository root)
cd desktop-electron-frontend
yarn start

Production Mode

# Backend
cd symmetry-unified-backend
source venv/bin/activate
uvicorn app.main:app --host 127.0.0.1 --port 8000 --workers 4

# Frontend - package for distribution (run from the repository root)
cd desktop-electron-frontend
yarn package

🤝 Contributing

Thank you for your interest in contributing to Project Symmetry!

Getting Started

  1. Fork the repository
  2. Create a feature branch
  3. Install dependencies (see Installation Guide)
  4. Make your changes
  5. Run tests
  6. Submit a pull request

Code Standards

Python Backend

  • Follow PEP 8 style guidelines, with an 88-character maximum line length (Black's default, rather than PEP 8's 79)
  • 4 spaces for indentation
  • Type hints for all functions
  • Docstrings for public functions
  • PEP 8 naming conventions (snake_case)
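As an illustration of these rules taken together, here is a short function written to the standard: snake_case name, 4-space indentation, type hints, and a docstring. The helper itself is hypothetical and deliberately naive; it is not taken from the codebase.

```python
from typing import List


def split_into_sentences(text: str) -> List[str]:
    """Split text into sentences on full stops.

    A deliberately naive helper used only to illustrate the style rules.
    """
    parts = [part.strip() for part in text.split(".")]
    return [part for part in parts if part]


print(split_into_sentences("First sentence. Second sentence."))
# ['First sentence', 'Second sentence']
```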

JavaScript/TypeScript Frontend

  • ESLint and Prettier configuration
  • 2 spaces for indentation
  • Components: PascalCase
  • Services/Utilities: camelCase
  • Types: PascalCase

Testing

Backend

cd symmetry-unified-backend
source venv/bin/activate
pytest

Test status: 60 tests, 58 passing (97% pass rate)

Frontend

cd desktop-electron-frontend
yarn test

Development Workflow

# Always pull latest changes (assumes a remote named "upstream" that points at the main repository)
git fetch upstream
git rebase upstream/main

# Create feature branch
git checkout -b feature/your-feature-name

# Make changes and test
cd symmetry-unified-backend && pytest
cd ../desktop-electron-frontend && yarn test

# Commit and push
git add .
git commit -m "feat: description of changes"
git push origin feature/your-feature-name

# Create pull request
gh pr create --title "Feature Title" --body "Description..."

Pull Request Checklist

  • Code follows project standards
  • Tests pass (backend and frontend)
  • Documentation updated
  • No breaking changes (unless documented)
  • Commit messages follow conventional format
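The conventional commit format can be checked mechanically before pushing. Below is a sketch of such a check; the list of accepted types follows the commonly used Conventional Commits set and is an assumption, since the project does not state which types it accepts.

```python
import re

# Common Conventional Commits types; the project's accepted set may differ.
COMMIT_PATTERN = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([\w\-./]+\))?!?: \S.*$"
)


def is_conventional(subject: str) -> bool:
    """Return True when a commit subject line matches the pattern above."""
    return COMMIT_PATTERN.match(subject) is not None


print(is_conventional("feat: description of changes"))  # True
print(is_conventional("Update readme"))  # False
```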

📚 Documentation

Main Documentation

API Documentation

Interactive API documentation available at http://127.0.0.1:8000/docs when backend is running.

Key API Endpoints

Wiki Articles

  • GET /symmetry/v1/wiki/articles - Fetch Wikipedia article
  • GET /symmetry/v1/wiki/structured-article - Get structured article with sections, citations, references
  • GET /symmetry/v1/wiki/structured-section - Get specific section with metadata
  • GET /symmetry/v1/wiki/citation-analysis - Analyze citations
  • GET /symmetry/v1/wiki/reference-analysis - Analyze references
  • GET /wiki_translate/source_article - Get translated article
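The GET endpoints above take their inputs as query parameters. The sketch below shows how a request URL could be assembled with the standard library. The parameter names title and lang are hypothetical placeholders; the interactive documentation at /docs lists the real ones.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8000"  # local backend address used in this README


def build_get_url(path: str, **params: str) -> str:
    """Join the base URL, an endpoint path, and URL-encoded query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


# "title" and "lang" are hypothetical parameter names; check /docs for the real ones.
url = build_get_url("/symmetry/v1/wiki/articles", title="Barack Obama", lang="en")
print(url)
# http://127.0.0.1:8000/symmetry/v1/wiki/articles?title=Barack+Obama&lang=en
```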

Comparison

  • POST /symmetry/v1/articles/compare - Compare two articles (traditional semantic)
  • GET /symmetry/v1/comparison/llm - LLM comparison (GET)
  • POST /symmetry/v1/comparison/llm - LLM comparison (POST)
  • GET /symmetry/v1/comparison/semantic - Semantic comparison (GET)
  • POST /symmetry/v1/comparison/semantic - Semantic comparison (POST)
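The POST variants accept a request body. As a rough sketch, a JSON request for the compare endpoint could be prepared as follows. The field names text_a and text_b are hypothetical, since the request schema is not described here; the generated documentation at /docs is the authority.

```python
import json
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8000"  # local backend address used in this README


def build_compare_request(payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for the compare endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        f"{BASE_URL}/symmetry/v1/articles/compare",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Field names below are hypothetical; the real schema is listed under /docs.
request = build_compare_request({"text_a": "First article", "text_b": "Second article"})
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen sends it once the backend is running.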

Structural Analysis

  • GET /operations/{source_language}/{title} - Analyze article across 6 languages with quality scoring
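Article titles often contain spaces or non-ASCII characters, so the {source_language} and {title} path segments should be percent-encoded before the request is made. A minimal sketch, with a hypothetical helper name:

```python
from urllib.parse import quote


def operations_path(source_language: str, title: str) -> str:
    """Build the structural-analysis path, percent-encoding both segments."""
    return f"/operations/{quote(source_language, safe='')}/{quote(title, safe='')}"


print(operations_path("en", "Barack Obama"))  # /operations/en/Barack%20Obama
print(operations_path("de", "Zürich"))  # /operations/de/Z%C3%BCrich
```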

🧪 Testing

Backend Testing

Run all tests:

cd symmetry-unified-backend
pytest

Run specific test file:

pytest tests/test_wiki_articles.py

Run with coverage:

pytest --cov=app --cov-report=html

Run with verbose output:

pytest -v

Test Coverage

  • Total Tests: 60
  • Passing: 58 (97%)
  • Test Categories: Wiki articles, comparison, structured wiki, structural analysis
  • Test Time: ~0.07s

Test Data

Sample article texts in tests/data/:

  • obama_A.txt and obama_B.txt: For comparison tests
  • missingno_en.txt and missingno_fr.txt: Multi-language tests

🎓 Learning Resources

Prerequisites

Git and GitHub

Python Development

JavaScript/TypeScript Development

Recommended Learning Path

  1. Week 1: Set up development environment and understand project structure
  2. Week 2: Study the existing codebase and run the application
  3. Week 3: Make small contributions (documentation, bug fixes)
  4. Week 4: Work on a feature under mentorship
  5. Week 5+: Contribute independently and help others

🔍 Areas for Contribution

High Priority

  • 🌍 Translation Improvements: Add support for more languages, improve accuracy
  • 🔍 Semantic Analysis: Enhance comparison algorithms, add more models
  • 🧪 Testing: Increase test coverage, add integration tests

Medium Priority

  • 🖥️ UI/UX Improvements: Redesign components, add dark mode
  • ⚡ Performance: Optimize API responses, reduce memory usage
  • 📚 Documentation: Update API documentation, add tutorials

Low Priority

  • 🔧 DevOps: Set up CI/CD pipeline, add monitoring
  • 🎨 Design: Update icons and logos, improve visual consistency

🐛 Troubleshooting

Backend Issues

"python3: command not found"

Edit start.sh and change python3 to python.

Permission denied on start.sh

chmod +x start.sh

Virtual environment issues

Rebuild from scratch:

deactivate  # only if a virtual environment is currently active
rm -rf venv/
./start.sh

Ollama not running for LLM comparison

ollama serve
# In another terminal:
ollama pull deepseek-r1

Frontend Issues

yarn install fails

# Clear cache and reinstall
yarn cache clean
rm -rf node_modules
yarn install

Port already in use

# Find and kill process
lsof -ti:8000 | xargs kill -9
lsof -ti:3000 | xargs kill -9
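lsof is not available on Windows, so a portable way to see whether a port is occupied can be handy before killing anything. The sketch below only tests whether something accepts TCP connections on the port; it does not identify or stop the process. The function name is illustrative.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True when something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


for port in (8000, 3000):
    state = "in use" if port_in_use(port) else "free"
    print(f"Port {port} is {state}")
```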

Platform-Specific Issues

Windows

  • Add exception in Windows Defender
  • Allow connections through Windows Firewall
  • Ensure Python and Node.js are in PATH

Linux/macOS

  • Ensure OpenGL ES 2.0 or higher
  • Fix permissions: chmod +x start.sh

🤝 Community

📄 License

The license terms for this project are given in the LICENSE file.

🙏 Acknowledgments

  • Grey Box: Project development and maintenance
  • Wikipedia: Source content and API access
  • Open Source Community: Libraries and tools

Last Updated: December 2025
Version: 1.0.0
Maintainers: grey-box
