openfoodfacts · Copilot · Aug 26, 2025
@@ -0,0 +1,28 @@
+# GSoC 2018 Projects
+
+Google Summer of Code 2018 projects for OpenFoodFacts AI.
+
+## Projects
+
+### Table Detection
+Object detection model that locates tables in food packaging images.
+
+## Installation
+
+Install the required dependencies:
+
+```bash
+pip install -r requirements.txt
+```
+
+## Usage
+
+See the individual project directories for specific usage instructions:
+
+- `table_detection/` - Table detection model and utilities
+- `GSoC2018_poc/` - Proof of concept implementations
+
+## Files
+
+- `resize.py` - Image resizing utilities
+- `table_detection/utils/` - Visualization and label utilities for object detection
@@ -0,0 +1,5 @@
+tensorflow>=2.0.0
+opencv-python
+numpy
+matplotlib
+Pillow
@@ -0,0 +1,7 @@
+pandas>=1.3.0
+numpy>=1.20.0
+matplotlib>=3.0.0
+seaborn>=0.11.0
+scikit-learn>=1.0.0
+xgboost>=1.5.0
+jupyter>=1.0.0
@@ -0,0 +1,40 @@
+# Circular Model
+
+Machine learning model for detecting circular patterns in product images, with barcode generation utilities.
+
+## Features
+
+- Circular pattern detection in product images
+- Barcode generation and processing
+- Image dataset downloading from OpenFoodFacts
+- Jupyter notebook with model training pipeline
+
+## Installation
+
+Install the required dependencies:
+
+```bash
+pip install -r requirements.txt
+```
+
+## Usage
+
+### Download Images
+```bash
+python download_images.py
+```
+
+### Generate Barcodes
+```bash
+python generate_barcode.py
+```
+
+### Model Training
+Open and run the `circular_model.ipynb` notebook for model training and evaluation.
+
+## Files
+
+- `circular_model.ipynb` - Main Jupyter notebook with model implementation
+- `download_images.py` - Script to download images from OpenFoodFacts
+- `generate_barcode.py` - Barcode generation utilities
+- `images/` - Directory for storing downloaded images
@@ -0,0 +1,7 @@
+requests>=2.25.0
+jupyter
+matplotlib
+numpy
+pandas
+tensorflow>=2.0.0
+Pillow
@@ -0,0 +1,29 @@
+# Data Quality
+
+Tools and scripts for analyzing and improving data quality in the OpenFoodFacts database.
+
+## Features
+
+- Language switching for ingredient lists
+- Data quality analysis and reporting
+- Database consistency checks
+- Automated data cleaning utilities
+
+## Installation
+
+Install the required dependencies:
+
+```bash
+pip install -r requirements.txt
+```
+
+## Usage
+
+### Switch Ingredient Language
+```bash
+python switch_ingredient_lang.py
+```
+
+## Files
+
+- `switch_ingredient_lang.py` - Script to switch ingredient language codes and analyze data quality
@@ -0,0 +1,6 @@
+openfoodfacts>=0.2.0
+requests>=2.25.0
+typer>=0.12.0
+tqdm>=4.60.0
+redis>=4.0.0
+backoff>=2.0.0
@@ -0,0 +1,39 @@
+# Front Image Classification
+
+Image classification system for categorizing front-facing product images using machine learning.
+
+## Features
+
+- Product front image classification
+- Training pipeline with data augmentation
+- CLI interface for training and inference
+- Integration with OpenFoodFacts database
+- Support for multiple ML backends
+
+## Installation
+
+This project declares its dependencies as inline script metadata (PEP 723), directly in the script files.
+
+For manual installation:
+
+```bash
+pip install typer tqdm Pillow ultralytics albumentations opencv-python numpy openfoodfacts duckdb torch
+```
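+
+As a rough illustration of what a PEP 723 block looks like, the sketch below shows inline metadata at the top of a script. The package names come from the manual installation command above; the `requires-python` bound and the exact contents of the block in `train.py` are assumptions, not copied from the repository.
+
+```python
+# /// script
+# requires-python = ">=3.10"  # assumed bound, for illustration only
+# dependencies = [
+#     "typer",
+#     "tqdm",
+#     "Pillow",
+# ]
+# ///
+```
+
+A PEP 723-aware runner such as `uv` can read this block and install the listed packages into a temporary environment, for example with `uv run train.py` (assuming `uv` is installed).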
+
+## Usage
+
+### Training
+```bash
+python train.py
+```
+
+### CLI
+```bash
+python cli.py --help
+```
+
+## Files
+
+- `train.py` - Main training script with inline dependencies
+- `cli.py` - Command-line interface for the classifier
+- `ml_commons.py` - Common ML utilities and data transformations
@@ -0,0 +1,14 @@
+# Front Image Classification uses PEP 723 inline script metadata
+# Dependencies are declared directly in the Python script files
+
+# For manual installation, install these packages:
+typer>=0.12.0
+tqdm>=4.60.0
+Pillow>=8.0.0
+ultralytics>=8.0.0
+albumentations>=1.0.0
+opencv-python>=4.5.0
+numpy>=1.20.0
+openfoodfacts>=0.2.0
+duckdb>=0.8.0
+torch>=1.12.0
@@ -0,0 +1,36 @@
+# GenAI Features
+
+Generative AI experiments and feature development for OpenFoodFacts data analysis.
+
+## Features
+
+- Analysis of recent changes in OpenFoodFacts data
+- Jupyter notebooks for data exploration
+- Generative AI model experiments
+- Data pipeline prototypes
+
+## Installation
+
+Install the required dependencies:
+
+```bash
+pip install -r requirements.txt
+```
+
+## Usage
+
+### Explore Recent Changes
+Open and run the notebook:
+```bash
+jupyter notebook notebooks/explore_recent_changes.ipynb
+```
+
+## Structure
+
+- `notebooks/` - Jupyter notebooks for data exploration and analysis
+- `dataset/` - Dataset-related files and configurations
+- `prompts/` - Prompt templates and configurations for generative AI models
+
+## Files
+
+- `dataset/recent_changes.txt` - Configuration for recent changes data source
@@ -0,0 +1,7 @@
+jupyter>=1.0.0
+pandas>=1.3.0
+numpy>=1.20.0
+matplotlib>=3.0.0
+requests>=2.25.0
+openfoodfacts>=0.2.0
+transformers>=4.0.0
@@ -0,0 +1,57 @@
+# Ingredient Extraction
+
+Machine learning models and tools for extracting structured ingredient information from product text.
+
+## Features
+
+- Dataset generation for ingredient extraction tasks
+- Model training and fine-tuning pipelines
+- LayoutLM-based document understanding
+- Model analysis and evaluation tools
+- Streamlit demo interface
+
+## Installation
+
+Install the required dependencies:
+
+```bash
+pip install -r requirements.txt
+```
+
+## Usage
+
+### Dataset Generation
+```bash
+cd dataset-generation
+python generate_dataset.py
+```
+
+### Model Training
+```bash
+cd train
+python train_model.py
+```
+
+### LayoutLM Training
+```bash
+cd train-layoutlm
+python train_layoutlm.py
+```
+
+### Model Analysis
+```bash
+cd model-analysis
+python evaluate_model.py
+```
+
+### Demo
+```bash
+streamlit run model-analysis/streamlit_demo.py
+```
+
+## Structure
+
+- `dataset-generation/` - Scripts for creating training datasets
+- `train/` - Standard model training pipeline
+- `train-layoutlm/` - LayoutLM-specific training code
+- `model-analysis/` - Model evaluation and analysis tools
@@ -0,0 +1,11 @@
+transformers>=4.20.0
+torch>=1.12.0
+datasets>=2.0.0
+streamlit>=1.20.0
+layoutlm>=0.1.0
+openfoodfacts>=0.2.0
+pandas>=1.3.0
+numpy>=1.20.0
+matplotlib>=3.0.0
+scikit-learn>=1.0.0
+tqdm>=4.60.0
@@ -0,0 +1,52 @@
+# Language Identification
+
+Machine learning models for automatic language identification in product text data.
+
+## Features
+
+- Language detection for product ingredients and descriptions
+- Training pipelines for language classification models
+- Data extraction and preprocessing scripts
+- Model evaluation and metrics calculation
+- Inference utilities for production use
+
+## Installation
+
+This project uses [Poetry](https://python-poetry.org/) for dependency management.
+
+```bash
+poetry install
+```
+
+Or install with pip:
+```bash
+pip install -r requirements.txt
+```
+
+## Usage
+
+### Extract Data
+```bash
+poetry run python scripts/01_extract_data.py
+```
+
+### Calculate Metrics
+```bash
+poetry run python scripts/03_calculate_metrics.py
+```
+
+### Run Inference
+```bash
+poetry run python scripts/inference.py
+```
+
+## Project Structure
+
+- `scripts/` - Data processing and model training scripts
+  - `01_extract_data.py` - Data extraction from OpenFoodFacts
+  - `03_calculate_metrics.py` - Model evaluation metrics
+  - `inference.py` - Model inference utilities
+
+## Dependencies
+
+See `pyproject.toml` for the complete list of dependencies, which Poetry manages.