This project has been containerized with Docker. The Docker image is configured to run the product matching system with the following command:
python3 main.py --client dataset/competitor_a_products.csv --competitor dataset/competitor_b_products.csv --output output/result.json --min-confidence 0.5 --sample-size 50

The following files support the Docker setup:
- Dockerfile - Docker image definition
- requirements.txt - Python dependencies
- docker-compose.yml - Docker Compose configuration for easy execution
- .dockerignore - Updated to exclude unnecessary files from the Docker build
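
Build the image with Docker:
docker build -t product-matcher .
Or with Docker Compose:
docker-compose build

Run the container with Docker:
docker run --rm -v $(pwd)/output:/app/output product-matcher
Or with Docker Compose:
docker-compose up

The output will be saved to output/result.json on your host machine.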
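
Once a run finishes, a short script can confirm that output/result.json exists and parses as JSON. The sketch below is only an illustration: the helper name `describe_result` is hypothetical, and because the structure of the result file is not described here, it assumes no schema and reports only the top-level type and size.

```python
import json
from pathlib import Path


def describe_result(path="output/result.json"):
    """Load the result file and report its top-level shape.

    No schema is assumed: this only checks that the file exists,
    parses as JSON, and reports the top-level type and size.
    """
    result_path = Path(path)
    if not result_path.is_file():
        raise FileNotFoundError(
            f"{result_path} not found; did the container run finish?"
        )

    with result_path.open(encoding="utf-8") as handle:
        data = json.load(handle)

    # Lists and dicts have a meaningful length; scalars do not.
    size = len(data) if isinstance(data, (list, dict)) else None
    return {"type": type(data).__name__, "size": size}
```

Calling `describe_result()` from the project directory after a run returns a small dictionary such as the top-level type name and the number of entries, which is usually enough to tell an empty result from a populated one.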
To run with different parameters, you can override the command:
docker run --rm -v $(pwd)/output:/app/output product-matcher \
python3 main.py \
--client dataset/competitor_a_products.csv \
--competitor dataset/competitor_b_products.csv \
--output output/result.json \
--min-confidence 0.7 \
--sample-size 100

With Docker Compose, modify the command section in docker-compose.yml or run:
docker-compose run product-matcher python3 main.py --client dataset/competitor_a_products.csv --competitor dataset/competitor_b_products.csv --output output/result.json --min-confidence 0.7 --sample-size 100

The results will be written to output/result.json in the project directory, because the host's output directory is mounted into the container as a volume.
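
When the same override is needed repeatedly with different thresholds, the `docker run` invocation can be assembled programmatically. The following is a minimal sketch, not part of the project: the helper name `build_docker_command` is hypothetical, and it assumes the image tag `product-matcher` and the dataset paths used in the commands above.

```python
import shlex
from pathlib import Path


def build_docker_command(min_confidence=0.5, sample_size=50,
                         image="product-matcher"):
    """Return the docker run argument list for one matching run."""
    # Path.cwd() plays the role of $(pwd) in the shell version.
    output_dir = Path.cwd() / "output"
    return [
        "docker", "run", "--rm",
        "-v", f"{output_dir}:/app/output",
        image,
        "python3", "main.py",
        "--client", "dataset/competitor_a_products.csv",
        "--competitor", "dataset/competitor_b_products.csv",
        "--output", "output/result.json",
        "--min-confidence", str(min_confidence),
        "--sample-size", str(sample_size),
    ]


command = build_docker_command(min_confidence=0.7, sample_size=100)
# Print a copy-pasteable shell command for inspection.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it. Building an argument list rather than a single string avoids shell quoting problems when the project path contains spaces.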
If you encounter network issues when building the image (for example, a Docker registry timeout), try the following:
- Check your internet connection
- Use a VPN if Docker Hub is blocked in your region
- Configure Docker to use a different registry mirror
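
To tell a general connectivity problem from a blocked registry, it can help to test a plain TCP connection to the registry host. The sketch below assumes the base image is pulled from Docker Hub, whose registry endpoint is registry-1.docker.io on port 443; the function name `can_reach` is hypothetical. A successful connection does not guarantee that pulls will work (a proxy may still interfere), but a failure points to the network rather than to the Dockerfile.

```python
import socket


def can_reach(host="registry-1.docker.io", port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

If `can_reach()` returns False while `can_reach("example.com")` returns True, the registry itself is likely unreachable from your network, and a VPN or a registry mirror is the more promising fix.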