Feature: Docker optimization and offline prediction #405

Merged
kshitijrajsharma merged 41 commits into develop from feature/offline-prediction
May 27, 2025

@kshitijrajsharma kshitijrajsharma commented May 13, 2025

What does this PR do?

System upgrade:

Offline prediction:

  • Integrates fairpredictor and geoml-toolkits. Both submodules are now integrated with fair-backend, since all of them are compatible with Python 3.10.

Cleanup:

  • Removes unused API endpoints

Things to watch out for:

You need to upgrade your system to use either the Docker image in this repo or the uv build. uv will work for the API and prediction but might not work for the ramp-workers (I strongly recommend Docker).

Migration:

The old ApprovedPrediction table is migrated to the feedback table. This migration should be carried out carefully on the production server so that no data is lost. I haven't deleted the model itself, so the table stays in place for the first migration. After deployment, once the migration completes, I will raise a follow-up PR to clean up the unused tables and the code related to them.
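
The intent of this migration can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual migration: the dataclasses and their fields (`user_id`, `geom`) are hypothetical stand-ins for the Django models, and mapping old rows to `action="ACCEPT"` is an assumption based on how the serializers in this PR count approved predictions.

```python
from dataclasses import dataclass


@dataclass
class ApprovedPredictionRow:
    # Hypothetical stand-in for a row of the old ApprovedPrediction table.
    id: int
    user_id: int
    geom: str


@dataclass
class FeedbackRow:
    # Hypothetical stand-in for a row of the feedback table.
    user_id: int
    geom: str
    action: str


def migrate_approved_predictions(old_rows, existing_feedback):
    """Copy old rows into the feedback list; the old rows are left untouched."""
    seen = {(f.user_id, f.geom, f.action) for f in existing_feedback}
    migrated = list(existing_feedback)
    for row in old_rows:
        key = (row.user_id, row.geom, "ACCEPT")
        if key in seen:
            continue  # already migrated; a re-run must not duplicate rows
        seen.add(key)
        migrated.append(FeedbackRow(row.user_id, row.geom, "ACCEPT"))
    return migrated


old = [ApprovedPredictionRow(1, 7, "POLYGON((0 0,1 0,1 1,0 0))")]
feedback = migrate_approved_predictions(old, [])
# Running it again on its own output adds nothing, so a retried migration is safe.
assert migrate_approved_predictions(old, feedback) == feedback
```

Keeping the copy idempotent and leaving the source rows in place matches the cautious approach described above: the old table can be dropped later, once the copied data has been verified.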

My apologies for shipping the offline prediction feature together with the Docker system optimization, which makes this PR really difficult to review. I was working on offline prediction when the Docker image optimization and Python upgrade became critical midway, and I couldn't avoid it.

What next?

Docker image size and improvement

I have managed to bring the ramp dependencies down to <7 GB. The next step is to manage the layers that are installed in the application itself.

kshitijrajsharma and others added 30 commits April 25, 2025 15:31
- Updated backend API service to use a new command `runserver_with_q` and modified volume mappings to specific directories.
- Adjusted backend worker service to use the same command and added environment file reference.
- Removed commented image references for clarity.
- Enhanced the Docker Compose file for GPU support by adding build arguments.
- Changed the default RAMP_HOME path in the init script for consistency.
- Deleted the obsolete subproject directory `ramp-code` from the repository.
… Dockerfile for conditional build target, and configure log volume for backend services
@kshitijrajsharma kshitijrajsharma requested a review from Copilot May 26, 2025 12:41

Copilot AI left a comment

Pull Request Overview

This PR optimizes Docker builds, upgrades runtime dependencies, and adds offline prediction support.

  • Revamps Dockerfiles and GitHub workflows for UV and multi-stage builds
  • Adds Prediction model, serializer, viewset, and Celery task for offline prediction
  • Removes legacy ramp dependencies and integrates external ramp-code-fair build

Reviewed Changes

Copilot reviewed 600 out of 600 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| backend/docker/ramp/docker-requirements.txt | Removed ramp docker-specific requirements file |
| backend/core/views.py | Refactored views: removed legacy endpoints, added UserAssignmentMixin and PredictionViewSet |
| backend/core/utils.py | Added setup_ramp function and imports (tempfile, urllib.request, Path) |
| backend/core/urls.py | Updated router registrations: added prediction routes, commented out obsolete paths |
| backend/core/tasks.py | Integrated setup_ramp, disabled XLA, and added predict_area Celery task |
| backend/core/serializers.py | Changed feedback counts to filter by action instead of legacy models |
| backend/core/models.py | Replaced feedback_type with action, added Prediction model, cleaned up legacy tables |
| backend/core/admin.py | Switched from OSMGeoAdmin to GISModelAdmin |
| backend/api-requirements.txt | Deleted outdated API requirements file |
| backend/aiproject/settings.py | Introduced USE_S3_TO_UPLOAD_MODELS and PREDICTION_WORKSPACE settings |
| backend/Dockerfile_CPU | Removed old CPU Dockerfile |
| backend/Dockerfile.workers | Added new worker Dockerfile using UV build |
| backend/Dockerfile.API | Refactored API Dockerfile for UV and multi-stage build |
| backend/Dockerfile | Removed old GPU Dockerfile |
| backend/.python-version | Bumped Python version to 3.12 |
| .github/workflows/test_backend_build.yml | Added UV-based matrix workflow for backend tests |
| .github/workflows/docker_publish_image.yml | Extended workflow to build/push worker and offline predictor images |
| .github/workflows/docker_build.yml | Refactored image build workflow to use new Dockerfiles and build args |
| .dockerignore | Expanded ignore patterns for caches, Dockerfiles, environments, and migrations |
Comments suppressed due to low confidence (7)

backend/core/utils.py:591

  • [nitpick] Use the logger instead of print for setup messages (logging.info), so output is captured consistently.
print("[+] RAMP_HOME: {ramp_home}")
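
A minimal sketch of what this suggestion could look like with the standard `logging` module. The function name `report_ramp_home` and the example path are hypothetical and not taken from the PR. Note that, as quoted above, the string has no `f` prefix, so the `print` call would emit the placeholder text literally; passing the value as a logging argument avoids that as well.

```python
import logging

logger = logging.getLogger(__name__)


def report_ramp_home(ramp_home: str) -> None:
    # %-style arguments are interpolated only if the record is actually emitted.
    logger.info("[+] RAMP_HOME: %s", ramp_home)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    report_ramp_home("/tmp/ramp_home")  # hypothetical path, for illustration only
```

With a module-level logger, the message goes through whatever handlers the application configures, so it is captured alongside the rest of the service logs rather than only on stdout.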

backend/core/tasks.py:504

  • The decorator @shared_task is used but not imported; add from celery import shared_task to the imports.
@shared_task

backend/core/views.py:872

  • Metrics are inverted: total_feedback_labels should count REJECT actions and total_approved_predictions should count ACCEPT actions.
total_feedback_labels = Feedback.objects.filter(action="ACCEPT").count()
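
The mapping this comment proposes can be illustrated without Django. The sketch below is an assumption-laden stand-in: the sample rows and the helper `feedback_metrics` are hypothetical, and it shows the reviewer's suggested mapping, which is not necessarily what was merged.

```python
from collections import Counter

# Hypothetical feedback rows; only the "action" field matters here.
feedback_rows = [
    {"id": 1, "action": "ACCEPT"},
    {"id": 2, "action": "REJECT"},
    {"id": 3, "action": "ACCEPT"},
]


def feedback_metrics(rows):
    """Count rows per action, using the mapping the review comment suggests."""
    by_action = Counter(row["action"] for row in rows)
    return {
        "total_feedback_labels": by_action["REJECT"],
        "total_approved_predictions": by_action["ACCEPT"],
    }


metrics = feedback_metrics(feedback_rows)
```

Deriving both totals from one pass over the `action` values keeps the two names from drifting apart, which is the kind of mix-up the comment flags.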

backend/core/serializers.py:512

  • [nitpick] The method name get_feedbacks_count suggests all feedback but filters only REJECT; rename or adjust filter to match intent.
return Feedback.objects.filter(user=obj, action="REJECT").count()

backend/core/serializers.py:515

  • [nitpick] The method get_approved_predictions_count now reuses Feedback with ACCEPT but original name implies ApprovedPredictions; consider renaming for clarity.
return Feedback.objects.filter(user=obj, action="ACCEPT").count()

backend/core/views.py:29

  • [nitpick] Several legacy imports and endpoint code are commented out; remove them to clean up the codebase.
# from geojson2osm import geojson2osm

backend/core/tasks.py:363

  • [nitpick] Use logging.error(...) instead of print for error messages in background tasks.
print("[✗] 'chips' folder does not exist at:", chips_folder)

@kshitijrajsharma kshitijrajsharma marked this pull request as ready for review May 27, 2025 08:33
@kshitijrajsharma kshitijrajsharma merged commit 18db20f into develop May 27, 2025
5 checks passed
@kshitijrajsharma kshitijrajsharma deleted the feature/offline-prediction branch May 27, 2025 20:07

kshitijrajsharma commented May 28, 2025

I needed to merge this PR ASAP to meet the deadline and sync with the frontend work. I will request a review in the next PR, along with a Docker image report. @spwoodcock
