
CLI-45: Split backend into 3 apps#64

Merged
eshaan-mehta merged 15 commits into staging from split-backend on Jan 21, 2026

Conversation

@eshaan-mehta
Contributor

This pull request introduces a significant architectural refactor of the backend to adopt a microservices approach for production deployments, while maintaining a streamlined experience for local development. The backend is now split into three specialized Modal apps—server, search, and processing—to optimize cold start performance and resource usage. Local development uses a combined app for ease of iteration. The PR also updates documentation, deployment workflows, and the CLI to reflect these changes.

Backend Architecture & App Structure:

  • Split the backend into three Modal apps: server.py (API gateway), search_app.py (semantic search), and processing_app.py (video processing). Each app is tailored to its workload, with a separate Debian image per app (defined in images.py) to reduce cold start times for lightweight operations. A new dev_combined.py app is introduced for local development, combining all services for hot-reloading and direct class access.
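
The per-app image idea can be sketched as follows. The dependency lists and the `image_weight` helper below are illustrative assumptions, not the actual contents of images.py; in the real module each list would presumably feed a `modal.Image.debian_slim().pip_install(...)` call.

```python
# Illustrative dependency sets for each production app. All package
# names here are assumptions for illustration only; the real images.py
# would pass each list to modal.Image.debian_slim().pip_install(...).
APP_DEPENDENCIES: dict[str, list[str]] = {
    # API gateway: only web-serving packages, so it cold-starts fast.
    "server": ["fastapi", "python-multipart"],
    # Semantic search: adds the ML stack needed for the CLIP text encoder.
    "search": ["fastapi", "torch", "open-clip-torch"],
    # Video processing: the heaviest image, with video tooling on top.
    "processing": ["fastapi", "torch", "open-clip-torch", "opencv-python-headless"],
}


def image_weight(app: str) -> int:
    """Rough proxy for image size: the number of pip dependencies."""
    return len(APP_DEPENDENCIES[app])
```

The point of the separation is visible in the weights: the server image stays minimal while the processing image carries the full ML and video stack.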

API Routing & Service Communication:

  • Refactored FastAPIRouter to support both local (combined app) and production (cross-app) modes. In production, service calls are made using modal.Cls.from_name() for cross-app communication; in dev, direct class references are used to avoid lookup overhead. The upload and search endpoints now handle both modes seamlessly.

Deployment & CI/CD Updates:

  • Updated GitHub Actions workflow (cd.yml) to deploy each Modal app individually in both staging and production environments, instead of deploying a monolithic app. The workflow now runs on pushes to the appropriate branches and includes clear environment variable usage.

Documentation Improvements:

  • Updated backend/README.md to document the new microservices architecture, explain the rationale for the split, and provide clear instructions for both local development and production deployment. Added tables and notes to clarify the purpose and dependencies of each app.

CLI & Developer Experience:

  • Enhanced the CLI (cli.py) and pyproject.toml scripts to support serving the combined dev app or any individual app with simple commands. Output is color-coded for clarity when running multiple services.
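
The color-coded multi-service output can be illustrated with a small sketch. The ANSI codes and service names below are assumptions about how cli.py might tag each line, not its actual implementation.

```python
# Map each service to an ANSI foreground color so interleaved log
# lines from concurrently running apps stay distinguishable.
ANSI = {"server": "\033[34m", "search": "\033[32m", "processing": "\033[35m"}
RESET = "\033[0m"


def tag_line(service: str, line: str) -> str:
    """Prefix a log line with a colored [service] tag."""
    color = ANSI.get(service, "")
    return f"{color}[{service}]{RESET} {line}"
```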

TODO

  • During this refactor, the upload() endpoint lost its batch-uploading capability, and upload currently does not work at all in staging. The backend also no longer uses the UploadHandler. Nothing is wrong on the frontend side; batch uploading needs to be re-added.
  • Fix the tests. One API integration test is currently failing because of the upload endpoint, but, for some reason, the batch-uploading tests are either nonexistent or returning a false positive.

@eshaan-mehta eshaan-mehta requested review from adithayyil and Copilot and removed request for Copilot January 20, 2026 05:08
@github-actions github-actions bot changed the base branch from main to staging January 20, 2026 05:08
@eshaan-mehta eshaan-mehta requested a review from Copilot January 20, 2026 05:08
@eshaan-mehta eshaan-mehta changed the title Split backend into 3 apps CLI-45: Split backend into 3 apps Jan 20, 2026
Contributor

Copilot AI left a comment


Pull request overview

This pull request refactors the backend architecture from a monolithic Modal app into a microservices approach with three specialized apps (server, search, processing) for production deployments, while introducing a combined development app for local iteration. The changes optimize cold start performance by separating lightweight API operations from heavy ML workloads.

Changes:

  • Split backend into three production Modal apps (server.py, search_app.py, processing_app.py) with tailored dependencies for each workload
  • Created dev_combined.py for local development that combines all services in one app for hot-reload convenience
  • Refactored API routing to support both direct service calls (dev mode) and cross-app communication via modal.Cls.from_name() (production mode)
  • Updated GitHub Actions CI/CD workflow to deploy apps individually with proper environment variable configuration
  • Enhanced CLI with commands for serving combined dev app or individual apps
  • Added comprehensive test mocking for cross-app Modal communication

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 12 comments.

Summary per file:

| File | Description |
| --- | --- |
| frontend/streamlit/config.py | Updated BASE_API_URL construction for new server app naming convention |
| backend/tests/integration/test_preprocessor.py | Added ffmpeg/ffprobe availability checks to skip tests when tools are missing |
| backend/tests/integration/test_pipeline.py | Split tests into TestProcessingPipeline and TestDeletionPipeline, updated to use processing_service fixture |
| backend/tests/integration/test_api_endpoints.py | Added Modal mocking infrastructure to test cross-app communication patterns |
| backend/tests/conftest.py | Created processing_service and updated server_instance fixtures for new app structure |
| backend/shared/server_service.py | Base ServerService class with lightweight connectors and delete_video_background method |
| backend/shared/search_service.py | Base SearchService class with CLIP text encoder and search method |
| backend/shared/processing_service.py | Base ProcessingService class with full video processing pipeline |
| backend/shared/images.py | Modal image definitions for dev, server, search, and processing apps with separated dependencies |
| backend/shared/config.py | Shared configuration module with environment handling and logging setup |
| backend/apps/server.py | Production server app with minimal dependencies (API gateway) |
| backend/apps/search_app.py | Production search app with CLIP text encoder |
| backend/apps/processing_app.py | Production processing app with full video pipeline |
| backend/apps/dev_combined.py | Development-only combined app with all services |
| backend/api/fastapi_router.py | Refactored router supporting both dev (direct calls) and production (cross-app) modes |
| backend/cli.py | CLI commands for serving combined dev app or individual apps |
| backend/pyproject.toml | Updated scripts for dev, server, search, and processing commands |
| backend/main.py | Removed monolithic main.py (replaced by apps/) |
| backend/README.md | Comprehensive documentation of new architecture and deployment |
| .github/workflows/cd.yml | Updated to deploy three apps individually in staging/prod |
Comments suppressed due to low confidence (2)

.github/workflows/cd.yml:75

  • The production deployment workflow is missing the environment: production setting for GitHub Actions. Line 72 should include environment: production to match the staging deployment configuration (line 27). This ensures proper environment protection rules and secret handling in GitHub Actions.
  deploy-prod:
    name: Deploy Production
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: backend

backend/tests/integration/test_pipeline.py:62

  • The test is marked with @pytest.mark.asyncio but calls processing_service.process_video_background() as a synchronous function (line 57). Looking at the ProcessingService implementation in shared/processing_service.py, the process_video_background method is not async. Therefore, this test should not be marked with @pytest.mark.asyncio and should not use async def. The same issue exists for all tests in the TestProcessingPipeline class that are marked as async.
    @pytest.mark.asyncio
    async def test_process_video_success(self, processing_service, sample_video_bytes):
        """
        Scenario: Happy path - everything succeeds.
        Expectation:
            - R2 upload called.
            - Preprocessing called.
            - Embeddings generated.
            - Pinecone upsert called for all chunks.
            - Job marked completed.
            - Result contains correct stats.
        """
        # Setup
        hashed_id = "hash-success"
        processing_service.r2_connector.upload_video.return_value = (True, hashed_id)
        
        # Mock Preprocessor output
        chunks = [
            {
                "chunk_id": "c1", 
                "frames": [1, 2], 
                "metadata": {"frame_count": 10, "complexity_score": 0.5, "timestamp_range": [0.0, 5.0]}, 
                "memory_mb": 1.0
            },
            {
                "chunk_id": "c2", 
                "frames": [3, 4], 
                "metadata": {"frame_count": 15, "complexity_score": 0.8, "timestamp_range": [5.0, 10.0]}, 
                "memory_mb": 1.5
            }
        ]
        processing_service.preprocessor.process_video_from_bytes.return_value = chunks
        
        # Mock Embedder
        mock_embedding = MagicMock()
        mock_embedding.numpy.return_value = [0.1, 0.2]
        processing_service.video_embedder._generate_clip_embedding.return_value = mock_embedding
        
        # Mock Pinecone success
        processing_service.pinecone_connector.upsert_chunk.return_value = True

        # Execute
        result = processing_service.process_video_background(
            video_bytes=sample_video_bytes,
            filename="success.mp4",
            job_id="job-success",
            namespace="test-ns"
        )


Comment on lines 89 to 94
files (list[UploadFile]): The uploaded video files.
namespace (str, optional): Namespace for Pinecone and R2 storage (default: "")

Returns:
dict: Contains job_id, filename, content_type, size_bytes, status, and message.
"""

Copilot AI Jan 20, 2026


The upload endpoint signature was changed to accept a list of files, but only the first file is processed. According to the PR TODO, batch uploading capabilities were lost in this refactor. This endpoint should either:

  1. Accept a single UploadFile parameter (not a list) since batch uploading is not yet re-implemented
  2. Or properly handle multiple files in the list

The current implementation accepts a list but silently ignores all files except the first one, which could be confusing for API consumers.

Suggested change

  - files (list[UploadFile]): The uploaded video files.
  - namespace (str, optional): Namespace for Pinecone and R2 storage (default: "")
  - Returns:
  -     dict: Contains job_id, filename, content_type, size_bytes, status, and message.
  - """
  + files (list[UploadFile]): The uploaded video files. Currently, exactly one file is supported.
  + namespace (str, optional): Namespace for Pinecone and R2 storage (default: "")
  + Returns:
  +     dict: Contains job_id, filename, content_type, size_bytes, status, and message.
  + """
  + # Validate number of uploaded files to avoid silently ignoring extra files
  + if not files:
  +     raise HTTPException(status_code=400, detail="No files uploaded. Exactly one file must be provided.")
  + if len(files) > 1:
  +     raise HTTPException(
  +         status_code=400,
  +         detail="Multiple files uploaded. Exactly one file is supported at this time.",
  +     )

Contributor Author


@adithayyil this is the main todo

@eshaan-mehta eshaan-mehta marked this pull request as ready for review January 21, 2026 05:14
@eshaan-mehta eshaan-mehta merged commit ae37129 into staging Jan 21, 2026
6 checks passed
Member

@adithayyil adithayyil left a comment


lgtm brochacho


@eshaan-mehta eshaan-mehta deleted the split-backend branch January 26, 2026 15:53