CLI-45: Split backend into 3 apps #64
Conversation
Pull request overview
This pull request refactors the backend architecture from a monolithic Modal app into a microservices approach with three specialized apps (server, search, processing) for production deployments, while introducing a combined development app for local iteration. The changes optimize cold start performance by separating lightweight API operations from heavy ML workloads.
Changes:
- Split backend into three production Modal apps (server.py, search_app.py, processing_app.py) with tailored dependencies for each workload
- Created dev_combined.py for local development that combines all services in one app for hot-reload convenience
- Refactored API routing to support both direct service calls (dev mode) and cross-app communication via modal.Cls.from_name() (production mode)
- Updated GitHub Actions CI/CD workflow to deploy apps individually with proper environment variable configuration
- Enhanced CLI with commands for serving combined dev app or individual apps
- Added comprehensive test mocking for cross-app Modal communication
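The dual-mode routing described above (direct class references in dev, `modal.Cls.from_name()` lookups in production) could look roughly like the following sketch. The class names, app names, and `is_dev` flag are assumptions, and a plain dict plus `lookup_remote()` stand in for Modal's cross-app lookup so the pattern is runnable without Modal installed:

```python
# Hypothetical sketch of dev vs. production service dispatch.
# In production the router would call modal.Cls.from_name("search-app", "SearchService");
# here LOCAL_SERVICES and lookup_remote() are stand-ins so the pattern runs locally.

class SearchService:
    def search(self, query: str) -> list:
        return [f"result for {query}"]

LOCAL_SERVICES = {"SearchService": SearchService}  # dev: direct class references

def lookup_remote(app_name: str, cls_name: str):
    # Stand-in for modal.Cls.from_name(app_name, cls_name) in production.
    return LOCAL_SERVICES[cls_name]

class FastAPIRouter:
    def __init__(self, is_dev: bool):
        self.is_dev = is_dev

    def get_search_service(self):
        if self.is_dev:
            # Dev: combined app, direct class access avoids lookup overhead.
            return LOCAL_SERVICES["SearchService"]()
        # Production: cross-app lookup by app and class name.
        return lookup_remote("search-app", "SearchService")()

router = FastAPIRouter(is_dev=True)
print(router.get_search_service().search("cats"))  # → ['result for cats']
```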
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| frontend/streamlit/config.py | Updated BASE_API_URL construction for new server app naming convention |
| backend/tests/integration/test_preprocessor.py | Added ffmpeg/ffprobe availability checks to skip tests when tools are missing |
| backend/tests/integration/test_pipeline.py | Split tests into TestProcessingPipeline and TestDeletionPipeline, updated to use processing_service fixture |
| backend/tests/integration/test_api_endpoints.py | Added Modal mocking infrastructure to test cross-app communication patterns |
| backend/tests/conftest.py | Created processing_service and updated server_instance fixtures for new app structure |
| backend/shared/server_service.py | Base ServerService class with lightweight connectors and delete_video_background method |
| backend/shared/search_service.py | Base SearchService class with CLIP text encoder and search method |
| backend/shared/processing_service.py | Base ProcessingService class with full video processing pipeline |
| backend/shared/images.py | Modal image definitions for dev, server, search, and processing apps with separated dependencies |
| backend/shared/config.py | Shared configuration module with environment handling and logging setup |
| backend/apps/server.py | Production server app with minimal dependencies (API gateway) |
| backend/apps/search_app.py | Production search app with CLIP text encoder |
| backend/apps/processing_app.py | Production processing app with full video pipeline |
| backend/apps/dev_combined.py | Development-only combined app with all services |
| backend/api/fastapi_router.py | Refactored router supporting both dev (direct calls) and production (cross-app) modes |
| backend/cli.py | CLI commands for serving combined dev app or individual apps |
| backend/pyproject.toml | Updated scripts for dev, server, search, and processing commands |
| backend/main.py | Removed monolithic main.py (replaced by apps/) |
| backend/README.md | Comprehensive documentation of new architecture and deployment |
| .github/workflows/cd.yml | Updated to deploy three apps individually in staging/prod |
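The ffmpeg/ffprobe availability check described for `test_preprocessor.py` is presumably along these lines — a hedged sketch, where the constant and marker names are assumptions:

```python
import shutil

import pytest

# Skip integration tests when the ffmpeg/ffprobe binaries are missing,
# so machines without the tools don't fail spuriously.
FFMPEG_AVAILABLE = (
    shutil.which("ffmpeg") is not None and shutil.which("ffprobe") is not None
)

requires_ffmpeg = pytest.mark.skipif(
    not FFMPEG_AVAILABLE, reason="ffmpeg/ffprobe not installed"
)

@requires_ffmpeg
def test_extract_frames():
    ...  # would invoke ffprobe/ffmpeg on a sample clip
```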
Comments suppressed due to low confidence (2)
.github/workflows/cd.yml:75
- The production deployment workflow is missing the `environment: production` setting for GitHub Actions. Line 72 should include `environment: production` to match the staging deployment configuration (line 27). This ensures proper environment protection rules and secret handling in GitHub Actions.
```yaml
deploy-prod:
  name: Deploy Production
  if: github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  defaults:
    run:
      working-directory: backend
```
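Following the comment above, the fix would add the missing key to the job; a sketch, with the key's placement assumed:

```yaml
deploy-prod:
  name: Deploy Production
  if: github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  environment: production   # matches the staging job's environment setting
  defaults:
    run:
      working-directory: backend
```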
backend/tests/integration/test_pipeline.py:62
- The test is marked with `@pytest.mark.asyncio` but calls `processing_service.process_video_background()` as a synchronous function (line 57). Looking at the ProcessingService implementation in shared/processing_service.py, the `process_video_background` method is not async. Therefore, this test should not be marked with `@pytest.mark.asyncio` and should not use `async def`. The same issue exists for all tests in the TestProcessingPipeline class that are marked as async.
```python
@pytest.mark.asyncio
async def test_process_video_success(self, processing_service, sample_video_bytes):
    """
    Scenario: Happy path - everything succeeds.
    Expectation:
    - R2 upload called.
    - Preprocessing called.
    - Embeddings generated.
    - Pinecone upsert called for all chunks.
    - Job marked completed.
    - Result contains correct stats.
    """
    # Setup
    hashed_id = "hash-success"
    processing_service.r2_connector.upload_video.return_value = (True, hashed_id)

    # Mock Preprocessor output
    chunks = [
        {
            "chunk_id": "c1",
            "frames": [1, 2],
            "metadata": {"frame_count": 10, "complexity_score": 0.5, "timestamp_range": [0.0, 5.0]},
            "memory_mb": 1.0
        },
        {
            "chunk_id": "c2",
            "frames": [3, 4],
            "metadata": {"frame_count": 15, "complexity_score": 0.8, "timestamp_range": [5.0, 10.0]},
            "memory_mb": 1.5
        }
    ]
    processing_service.preprocessor.process_video_from_bytes.return_value = chunks

    # Mock Embedder
    mock_embedding = MagicMock()
    mock_embedding.numpy.return_value = [0.1, 0.2]
    processing_service.video_embedder._generate_clip_embedding.return_value = mock_embedding

    # Mock Pinecone success
    processing_service.pinecone_connector.upsert_chunk.return_value = True

    # Execute
    result = processing_service.process_video_background(
        video_bytes=sample_video_bytes,
        filename="success.mp4",
        job_id="job-success",
        namespace="test-ns"
    )
```
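A minimal, self-contained sketch of the suggested fix: the test becomes a plain `def` with no asyncio marker, since the method under test is synchronous. The stub `ProcessingService` here is hypothetical and keeps only the call shape from the excerpt above:

```python
from unittest.mock import MagicMock

# Hypothetical stand-in for the real ProcessingService, runnable in isolation.
class ProcessingService:
    def __init__(self):
        self.r2_connector = MagicMock()

    # Synchronous method, per the review comment: no `async def` here.
    def process_video_background(self, video_bytes, filename, job_id, namespace):
        ok, hashed_id = self.r2_connector.upload_video(video_bytes)
        return {"job_id": job_id, "hashed_id": hashed_id, "status": "completed"}

# Plain `def` with no @pytest.mark.asyncio marker, because the
# method under test is not a coroutine.
def test_process_video_success():
    service = ProcessingService()
    service.r2_connector.upload_video.return_value = (True, "hash-success")
    result = service.process_video_background(
        video_bytes=b"...", filename="success.mp4",
        job_id="job-success", namespace="test-ns",
    )
    assert result["status"] == "completed"
    assert result["hashed_id"] == "hash-success"
```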
backend/api/fastapi_router.py (Outdated)
```python
        files (list[UploadFile]): The uploaded video files.
        namespace (str, optional): Namespace for Pinecone and R2 storage (default: "")

        Returns:
            dict: Contains job_id, filename, content_type, size_bytes, status, and message.
        """
```
The upload endpoint signature was changed to accept a list of files, but only the first file is processed. According to the PR TODO, batch uploading capabilities were lost in this refactor. This endpoint should either:
- Accept a single UploadFile parameter (not a list) since batch uploading is not yet re-implemented
- Or properly handle multiple files in the list
The current implementation accepts a list but silently ignores all files except the first one, which could be confusing for API consumers.
Suggested change:

```diff
-        files (list[UploadFile]): The uploaded video files.
+        files (list[UploadFile]): The uploaded video files. Currently, exactly one file is supported.
         namespace (str, optional): Namespace for Pinecone and R2 storage (default: "")

         Returns:
             dict: Contains job_id, filename, content_type, size_bytes, status, and message.
         """
+        # Validate number of uploaded files to avoid silently ignoring extra files
+        if not files:
+            raise HTTPException(status_code=400, detail="No files uploaded. Exactly one file must be provided.")
+        if len(files) > 1:
+            raise HTTPException(
+                status_code=400,
+                detail="Multiple files uploaded. Exactly one file is supported at this time.",
+            )
```
This pull request introduces a significant architectural refactor of the backend to adopt a microservices approach for production deployments, while maintaining a streamlined experience for local development. The backend is now split into three specialized Modal apps—server, search, and processing—to optimize cold start performance and resource usage. Local development uses a combined app for ease of iteration. The PR also updates documentation, deployment workflows, and the CLI to reflect these changes.
Backend Architecture & App Structure:
`server.py` (API gateway), `search_app.py` (semantic search), and `processing_app.py` (video processing). Each app is tailored for its workload, reducing cold start times for lightweight operations by creating different Debian images for each app (found in `images.py`). A new `dev_combined.py` app is introduced for local development, combining all services for hot-reloading and direct class access.

API Routing & Service Communication:
Refactored `FastAPIRouter` to support both local (combined app) and production (cross-app) modes. In production, service calls are made using `modal.Cls.from_name()` for cross-app communication; in dev, direct class references are used to avoid lookup overhead. The upload and search endpoints now handle both modes seamlessly.

Deployment & CI/CD Updates:
Updated `cd.yml` to deploy each Modal app individually in both staging and production environments, instead of deploying a monolithic app. The workflow now runs on pushes to the appropriate branches and includes clear environment variable usage.

Documentation Improvements:
Updated `backend/README.md` to document the new microservices architecture, explain the rationale for the split, and provide clear instructions for both local development and production deployment. Added tables and notes to clarify the purpose and dependencies of each app.

CLI & Developer Experience:
Updated the CLI (`cli.py`) and `pyproject.toml` scripts to support serving the combined dev app or any individual app with simple commands. Output is color-coded for clarity when running multiple services.

TODO
The `upload()` endpoint lost its batch uploading capabilities, and uploads currently don't work at all in staging. The backend isn't using the `UploadHandler` anymore either. There is nothing wrong on the frontend side. Need to re-add batch uploading.