Skip to content

Latest commit

 

History

History
322 lines (239 loc) · 10.1 KB

File metadata and controls

322 lines (239 loc) · 10.1 KB

RQ Worker File Processing - Automated Testing & Cleanup Verification

This document describes the automated testing infrastructure created to verify RQ worker file processing and cleanup code.

Overview

The testing suite verifies that:

  1. File processing jobs execute correctly through RQ workers
  2. Database session cleanup (Session.remove()) is properly called in finally blocks
  3. Embedding function cleanup (EMBEDDING_FUNCTION = None) is performed after jobs
  4. Resources are properly released after job completion or failure

Test Files Created

1. Comprehensive Pytest Test Suite

Location: backend/open_webui/test/apps/webui/workers/test_file_processor.py

Features:

  • Unit tests for cleanup verification
  • Mock-based tests (no Redis/DB required)
  • Integration tests for code structure verification
  • Resource leak detection tests

Test Categories:

  • TestFileProcessorCleanup: Tests cleanup execution on success and error paths
  • TestFileProcessorIntegration: Tests code structure and patterns
  • TestCleanupResourceLeaks: Tests cleanup across multiple jobs

2. Standalone Verification Script

Location: test_worker_cleanup_verification.py

Features:

  • Quick verification without pytest
  • Code structure verification
  • Mock execution tests
  • Optional Redis/DB connection tests

3. Documentation

Location: backend/open_webui/test/apps/webui/workers/README.md

Content:

  • Usage instructions
  • Test categories explained
  • Environment variable documentation
  • Troubleshooting guide

Quick Start

Using Conda Environment (rit4test)

# Activate conda environment
conda activate rit4test

# Option 1: Run pytest tests (recommended)
pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py -v

# Option 2: Run standalone verification script
python test_worker_cleanup_verification.py

Using Docker (Optional)

If you have Redis and PostgreSQL running:

# Set environment variables
export REDIS_URL=redis://localhost:6379
export DATABASE_URL=postgresql://user:pass@localhost:5432/dbname
export ENABLE_JOB_QUEUE=true

# Run tests
pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py -v

What Gets Tested

1. Database Session Cleanup

Verifies:

  • Session.remove() is called in the finally block
  • Cleanup happens on successful job completion
  • Cleanup happens even when job fails with exception
  • Multiple jobs each trigger cleanup independently

Test Methods:

  • test_session_cleanup_called_on_success
  • test_session_cleanup_called_on_error
  • test_multiple_jobs_cleanup

2. Embedding Function Cleanup

Verifies:

  • EMBEDDING_FUNCTION is set to None after job completion
  • Per-job resources are properly released
  • Cleanup pattern is correct in code structure

Test Methods:

  • test_embedding_function_cleanup
  • test_mock_request_cleanup_pattern

3. Code Structure Verification

Verifies:

  • finally: block exists in process_file_job
  • Session.remove() is in the finally block
  • EMBEDDING_FUNCTION cleanup is in the finally block
  • Proper imports are present

Test Methods:

  • test_file_processing_job_structure
  • test_session_import

Test Execution Examples

Run All Tests

pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py -v

Run Specific Test Class

pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestFileProcessorCleanup -v

Run Specific Test

pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestFileProcessorCleanup::test_session_cleanup_called_on_success -v

Run with Coverage

pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py \
    --cov=open_webui.workers.file_processor \
    --cov-report=html \
    --cov-report=term

Run Standalone Script

python test_worker_cleanup_verification.py

Expected Output

Successful Test Run

================================================================================
RQ Worker File Processing - Cleanup Verification Tests
================================================================================
Python: 3.x.x
Working directory: /path/to/project
Backend path: /path/to/project/backend

================================================================================
TEST 1: Verifying Cleanup Code Structure
================================================================================
✅ Found 'finally:' block
✅ Found Session.remove() in finally block (line 1428)
✅ Found EMBEDDING_FUNCTION cleanup in finally block (line 1439)
✅ All cleanup code structure checks passed

================================================================================
TEST 2: Verifying Session Import
================================================================================
✅ Session imported and has remove() method
✅ Session import verified in file_processor.py

================================================================================
TEST 3: Verifying Cleanup Execution (Mocked)
================================================================================
✅ Session.remove() was called in finally block

================================================================================
TEST SUMMARY
================================================================================
✅ PASS: Cleanup Code Structure
✅ PASS: Session Import
✅ PASS: Cleanup Execution (Mocked)
✅ PASS: End-to-End with Redis (Optional)
================================================================================
Results: 4/4 tests passed
✅ ALL TESTS PASSED - Cleanup code is properly implemented!

Pytest Output

backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestFileProcessorCleanup::test_session_cleanup_called_on_success PASSED
backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestFileProcessorCleanup::test_session_cleanup_called_on_error PASSED
backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestFileProcessorCleanup::test_embedding_function_cleanup PASSED
backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestFileProcessorIntegration::test_file_processing_job_structure PASSED
backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestFileProcessorIntegration::test_mock_request_cleanup_pattern PASSED
backend/open_webui/test/apps/webui/workers/test_file_processor.py::TestCleanupResourceLeaks::test_multiple_jobs_cleanup PASSED

======================== 6 passed in 2.34s ========================

Environment Variables

The tests use these environment variables (optional):

  • REDIS_URL - Redis connection URL (for integration tests)
  • DATABASE_URL - PostgreSQL connection URL (for integration tests)
  • ENABLE_JOB_QUEUE - Enable job queue (default: False)
  • RAG_EMBEDDING_ENGINE - Embedding engine configuration
  • RAG_EMBEDDING_MODEL - Embedding model configuration

Note: Most tests use mocks and don't require actual Redis/DB connections. Only integration tests (marked with @pytest.mark.integration) require these.

Test Architecture

Mock-Based Testing

Most tests use Python's unittest.mock to:

  • Mock database operations (Files, Users models)
  • Mock storage operations (Storage.get_file)
  • Mock vector DB operations (VECTOR_DB_CLIENT)
  • Mock embedding function calls
  • Track cleanup method calls

This allows tests to:

  • Run without Redis/DB setup
  • Run quickly
  • Verify cleanup logic without side effects
  • Test error paths safely

Cleanup Verification Methods

  1. Patch and Assert: Patch Session.remove() and assert it was called
  2. Source Code Inspection: Parse source code to verify structure
  3. Log Monitoring: Capture and verify cleanup log messages
  4. Resource Tracking: Track resource usage across multiple jobs

Troubleshooting

Import Errors

Problem: ModuleNotFoundError: No module named 'open_webui'

Solution:

  1. Ensure you're in the project root directory
  2. Verify backend is in Python path
  3. Activate conda environment: conda activate rit4test
  4. Install dependencies: pip install -r backend/requirements.txt

Redis Connection Errors

Problem: Redis connection failures in tests

Solution:

  • These are expected if Redis is not running
  • Tests will skip Redis-dependent tests automatically
  • For full integration testing, start Redis: docker run -d -p 6379:6379 redis

Database Connection Errors

Problem: Database connection failures

Solution:

  • Most tests use mocks and don't need a database
  • Only integration tests need a real database
  • Set DATABASE_URL environment variable if running integration tests

Test Failures

Problem: Tests fail with "Session.remove() was NOT called"

Solution:

  1. Verify cleanup code exists in backend/open_webui/workers/file_processor.py
  2. Check that finally block contains Session.remove()
  3. Ensure imports are correct: from open_webui.internal.db import Session

Integration with CI/CD

To integrate these tests into CI/CD:

# Example GitHub Actions workflow
- name: Test RQ Worker Cleanup
  run: |
    conda activate rit4test
    pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py -v

Or with Docker:

- name: Test RQ Worker Cleanup
  run: |
    docker-compose up -d redis postgres
    pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py -v

Next Steps

  1. Run the tests to verify cleanup code works:

    conda activate rit4test
    pytest backend/open_webui/test/apps/webui/workers/test_file_processor.py -v
  2. Review test output to ensure all tests pass

  3. Run standalone verification for quick check:

    python test_worker_cleanup_verification.py
  4. Monitor in production to ensure cleanup prevents resource leaks

Summary

✅ Comprehensive test suite created ✅ Cleanup verification implemented ✅ Mock-based testing (no external dependencies required) ✅ Documentation and usage guides provided ✅ Ready for CI/CD integration

The testing infrastructure is ready to use and will help ensure that cleanup code prevents resource leaks in RQ worker file processing!