Phase 2 adds cloud storage integration, enhanced error handling with exponential backoff, and structured logging to the job instruction downloader.
- OAuth Authentication: Secure authentication with Google Drive API
- Folder Management: Automatic creation of department-based folder structure
- File Upload: Upload documents with proper naming and duplicate checking
- Error Handling: Robust error handling for API operations
- Create Google Cloud Project and enable Drive API
- Download credentials.json from Google Cloud Console
- Place credentials.json in the config/ directory
- Run authentication flow on first use
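
The setup steps above end in a first-run authorization flow. A minimal sketch of that flow, assuming the standard `google-auth-oauthlib` installed-app flow and the `config/credentials.json` and `config/token.json` paths mentioned in this document. The helper name `load_credentials` is hypothetical, and the sketch assumes the `requests` transport of `google-auth` is available for token refresh; the project's own code may differ.

```python
from pathlib import Path

# Narrowest scope named in this document: access only to files the app creates.
SCOPES = ["https://www.googleapis.com/auth/drive.file"]


def load_credentials(credentials_path="config/credentials.json",
                     token_path="config/token.json"):
    """Return valid Google credentials, running the browser flow if needed.

    Hypothetical helper for illustration; not the project's actual API.
    """
    token_file = Path(token_path)
    if not token_file.exists() and not Path(credentials_path).exists():
        raise FileNotFoundError(
            f"{credentials_path} not found; download it from Google Cloud Console"
        )

    # Imported lazily so the path check above works without the Google libraries.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    creds = None
    if token_file.exists():
        creds = Credentials.from_authorized_user_file(str(token_file), SCOPES)

    if creds is None or not creds.valid:
        if creds is not None and creds.expired and creds.refresh_token:
            creds.refresh(Request())  # silent refresh, no browser needed
        else:
            flow = InstalledAppFlow.from_client_secrets_file(credentials_path, SCOPES)
            creds = flow.run_local_server(port=0)  # opens a browser on first use
        # Cache the token so later runs skip the browser step.
        token_file.parent.mkdir(parents=True, exist_ok=True)
        token_file.write_text(creds.to_json(), encoding="utf-8")
    return creds
```

The returned credentials would typically be passed to `googleapiclient.discovery.build("drive", "v3", credentials=creds)` to obtain a Drive service object.
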
- Exponential Backoff: Configurable retry logic with exponential delays
- Jitter: Random delay variation to prevent thundering herd
- Decorator Support: Easy retry decoration for functions
- Configurable: Retry attempts, delays, and backoff settings
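
The retry features listed above can be sketched as a small decorator. This is an illustration under stated assumptions, not the project's `EnhancedErrorHandler`: the parameter names `retry_attempts`, `retry_delay`, and `exponential_backoff` are borrowed from the project's `error_handling` settings, `retry_attempts` is assumed to count total attempts, the delay is assumed to double each time, and `jitter`, `exceptions`, and `sleep` are hypothetical extras.

```python
import functools
import random
import time


def retry_with_backoff(retry_attempts=3, retry_delay=5.0, exponential_backoff=True,
                       jitter=0.1, exceptions=(Exception,), sleep=time.sleep):
    """Decorator factory: retry the wrapped function with growing delays.

    `jitter`, `exceptions`, and `sleep` are illustrative, not project settings.
    """
    if retry_attempts < 1:
        raise ValueError("retry_attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, retry_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == retry_attempts:
                        raise  # out of attempts: surface the last error
                    # Doubles after each failure when exponential; else fixed.
                    factor = 2 ** (attempt - 1) if exponential_backoff else 1
                    delay = retry_delay * factor
                    # Vary the delay by up to +/- `jitter` (a fraction) so many
                    # clients do not retry in lockstep.
                    delay *= 1 + random.uniform(-jitter, jitter)
                    sleep(delay)
        return wrapper
    return decorator
```

Applied as `@retry_with_backoff(retry_attempts=3, retry_delay=5)` above a function, this waits roughly 5 and then 10 seconds between the three attempts. Injecting `sleep` keeps the decorator easy to test without real waiting.
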
{
  "error_handling": {
    "retry_attempts": 3,
    "retry_delay": 5,
    "exponential_backoff": true,
    "refresh_page_on_error": true,
    "skip_on_repeated_failure": true
  }
}

- JSON Format: Machine-readable log entries
- Operation Tracking: Context-aware logging with operation metadata
- Timed Operations: Automatic duration tracking
- File Rotation: Configurable log file size and backup management
- Department and document context in logs
- Operation duration tracking
- Structured error reporting
- Configurable log levels and output
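
The logging features above map onto the standard library fairly directly. The sketch below is one possible shape, not the project's `StructuredLogger`: `JsonFormatter`, `timed_operation`, and `build_file_handler` are hypothetical names, the `status` and `duration_seconds` fields are assumptions, and the rotation defaults mirror a 10 MB file with 5 backups.

```python
import json
import logging
import time
from contextlib import contextmanager
from logging.handlers import RotatingFileHandler


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Operation metadata travels on the record via extra={"context": {...}}.
        entry.update(getattr(record, "context", {}))
        if record.exc_info:
            entry["error"] = self.formatException(record.exc_info)
        # ensure_ascii=False keeps Cyrillic department names readable in the file.
        return json.dumps(entry, ensure_ascii=False)


@contextmanager
def timed_operation(logger, level, message, **context):
    """Log `message` with its duration and context once the block finishes."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        context["status"] = "failed"
        raise
    else:
        context["status"] = "ok"
    finally:
        context["duration_seconds"] = round(time.perf_counter() - start, 3)
        logger.log(getattr(logging, level.upper()), message,
                   extra={"context": context})


def build_file_handler(path="logs/app.log", max_bytes=10 * 1024 * 1024,
                       backup_count=5):
    """Rotating JSON handler; the target directory must already exist."""
    handler = RotatingFileHandler(path, maxBytes=max_bytes,
                                  backupCount=backup_count, encoding="utf-8")
    handler.setFormatter(JsonFormatter())
    return handler
```

Keeping the context in a single `extra` key avoids collisions with the attribute names that `logging.LogRecord` reserves for itself.
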
New fields for cloud storage tracking:
- local_path: Local file path
- cloud_status: Upload status
- cloud_file_id: Google Drive file ID
- download_timestamp: Download completion time
- upload_timestamp: Upload completion time
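
As an illustration of how these tracking fields might fit together, here is a hedged sketch using a dataclass. The field names come from the list above; the class name `DocumentRecord`, the field types, the status values, and the `mark_uploaded` helper are all assumptions.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DocumentRecord:
    """Cloud storage tracking fields; types and status values are assumed."""
    local_path: str
    cloud_status: str = "pending"              # assumed values: pending/uploaded/failed
    cloud_file_id: Optional[str] = None        # set once Google Drive returns an ID
    download_timestamp: Optional[str] = None   # ISO 8601 string, UTC
    upload_timestamp: Optional[str] = None     # ISO 8601 string, UTC

    def mark_uploaded(self, file_id: str) -> None:
        """Record a successful upload and stamp its completion time."""
        self.cloud_status = "uploaded"
        self.cloud_file_id = file_id
        self.upload_timestamp = datetime.now(timezone.utc).isoformat()


record = DocumentRecord(local_path="/path/to/document.docx")
record.mark_uploaded("example-file-id")
print(asdict(record)["cloud_status"])
```
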
downloader = DocumentDownloader(config)
downloader.setup_cloud_storage()
success = downloader.upload_to_cloud(
    file_path="/path/to/document.docx",
    department_name="Отдел кадров",  # "Human Resources Department"
    document_title="Должностная инструкция менеджера"  # "Manager Job Description"
)

error_handler = EnhancedErrorHandler(config)
result = error_handler.retry_with_backoff(
    risky_operation,
    arg1, arg2,
    keyword_arg="value"
)

structured_logger = StructuredLogger(config)
with structured_logger.timed_operation(
    logger, "info", "Processing document",
    operation="document_processing",
    department="HR",
    document_title="Job Description"
):
    # Process document
    pass

{
  "cloud_storage": {
    "default_provider": "google_drive",
    "create_folders_automatically": true,
    "check_duplicates": true,
    "credentials_path": "config/credentials.json",
    "root_folder_name": "Job Instructions",
    "cleanup_after_upload": false
  }
}

{
  "logging": {
    "file_path": "logs/app.log",
    "max_file_size": "10MB",
    "backup_count": 5,
    "console_output": true,
    "structured_logging": true,
    "detailed_logs": {
      "cloud_operations": true
    }
  }
}

Comprehensive test suite includes:
- Unit tests for all new components
- Integration tests for complete workflows
- Mock-based testing for external APIs
- Russian character handling tests
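
To make the mock-based and Russian-character items above concrete, here is a hedged pytest-style sketch. `upload_with_name` and the `client.upload` method are hypothetical stand-ins for whatever the project actually tests; the point is only the pattern of replacing the external API with a `MagicMock` and asserting that Cyrillic names pass through unchanged.

```python
from unittest.mock import MagicMock


def upload_with_name(client, department: str, title: str) -> str:
    """Hypothetical function under test: builds the remote name and uploads."""
    remote_name = f"{title}.docx"
    return client.upload(folder=department, name=remote_name)


def test_upload_preserves_cyrillic_names():
    # The mock stands in for the cloud client, so the test needs no network.
    client = MagicMock()
    client.upload.return_value = "file-id-123"

    file_id = upload_with_name(
        client, "Отдел кадров", "Должностная инструкция менеджера"
    )

    assert file_id == "file-id-123"
    client.upload.assert_called_once_with(
        folder="Отдел кадров",
        name="Должностная инструкция менеджера.docx",
    )
```
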
Run tests:
pytest job_instruction_downloader/tests/ -v

New dependencies added:
- google-api-python-client: Google Drive API
- google-auth-httplib2: Authentication
- google-auth-oauthlib: OAuth flow
- OAuth tokens stored securely in config/token.json
- Credentials file excluded from version control
- Minimal required API scopes (drive.file)
- Exponential backoff spaces out retries so they do not compound API rate-limit errors
- Duplicate checking reduces unnecessary uploads
- Structured logging optimized for performance
- Optional local file cleanup after upload
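
The duplicate check and automatic folder creation mentioned in this document could look roughly like the sketch below. It assumes a Drive v3 service object, such as one returned by `googleapiclient.discovery.build("drive", "v3", credentials=creds)`. The helper names `find_file_id` and `get_or_create_folder` are hypothetical, and the project's actual implementation may differ. Under the `drive.file` scope, the search only sees files the application itself created.

```python
FOLDER_MIME = "application/vnd.google-apps.folder"


def _escape(value: str) -> str:
    """Escape backslashes and single quotes for a Drive `q` string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def find_file_id(service, name, parent_id, mime_type=None):
    """Return the ID of a non-trashed child of `parent_id` named `name`, or None."""
    query = (
        f"name = '{_escape(name)}' and '{parent_id}' in parents "
        "and trashed = false"
    )
    if mime_type:
        query += f" and mimeType = '{mime_type}'"
    response = service.files().list(
        q=query, fields="files(id, name)", pageSize=1
    ).execute()
    files = response.get("files", [])
    return files[0]["id"] if files else None


def get_or_create_folder(service, name, parent_id="root"):
    """Reuse an existing folder when present; otherwise create it."""
    existing = find_file_id(service, name, parent_id, FOLDER_MIME)
    if existing:
        return existing
    body = {"name": name, "mimeType": FOLDER_MIME, "parents": [parent_id]}
    return service.files().create(body=body, fields="id").execute()["id"]
```

Calling `find_file_id(service, file_name, folder_id)` before an upload and skipping the upload when it returns an ID is one way to avoid re-sending a document that is already stored.
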