- Overview
- Core Components
- Job Executor
- Task Executor
- Task Registry and Task System
- Complete Execution Flow
- Key Concepts
- Configuration
## Overview

The task-manager-service module is a distributed job scheduling and execution system built on Spring Boot. It provides a robust framework for:
- Job Scheduling: Creating and managing jobs with priorities, retry logic, and scheduled execution times
- Task Execution: Executing business logic tasks in a controlled, transactional environment
- Concurrent Processing: Managing multiple jobs concurrently using thread pools
- Task Discovery: Auto-registering tasks using Spring's dependency injection and annotations
## Core Components

The `Job` entity represents a job in the system with the following key attributes:

- `jobId`: Unique identifier (UUID)
- `workerId`: ID of the worker thread assigned to execute the job
- `workerLockTime`: Timestamp when the job was locked to a worker
- `assignedTaskName`: Name of the task to execute (e.g., "ONE_OFF_TASK")
- `assignedTaskStartTime`: When the task should start executing
- `jobData`: JSON data payload for the task
- `retryAttemptsRemaining`: Number of retry attempts left
- `priority`: Priority level (1-10, where 1 is highest priority)
`ExecutionInfo` is a data transfer object that carries execution context:

- `jobData`: JSON payload
- `assignedTaskName`: Task name
- `assignedTaskStartTime`: Scheduled start time
- `executionStatus`: Current status (STARTED, INPROGRESS, COMPLETED)
- `shouldRetry`: Whether the task should be retried on failure
- `priority`: Optional priority (1-10)
`ExecutionService` is the service for creating new jobs:

- `executeWith(ExecutionInfo)`: Creates a new job from the `ExecutionInfo` and persists it to the database
## Job Executor

The JobExecutor is the scheduler/coordinator component that:
- Polls the database for unassigned jobs
- Assigns jobs to worker threads
- Manages a thread pool for concurrent execution
- Handles job locking to prevent duplicate execution
```java
@Component
public class JobExecutor {
    private ThreadPoolTaskExecutor executor;  // Thread pool for workers
    private JobService jobService;            // Database operations
    private TaskRegistry taskRegistry;        // Task lookup
}
```

- Uses the `@Scheduled` annotation to poll every 5 seconds (configurable)
- Queries the database for unassigned jobs (`worker_id IS NULL`)
- Orders by `priority` ASC (1 first), then by `assignedTaskStartTime` ASC
- Limits batch size (default: 50 jobs per poll)
- Core Pool Size: 5 threads (always active)
- Max Pool Size: 10 threads (can scale up)
- Queue Capacity: 100 jobs (queued before rejection)
- All configurable via `application.properties`
```
1. Poll Database
   └─> Query: SELECT * FROM jobs
              WHERE worker_id IS NULL
                AND assigned_task_start_time <= NOW()
              ORDER BY priority ASC, assigned_task_start_time ASC
              LIMIT batchSize

2. For Each Unassigned Job:
   ├─> Generate unique workerId (UUID)
   ├─> Lock job: SET worker_id = workerId, worker_lock_time = NOW()
   └─> Submit to thread pool: executor.execute(() -> executeJob(job))

3. Execute Job (in separate thread)
   └─> Create TaskExecutor and submit to thread pool
```
- If job assignment fails, decrements retry attempts
- Logs errors for monitoring
- Continues processing other jobs
All aspects of JobExecutor are configurable:

```properties
job.executor.poll-interval=5000               # Poll every 5 seconds
job.executor.core-pool-size=5                 # Core threads
job.executor.max-pool-size=10                 # Max threads
job.executor.queue-capacity=100               # Queue size
job.executor.batch-size=50                    # Jobs per poll
job.executor.thread-name-prefix=job-executor- # Thread names
job.executor.wait-for-tasks-on-shutdown=true  # Graceful shutdown
job.executor.await-termination-seconds=60     # Shutdown timeout
```

## Task Executor

The TaskExecutor is the worker component that:
- Executes individual tasks in a transactional context
- Handles task execution logic
- Manages retry mechanisms
- Updates job state based on execution results
```java
public class TaskExecutor implements Runnable {
    private Job job;                                 // Job to execute
    private TaskRegistry taskRegistry;               // Task lookup
    private JobService jobService;                   // Database operations
    private TransactionTemplate transactionTemplate; // Transaction management
}
```

```java
Optional<ExecutableTask> task = taskRegistry.getTask(job.getAssignedTaskName());
```

- Retrieves the task implementation from `TaskRegistry`
- Returns `Optional.empty()` if the task is not found
```java
if (isStartTimeOfTask(executionInfo)) {
    executeTask(task, executionInfo);
} else {
    releaseJob(jobId); // Too early, release for later
}
```

- Checks whether `assignedTaskStartTime <= now()`
- If too early, releases the job lock so the job can be picked up later
```java
transactionTemplate.execute(status -> {
    ExecutionInfo result = task.execute(executionInfo);
    // Process result...
    return result;
});
```

- The entire task execution runs in a database transaction
- Automatic rollback on exceptions
- Ensures data consistency
Status: INPROGRESS
- Task needs to continue (workflow scenario)
- Updates job with new task details
- Releases job lock for next execution cycle
Status: COMPLETED
- Task finished successfully
- Deletes job from database
Status: INPROGRESS + shouldRetry = true
- Task failed but should retry
- Checks if retries are available
- Schedules a retry using the task's configured backoff durations
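The three outcomes above can be condensed into a plain decision function. This is a simplified, framework-free sketch: the `ExecutionStatus` values come from this document, while `processResult` and the returned action labels are illustrative stand-ins for the real `JobService` calls.

```java
public class ResultProcessingSketch {
    enum ExecutionStatus { STARTED, INPROGRESS, COMPLETED }

    // Decide what happens to the job after a task run.
    // Returns an action label instead of calling the real JobService.
    static String processResult(ExecutionStatus status, boolean shouldRetry,
                                int retryAttemptsRemaining) {
        if (status == ExecutionStatus.COMPLETED) {
            return "DELETE_JOB";         // task finished, remove the job
        }
        if (status == ExecutionStatus.INPROGRESS && shouldRetry
                && retryAttemptsRemaining > 0) {
            return "SCHEDULE_RETRY";     // failed, retries left
        }
        return "UPDATE_AND_RELEASE";     // workflow continues next cycle
    }

    public static void main(String[] args) {
        System.out.println(processResult(ExecutionStatus.COMPLETED, false, 0));
        System.out.println(processResult(ExecutionStatus.INPROGRESS, true, 2));
        System.out.println(processResult(ExecutionStatus.INPROGRESS, false, 0));
    }
}
```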
A task is retried if:

- `executionResponse.isShouldRetry() == true`
- `retryAttemptsRemaining > 0`
- The task has `getRetryDurationsInSecs()` configured
```java
List<Long> retryDurations = List.of(10L, 20L, 30L); // seconds
int attemptsRemaining = 2;                          // 2 retries left
Long delay = retryDurations.get(retryDurations.size() - attemptsRemaining);
// delay = 20 seconds (second retry)
ZonedDateTime nextRetryTime = now().plusSeconds(delay);
updateNextTaskRetryDetails(jobId, nextRetryTime, attemptsRemaining - 1);
```

Example Retry Sequence:
- Attempt 1 fails → Wait 10 seconds → Retry
- Attempt 2 fails → Wait 20 seconds → Retry
- Attempt 3 fails → Wait 30 seconds → Retry
- Attempt 4 fails → No more retries → Continue workflow or fail
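The delay lookup that produces this sequence (index = list size minus attempts remaining) can be verified with a small self-contained sketch; `nextDelaySecs` is a hypothetical helper name, not the service's actual method:

```java
import java.util.List;

public class RetryDelaySketch {
    // Given the task's configured durations and the retries still remaining,
    // pick the delay for the next attempt (same indexing as shown above).
    static long nextDelaySecs(List<Long> retryDurations, int attemptsRemaining) {
        return retryDurations.get(retryDurations.size() - attemptsRemaining);
    }

    public static void main(String[] args) {
        List<Long> durations = List.of(10L, 20L, 30L);
        // attemptsRemaining counts down 3 -> 2 -> 1 as retries are consumed
        for (int remaining = 3; remaining >= 1; remaining--) {
            System.out.println(nextDelaySecs(durations, remaining));
        }
        // prints 10, 20, 30
    }
}
```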
- Transaction Rollback: Automatic on exceptions
- Job Lock Release: Ensures job can be retried
- Logging: Comprehensive error logging
- Graceful Degradation: Continues processing other jobs
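The error-handling shape described above can be sketched without Spring. `JobOps` below is a hypothetical stand-in for the real `JobService`, and the method names are illustrative; in the actual code the rollback happens via the surrounding transaction.

```java
public class ErrorHandlingSketch {
    // Minimal stand-in for the real JobService (hypothetical interface).
    interface JobOps {
        void releaseLock(String jobId);
        void decrementRetries(String jobId);
    }

    // Any exception releases the lock and consumes a retry attempt,
    // leaving the job schedulable again; the caller logs and moves on.
    static boolean runSafely(String jobId, Runnable taskBody, JobOps jobs) {
        try {
            taskBody.run();
            return true;
        } catch (RuntimeException e) {
            jobs.decrementRetries(jobId); // consume one retry attempt
            jobs.releaseLock(jobId);      // make the job schedulable again
            return false;
        }
    }

    public static void main(String[] args) {
        JobOps ops = new JobOps() {
            public void releaseLock(String jobId) { System.out.println("released " + jobId); }
            public void decrementRetries(String jobId) { System.out.println("decremented " + jobId); }
        };
        System.out.println(runSafely("job-1", () -> { throw new IllegalStateException("boom"); }, ops));
    }
}
```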
```java
public interface ExecutableTask {
    ExecutionInfo execute(ExecutionInfo executionInfo);

    default Optional<List<Long>> getRetryDurationsInSecs() {
        return Optional.empty(); // No retry by default
    }
}
```

```java
@Task("TASK_NAME")
@Component
public class MyTask implements ExecutableTask {
    @Override
    public ExecutionInfo execute(ExecutionInfo executionInfo) {
        // Task logic here
        return executionInfo().from(executionInfo)
                .withExecutionStatus(COMPLETED)
                .build();
    }
}
```

## Task Registry and Task System

The TaskRegistry is a central registry that:
- Auto-discovers tasks on application startup
- Maps task names to task implementations
- Provides task lookup for `TaskExecutor`
- Manages retry configuration
```java
@PostConstruct
public void autoRegisterTasks() {
    // Spring provides all ExecutableTask beans
    for (ExecutableTask taskProxy : taskBeanProxy) {
        Class<?> actualClass = AopUtils.getTargetClass(taskProxy);
        Task annotation = actualClass.getAnnotation(Task.class);
        if (annotation != null) {
            String taskName = annotation.value();
            taskProxyByNameMap.put(taskName, taskProxy);
        }
    }
}
```

How It Works:

1. Spring scans for all `@Component` beans implementing `ExecutableTask`
2. `TaskRegistry` inspects each bean for the `@Task` annotation
3. Extracts the task name from the annotation value
4. Registers the task in an internal `Map<String, ExecutableTask>`
```java
Optional<ExecutableTask> getTask(String taskName) {
    return Optional.ofNullable(taskProxyByNameMap.get(taskName));
}
```

```java
Integer findRetryAttemptsRemainingFor(String taskName) {
    return getTask(taskName)
        .map(task -> task.getRetryDurationsInSecs().map(List::size).orElse(0))
        .orElse(0);
}
```

Example:

- A task has `getRetryDurationsInSecs() = [10, 20, 30]`
- Returns `3` (the number of retry attempts available)
```java
@Task("ONE_OFF_TASK")
@Component
public class OneOffTask implements ExecutableTask {
    @Override
    public ExecutionInfo execute(ExecutionInfo executionInfo) {
        logger.info("Executing one-off task");

        // Process the task
        JsonObject jobData = executionInfo.getJobData();
        // ... business logic ...

        // Return completion status
        return executionInfo().from(executionInfo)
                .withExecutionStatus(COMPLETED)
                .build();
    }
    // No retry configuration = no retries
}
```

```java
@Task("ONE_OFF_TASK_WITH_RETRY")
@Component
public class OneOffTaskWithRetry implements ExecutableTask {
    @Override
    public ExecutionInfo execute(ExecutionInfo executionInfo) {
        logger.info("Executing task with retry capability");

        // Simulate failure
        return executionInfo().from(executionInfo)
                .withExecutionStatus(INPROGRESS)
                .withShouldRetry(true) // Signal retry needed
                .build();
    }

    @Override
    public Optional<List<Long>> getRetryDurationsInSecs() {
        return Optional.of(List.of(10L, 20L, 30L)); // 3 retry attempts
    }
}
```

## Complete Execution Flow

```
┌─────────────────────────────────────────────────────────────────┐
│                       Application Startup                       │
│                                                                 │
│  1. Spring scans for @Component beans                           │
│  2. TaskRegistry.autoRegisterTasks() discovers all tasks        │
│  3. Tasks registered in Map<String, ExecutableTask>             │
│  4. JobExecutor initializes thread pool                         │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                 Job Creation (ExecutionService)                 │
│                                                                 │
│  ExecutionService.executeWith(ExecutionInfo)                    │
│    ├─> Get retry attempts from TaskRegistry                     │
│    ├─> Create Job entity with priority                          │
│    └─> Persist to database                                      │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                Job Executor - Scheduled Polling                 │
│                                                                 │
│  @Scheduled(every 5 seconds)                                    │
│    ├─> Query: Unassigned jobs                                   │
│    │          ORDER BY priority ASC, start_time ASC             │
│    │          LIMIT batchSize                                   │
│    │                                                            │
│    ├─> For each job:                                            │
│    │     ├─> Generate workerId                                  │
│    │     ├─> Lock job (SET worker_id, worker_lock_time)         │
│    │     └─> Submit to thread pool                              │
│    │                                                            │
│    └─> Continue polling...                                      │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                  Task Executor - Worker Thread                  │
│                                                                 │
│  TaskExecutor.run()                                             │
│    ├─> Get task from TaskRegistry                               │
│    ├─> Check start time (isStartTimeOfTask)                     │
│    │                                                            │
│    └─> Transaction:                                             │
│          ├─> Execute task.execute(executionInfo)                │
│          │                                                      │
│          ├─> Process result:                                    │
│          │     ├─> COMPLETED → Delete job                       │
│          │     ├─> INPROGRESS → Update & release                │
│          │     └─> INPROGRESS + retry → Schedule retry          │
│          │                                                      │
│          └─> Commit or Rollback                                 │
└─────────────────────────────────────────────────────────────────┘
```
```java
// Client code
ExecutionInfo info = new ExecutionInfo(
    jobData, "ONE_OFF_TASK", now(), STARTED, false, 5 // priority 5
);
executionService.executeWith(info);
```

```java
// ExecutionService
Job job = new Job(jobId, jobData, "ONE_OFF_TASK",
                  startTime, null, null, retryAttempts, 5);
jobService.insertJob(job); // Persist to database
// Job persisted with assigned_task_name and assigned_task_start_time columns
```

```java
// JobExecutor.checkAndAssignJobs() - runs every 5 seconds
List<Job> jobs = jobService.getUnassignedJobs(50);
// Returns jobs ordered by: priority ASC, start_time ASC
```

```java
UUID workerId = UUID.randomUUID();
Job assignedJob = jobService.assignJobToWorker(jobId, workerId);
// Sets worker_id and worker_lock_time in database
```

```java
// TaskExecutor.run()
Optional<ExecutableTask> task = taskRegistry.getTask("ONE_OFF_TASK");
ExecutionInfo result = task.get().execute(executionInfo);

if (result.getExecutionStatus() == COMPLETED) {
    jobService.deleteJob(jobId); // Job finished
} else if (result.getExecutionStatus() == INPROGRESS) {
    if (canRetry(task, result)) {
        scheduleRetry(task);         // Retry with delay
    } else {
        updateJobAndRelease(result); // Continue workflow
    }
}
```

## Key Concepts

Job Locking:

- Purpose: Prevent multiple workers from executing the same job
- Mechanism: `worker_id` and `worker_lock_time` columns
- Process:
  - Job assigned → `worker_id` set, `worker_lock_time` = NOW()
  - Job completed/released → `worker_id` = NULL, `worker_lock_time` = NULL
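One standard way to make the lock step race-safe is a conditional `UPDATE` that only claims unassigned rows. This is a sketch using the column names from this document, not the service's confirmed SQL:

```sql
-- Claim the job only if no other worker holds it; 0 rows updated
-- means another worker won the race and this worker skips the job.
UPDATE jobs
   SET worker_id = :workerId,
       worker_lock_time = NOW()
 WHERE job_id = :jobId
   AND worker_id IS NULL;
```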
Priority System:

- Range: 1 (highest) to 10 (lowest)
- Ordering: Priority first, then start time
- Use Case: Critical jobs execute before normal jobs
Retry Mechanism:

- Configuration: Task provides `getRetryDurationsInSecs()`
- Execution: Task sets `shouldRetry = true` and `status = INPROGRESS`
- Scheduling: Backoff delays taken from the task's configured retry durations
- Tracking: `retryAttemptsRemaining` decrements on each retry
Transaction Management:

- Scope: Entire task execution runs in a transaction
- Rollback: Automatic on exceptions
- Isolation: Each task execution is isolated
Concurrent Processing:

- Thread Pool: Configurable size (5-10 threads)
- Queue: Buffers jobs before execution
- Isolation: Each job executes independently
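The pool semantics described here (5 core threads, up to 10, a 100-slot queue before rejection) mirror standard `ThreadPoolExecutor` behavior. A plain `java.util.concurrent` equivalent, shown as an illustration rather than the service's actual Spring configuration:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolSketch {
    // Run n no-op "jobs" through a pool shaped like the one described above:
    // 5 core threads, max 10, 100-slot queue. Returns how many executed.
    static int runJobs(int n) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5, 10, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(100));
        AtomicInteger executed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            pool.execute(() -> { executed.incrementAndGet(); done.countDown(); });
        }
        done.await();     // wait for every job to finish
        pool.shutdown();
        return executed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runJobs(20)); // prints 20
    }
}
```

Note that a `ThreadPoolExecutor` only grows beyond its core size once the queue is full, so with a 100-slot queue the extra max-pool threads are a last resort before rejection.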
Task Registration:

- Auto-Registration: Spring DI + `@PostConstruct`
- Annotation-Based: `@Task("NAME")` identifies tasks
- Lookup: Tasks are resolved by name at runtime from the registry map
## Configuration

```properties
# Job Executor Configuration
job.executor.poll-interval=5000               # Milliseconds
job.executor.core-pool-size=5                 # Core threads
job.executor.max-pool-size=10                 # Max threads
job.executor.queue-capacity=100               # Queue size
job.executor.batch-size=50                    # Jobs per poll
job.executor.thread-name-prefix=job-executor- # Thread naming
job.executor.wait-for-tasks-on-shutdown=true  # Graceful shutdown
job.executor.await-termination-seconds=60     # Shutdown timeout
```

The system uses PostgreSQL with:

- Table: `jobs`
- Columns include:
  - `assigned_task_name`: Name of the task to execute (TEXT)
  - `assigned_task_start_time`: Scheduled start time (TIMESTAMP WITH TIME ZONE)
- Indexes: On `worker_id`, `assigned_task_start_time`, `priority`
- Flyway: Schema management via migration scripts
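The table and indexes described above could be expressed in a Flyway migration roughly as follows. This is a sketch: the column names come from this document, but the column types, constraints, and index names are assumptions:

```sql
CREATE TABLE jobs (
    job_id                   UUID PRIMARY KEY,
    job_data                 JSONB,
    assigned_task_name       TEXT NOT NULL,
    assigned_task_start_time TIMESTAMP WITH TIME ZONE NOT NULL,
    worker_id                UUID,
    worker_lock_time         TIMESTAMP WITH TIME ZONE,
    retry_attempts_remaining INTEGER,
    priority                 INTEGER NOT NULL
);

-- Indexes named in this document (index names assumed)
CREATE INDEX idx_jobs_worker_id  ON jobs (worker_id);
CREATE INDEX idx_jobs_start_time ON jobs (assigned_task_start_time);
CREATE INDEX idx_jobs_priority   ON jobs (priority);
```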
The task-manager-service module provides a robust, scalable job scheduling and execution framework:
- JobExecutor: Schedules and coordinates job execution
- TaskExecutor: Executes tasks in transactional context
- TaskRegistry: Auto-discovers and manages task implementations
- ExecutableTask: Interface for business logic tasks
- Priority System: Ensures high-priority jobs execute first
- Retry Logic: Handles failures with configurable backoff
- Concurrent Processing: Thread pool for parallel execution
- Transaction Safety: ACID compliance for job execution
The architecture is designed for:
- Scalability: Thread pool and batch processing
- Reliability: Transactions, retries, and error handling
- Flexibility: Configurable via properties
- Maintainability: Clear separation of concerns