
15.5 Practical Integration of Spring Batch and Quartz Scheduling

Spring's built-in @Scheduled is sufficient for simple periodic tasks on a single server. In a distributed environment (multi-server clustering), Quartz Scheduler is the more suitable choice: it can prevent duplicate execution across nodes, handle misfires, and persist schedules so that they survive restarts.

In enterprise environments, Spring Batch applications handling large-scale data are almost always integrated with a scheduler like Quartz, Jenkins, or Airflow. Here, we will explore the most representative and practical configuration: Spring Batch + Quartz + DB Clustering.

1. Quartz Dependencies and Database Configuration

1.1 Add Dependency (build.gradle)

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-batch'
    implementation 'org.springframework.boot:spring-boot-starter-quartz' // Add Quartz starter
}

1.2 application.yml Configuration (JDBC JobStore)

We configure Quartz to store schedule information in a database (JDBC JobStore) rather than in memory (RAM). Because every server reads and writes the same tables, the application can be scaled out as a cluster without the same job being executed on several servers at once.

spring:
  quartz:
    job-store-type: jdbc # Use DB-backed scheduling over memory-backed
    jdbc:
      initialize-schema: never # Recommended: Manually execute Quartz SQL scripts once rather than using 'always'
    properties:
      org.quartz.scheduler.instanceName: MyBatchScheduler # Must be identical across all servers in the cluster
      org.quartz.scheduler.instanceId: AUTO # Automatically generates a unique ID per node
      org.quartz.jobStore.isClustered: true # Enable multi-server clustering
      org.quartz.jobStore.clusterCheckinInterval: 20000 # How often (ms) each node checks in with the cluster
      org.quartz.jobStore.driverDelegateClass: org.quartz.impl.jdbcjobstore.StdJDBCDelegate # For MySQL/MariaDB (Use PostgreSQLDelegate for PostgreSQL)

2. Job and Trigger Configuration

The core of Quartz lies in the JobDetail (what to execute) and Trigger (when to execute). We need a QuartzJobBean implementation acting as a bridge to launch a Spring Batch Job.

2.1 Batch Job Executor (QuartzJobBean Implementation)

Quartz invokes this class at the scheduled time, and the class in turn launches the Spring Batch job programmatically through JobLauncher.

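import java.time.LocalDate;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.configuration.JobLocator;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.scheduling.quartz.QuartzJobBean;
import org.springframework.stereotype.Component;

@Slf4j
@Component
@RequiredArgsConstructor
public class DailyReportQuartzJob extends QuartzJobBean {

    private final JobLauncher jobLauncher;
    private final JobLocator jobLocator; // Locates the registered Batch Job by its name

    @Override
    protected void executeInternal(JobExecutionContext context) throws JobExecutionException {
        try {
            // 1. Retrieve the Batch Job name (passed through the JobDataMap in the JobDetail config)
            String jobName = context.getMergedJobDataMap().getString("jobName");
            Job job = jobLocator.getJob(jobName);

            // 2. Create Batch Job Parameters (a unique timestamp makes each launch a new
            //    JobInstance and avoids JobInstanceAlreadyCompleteException)
            JobParameters params = new JobParametersBuilder()
                    .addString("jobID", String.valueOf(System.currentTimeMillis()))
                    .addString("requestDate", LocalDate.now().toString())
                    .toJobParameters();

            // 3. Launch the Batch Job
            log.info("Starting Batch Job: {}", jobName);
            JobExecution jobExecution = jobLauncher.run(job, params);
            log.info("Batch Job {} finished with status: {}", jobName, jobExecution.getStatus());

        } catch (Exception e) {
            log.error("Failed to execute Batch Job via Quartz", e);
            throw new JobExecutionException("Quartz Job execution failed", e);
        }
    }
}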
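Why the job parameters need a unique value can be illustrated with a small model. Spring Batch identifies a JobInstance by the job name plus its identifying parameters, and it refuses to re-run an instance that has already completed. The sketch below is a plain-Java toy model of that rule, not Spring Batch code; the class JobInstanceRegistrySketch and its methods are hypothetical names introduced only for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model (hypothetical, not Spring Batch code) of how a job repository
// tells job instances apart: by job name plus identifying parameters.
class JobInstanceRegistrySketch {

    private final Set<String> completedInstances = new HashSet<>();

    // Builds a stable key; TreeMap sorts the parameters so their order does not matter.
    static String instanceKey(String jobName, Map<String, String> params) {
        return jobName + new TreeMap<>(params);
    }

    // Returns true if the launch is accepted, false if the same instance already completed.
    boolean launch(String jobName, Map<String, String> params) {
        return completedInstances.add(instanceKey(jobName, params));
    }

    public static void main(String[] args) {
        JobInstanceRegistrySketch repo = new JobInstanceRegistrySketch();
        Map<String, String> dateOnly = Map.of("requestDate", "2024-01-01");

        System.out.println(repo.launch("dailyReportBatchJob", dateOnly)); // true
        System.out.println(repo.launch("dailyReportBatchJob", dateOnly)); // false: same instance

        // Adding a unique value (like the timestamp-based jobID) creates a new instance.
        Map<String, String> withId = Map.of("requestDate", "2024-01-01", "jobID", "1704067200000");
        System.out.println(repo.launch("dailyReportBatchJob", withId));   // true
    }
}
```

One design consequence worth keeping in mind: because the timestamp makes every launch a new JobInstance, a failed run is not restarted as the same instance on the next trigger. If restartability matters, the identifying parameters would need to stay stable across retries (for example, only requestDate).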

2.2 Registering the Quartz Scheduler Beans

The final step is to register the DailyReportQuartzJob to run every day at 2:00 AM using a cron expression.

import java.util.HashMap;
import java.util.Map;

import org.quartz.JobDetail;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.quartz.CronTriggerFactoryBean;
import org.springframework.scheduling.quartz.JobDetailFactoryBean;

@Configuration
public class QuartzConfig {

    public static final String DAILY_REPORT_JOB_NAME = "dailyReportBatchJob";

    // 1. JobDetail: Defines the exact class to execute.
    @Bean
    public JobDetailFactoryBean dailyReportJobDetail() {
        JobDetailFactoryBean jobDetailFactory = new JobDetailFactoryBean();
        jobDetailFactory.setJobClass(DailyReportQuartzJob.class);
        jobDetailFactory.setDescription("Batch Job that generates daily reports at 2 AM");
        jobDetailFactory.setDurability(true); // Keeps the Job in DB even if no Trigger points to it

        // Pass the actual Spring Batch Job name through JobDataMap
        Map<String, Object> jobDataMap = new HashMap<>();
        jobDataMap.put("jobName", DAILY_REPORT_JOB_NAME);
        jobDetailFactory.setJobDataAsMap(jobDataMap);

        return jobDetailFactory;
    }

    // 2. Trigger: Defines when to execute (Cron Expression).
    @Bean
    public CronTriggerFactoryBean dailyReportJobTrigger(@Qualifier("dailyReportJobDetail") JobDetail jobDetail) {
        CronTriggerFactoryBean trigger = new CronTriggerFactoryBean();
        trigger.setJobDetail(jobDetail);
        trigger.setCronExpression("0 0 2 * * ?"); // Every day at 02:00:00 AM
        trigger.setDescription("Daily report batch trigger");
        return trigger;
    }
}
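A Quartz cron expression starts with a seconds field, so "0 0 2 * * ?" reads as second 0, minute 0, hour 2, every day of the month, every month, with no specific day of the week. To make the resulting schedule concrete, the following plain-Java sketch computes the next 02:00 after a given moment. It is only an illustration of this one expression, not Quartz's cron engine; the class name DailyTwoAmSketch is hypothetical, and daylight-saving transitions are not handled the way Quartz handles them.

```java
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Illustration only: what "0 0 2 * * ?" means in terms of the next fire time.
// This is not Quartz code and covers only this single expression.
class DailyTwoAmSketch {

    // Returns the first 02:00:00 strictly after 'now', in the zone carried by 'now'.
    static ZonedDateTime nextFireTime(ZonedDateTime now) {
        ZonedDateTime todayAtTwo = now.with(LocalTime.of(2, 0));
        return todayAtTwo.isAfter(now) ? todayAtTwo : todayAtTwo.plusDays(1);
    }

    public static void main(String[] args) {
        ZoneId utc = ZoneId.of("UTC");
        ZonedDateTime earlyMorning = ZonedDateTime.of(2024, 3, 10, 1, 30, 0, 0, utc);
        ZonedDateTime afternoon = ZonedDateTime.of(2024, 3, 10, 14, 0, 0, 0, utc);

        System.out.println(nextFireTime(earlyMorning)); // 02:00 on the same day
        System.out.println(nextFireTime(afternoon));    // 02:00 on the following day
    }
}
```

The sketch takes the time zone from its input. With CronTriggerFactoryBean, the trigger uses the JVM default time zone unless one is set explicitly via setTimeZone, which is worth checking when servers run in different regions.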

3. Principles of Duplicate Execution Prevention in Clustered Environments

When Quartz runs on several redundant servers, every node would fire the batch at 2:00 AM and process the same data more than once unless the nodes coordinate with each other. The mechanism that prevents this, and the configuration it depends on, are described below.

3.1 Preemption via Database Row Locks

With the JDBC JobStore, Quartz keeps its shared state in the database instead of RAM, and the QRTZ_LOCKS table (one of the 11 Quartz schema tables) serves as the cluster-wide mutex. Before a node may acquire a trigger that is due to fire, it takes a pessimistic row lock on QRTZ_LOCKS (SELECT ... FOR UPDATE). The first node to obtain the lock marks the trigger as acquired and takes ownership of that firing. The other nodes block until the lock is released, then find that the trigger has already been taken and skip that firing. As long as every node uses the same database and has clustering enabled, each firing is handed to exactly one node.
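The race described above can be sketched in plain Java. In this in-memory simulation a ReentrantLock stands in for the database row lock and each thread plays the role of one cluster node. It is a simplified model under those assumptions, not Quartz's implementation; the class TriggerAcquisitionSketch and its members are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// In-memory model of cluster nodes racing for one trigger firing.
// The ReentrantLock stands in for the row lock taken with SELECT ... FOR UPDATE.
class TriggerAcquisitionSketch {

    enum TriggerState { WAITING, ACQUIRED }

    private final ReentrantLock rowLock = new ReentrantLock();
    private TriggerState state = TriggerState.WAITING; // guarded by rowLock

    // Returns true only for the node that wins this firing.
    boolean tryAcquireTrigger() {
        rowLock.lock(); // blocks like a pessimistic row lock
        try {
            if (state != TriggerState.WAITING) {
                return false; // another node already owns this firing, so skip it
            }
            state = TriggerState.ACQUIRED;
            return true;
        } finally {
            rowLock.unlock(); // comparable to committing the transaction
        }
    }

    // Starts nodeCount "nodes" at the same moment and counts how many fire the job.
    static int simulate(int nodeCount) {
        TriggerAcquisitionSketch sharedStore = new TriggerAcquisitionSketch();
        AtomicInteger firedCount = new AtomicInteger();
        CountDownLatch startSignal = new CountDownLatch(1);
        ExecutorService nodes = Executors.newFixedThreadPool(nodeCount);
        try {
            for (int i = 0; i < nodeCount; i++) {
                nodes.execute(() -> {
                    try {
                        startSignal.await(); // all nodes wake up together
                        if (sharedStore.tryAcquireTrigger()) {
                            firedCount.incrementAndGet();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            startSignal.countDown();
            nodes.shutdown();
            if (!nodes.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return firedCount.get();
    }

    public static void main(String[] args) {
        System.out.println("Nodes that fired the job: " + simulate(3)); // prints 1
    }
}
```

The key point is that the check of the trigger state and the state change happen while the lock is held, so no two nodes can both observe WAITING.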

3.2 Mandatory Configurations and Annotations

// Mandatory addition at the top of the QuartzJobBean implementation
@DisallowConcurrentExecution
@Slf4j
@Component
@RequiredArgsConstructor
public class DailyReportQuartzJob extends QuartzJobBean {
// ... existing code ...
}
  • @DisallowConcurrentExecution: Prevents multiple instances of the same JobDetail from executing concurrently. If the 2:00 AM batch is still running at 2:05 AM and a new trigger fires, the new execution is held back until the current one finishes, whether it succeeds or fails.
  • Time Synchronization (NTP): All servers forming the cluster must keep their system clocks synchronized via NTP or a similar time service. Quartz clustering relies on the nodes agreeing on the current time, so clock drift between servers can lead to erratic behavior (the deviation should stay below 1 second).
  • Identical instanceName: As seen in the application.yml config, all cluster nodes must share the exact same org.quartz.scheduler.instanceName to be grouped into a single logical cluster. Conversely, instanceId must be unique per node; setting it to AUTO lets Quartz generate a unique value and avoids conflicts.

4. Practical Architecture Tips (Pro Tip)

  1. Beware of Schema Initialization: Using spring.quartz.jdbc.initialize-schema: always will drop and recreate the Quartz tables every time your server restarts, obliterating your scheduled data. In production, always set this to never and run the official SQL schema script for your database (shipped with the Quartz distribution) manually, once.
  2. Clustered Environment Assurances: Thanks to the isClustered: true option, even if you have 3 instances of your API server running simultaneously, the Quartz lock mechanism ensures the 2 AM batch runs on exactly one node, avoiding duplicate processing.
  3. Misfire Handling: What happens if your server crashes right at 1:59 AM? If no other node fires the trigger and the server comes back only after the misfire threshold (60 seconds by default) has passed, Quartz treats the missed firing as a "Misfire". You can configure a misfire instruction that dictates whether Quartz should run the missed batch immediately (withMisfireHandlingInstructionFireAndProceed) or skip it and wait for the next scheduled time (withMisfireHandlingInstructionDoNothing). Understanding and configuring misfire policies is critical for robust enterprise systems.
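The misfire decision can be modeled in a few lines. The sketch below is a plain-Java toy model, not Quartz code: the class MisfireSketch and its enums are hypothetical names. It assumes the two policies discussed above, which correspond to the Quartz constants CronTrigger.MISFIRE_INSTRUCTION_FIRE_ONCE_NOW and CronTrigger.MISFIRE_INSTRUCTION_DO_NOTHING (settable on a CronTriggerFactoryBean via setMisfireInstruction), and a threshold like the one configured through org.quartz.jobStore.misfireThreshold.

```java
import java.time.Duration;
import java.time.Instant;

// Toy model (hypothetical names, not Quartz code) of how a missed firing is handled.
class MisfireSketch {

    enum Policy { FIRE_ONCE_NOW, DO_NOTHING }

    enum Action { FIRE_NORMALLY, FIRE_IMMEDIATELY, SKIP_TO_NEXT }

    static Action decide(Instant scheduledFireTime, Instant now,
                         Duration misfireThreshold, Policy policy) {
        Duration lateBy = Duration.between(scheduledFireTime, now);
        if (lateBy.compareTo(misfireThreshold) <= 0) {
            // Within the threshold: not a misfire, the trigger simply fires (slightly late).
            return Action.FIRE_NORMALLY;
        }
        // Beyond the threshold: the configured misfire policy decides.
        return policy == Policy.FIRE_ONCE_NOW ? Action.FIRE_IMMEDIATELY : Action.SKIP_TO_NEXT;
    }

    public static void main(String[] args) {
        Instant twoAm = Instant.parse("2024-03-10T02:00:00Z");
        Duration threshold = Duration.ofSeconds(60);

        // Server back 30 seconds late: still a normal firing.
        System.out.println(decide(twoAm, twoAm.plusSeconds(30), threshold, Policy.FIRE_ONCE_NOW));
        // Server back 10 minutes late: the policy decides.
        System.out.println(decide(twoAm, twoAm.plusSeconds(600), threshold, Policy.FIRE_ONCE_NOW));
        System.out.println(decide(twoAm, twoAm.plusSeconds(600), threshold, Policy.DO_NOTHING));
    }
}
```

For a daily report, running the missed batch once on recovery is often the safer default, while skipping may be preferable when a late run would overlap with business hours. Which one fits is a judgment call for each job.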