Pro Tips — Multi-threading and Parallel Processing for Higher Batch Throughput
Spring Batch runs each Step on a single thread by default, so a job over 500 million records can take hours. The three parallelization strategies below raise batch throughput, in increasing order of setup effort.
1. Multi-threaded Step (Basic Multithreading)
The simplest place to start: pass a TaskExecutor to the Step so that each chunk (read, process, write) runs on its own thread.
@Bean
public Step parallelChunkStep() {
    return new StepBuilder("parallelChunkStep", jobRepository)
            .<Coupon, Coupon>chunk(1000, transactionManager)
            .reader(couponItemReader()) // ★ Must be thread-safe (JdbcPagingItemReader recommended)
            .processor(couponItemProcessor())
            .writer(couponItemWriter())
            .taskExecutor(new SimpleAsyncTaskExecutor()) // Starts a new thread for each chunk
            .throttleLimit(8) // Max concurrent threads (deprecated since Spring Batch 5.0 in favor of a bounded TaskExecutor)
            .build();
}
Warning: In a multi-threaded Step, the ItemReader MUST be thread-safe. JdbcCursorItemReader is NOT thread-safe, so either wrap it in SynchronizedItemStreamReader or use the thread-safe JdbcPagingItemReader.
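To make the thread-safety requirement concrete, here is a minimal plain-Java sketch of the idea behind a synchronizing wrapper: every call to `read()` on a non-thread-safe delegate goes through one lock. The names `SimpleReader`, `IteratorReader`, `SynchronizedReader`, and `SynchronizedReaderSketch` are hypothetical and exist only for this illustration; this is not Spring Batch's own implementation, just the same general pattern.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical reader contract for this sketch: returns null once input is exhausted. */
interface SimpleReader<T> {
    T read();
}

/** Not thread-safe: the Iterator's state is mutated without any locking. */
class IteratorReader<T> implements SimpleReader<T> {
    private final Iterator<T> iterator;

    IteratorReader(Iterator<T> iterator) {
        this.iterator = iterator;
    }

    @Override
    public T read() {
        return iterator.hasNext() ? iterator.next() : null;
    }
}

/** Serializes access to the delegate so only one thread reads at a time. */
class SynchronizedReader<T> implements SimpleReader<T> {
    private final SimpleReader<T> delegate;

    SynchronizedReader(SimpleReader<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T read() {
        return delegate.read();
    }
}

class SynchronizedReaderSketch {
    /** Drains the reader from several threads and returns everything that was read. */
    static <T> List<T> drainConcurrently(SimpleReader<T> reader, int threads) {
        List<T> seen = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                T item;
                while ((item = reader.read()) != null) {
                    seen.add(item);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            ids.add(i);
        }
        SimpleReader<Integer> reader =
                new SynchronizedReader<>(new IteratorReader<>(ids.iterator()));
        List<Integer> seen = drainConcurrently(reader, 8);
        System.out.println("items read: " + seen.size());
    }
}
```

With the wrapper in place, each item is handed to exactly one thread. The trade-off is that reads are serialized, so the speedup comes only from running the processor and writer in parallel.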
2. Parallel Steps (Concurrent Step Execution)
Use Flow and split() to run independent Steps simultaneously.
@Bean
public Job parallelStepsJob() {
    Flow flow1 = new FlowBuilder<SimpleFlow>("flow1")
            .start(processEmailsStep())
            .build();

    Flow flow2 = new FlowBuilder<SimpleFlow>("flow2")
            .start(processSmsStep())
            .build();

    Flow parallelFlow = new FlowBuilder<SimpleFlow>("parallelFlow")
            .split(new SimpleAsyncTaskExecutor())
            .add(flow1, flow2) // Run both flows in parallel
            .build();

    return new JobBuilder("parallelStepsJob", jobRepository)
            .start(parallelFlow)
            .end()
            .build();
}
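The fork-and-join behavior of a split can be pictured with plain Java concurrency: all branches start together, and nothing after the split runs until every branch has finished. The sketch below uses `CompletableFuture` from the JDK rather than Spring Batch itself; `SplitSketch` and `runSplit` are hypothetical names for this illustration, and the two lambdas stand in for the email and SMS flows.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SplitSketch {
    /** Runs every flow on the executor and blocks until all of them have finished. */
    static void runSplit(ExecutorService executor, List<Runnable> flows) {
        CompletableFuture<?>[] futures = flows.stream()
                .map(flow -> CompletableFuture.runAsync(flow, executor))
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(futures).join();
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        List<String> log = new CopyOnWriteArrayList<>();
        try {
            List<Runnable> flows = List.of(
                    () -> log.add("emails done"),
                    () -> log.add("sms done"));
            runSplit(executor, flows);
            // Reached only after both branches have completed.
            log.add("next step");
        } finally {
            executor.shutdown();
        }
        System.out.println(log);
    }
}
```

The order of the two branch entries can vary between runs, but "next step" always comes last, which is why the approach only suits steps that do not depend on each other's output.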
3. Partitioning (Split Data Range then Process in Parallel)
The most scalable of the three techniques: a Partitioner divides the full dataset into partitions (for example, ID ranges), and a separate execution of the Worker Step processes each partition independently.
@Bean
public Step partitionedStep(Step workerStep) {
    return new StepBuilder("partitionedStep", jobRepository)
            // ColumnRangePartitioner is a custom Partitioner implementation that splits by ID range
            .partitioner("workerStep", new ColumnRangePartitioner())
            .step(workerStep)
            .gridSize(10) // Divide into 10 partitions
            .taskExecutor(new SimpleAsyncTaskExecutor())
            .build();
}
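The snippet above does not show what `ColumnRangePartitioner` does internally, so the following is only a sketch of one common way to split an ID range. It assumes the minimum and maximum IDs are already known (in practice they would come from a query) and uses hypothetical names `IdRangeSketch`, `IdRange`, and `split`. In Spring Batch, a `Partitioner` returns one `ExecutionContext` per partition; this plain-Java version returns simple range objects instead so it can run without the framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IdRangeSketch {
    /** Inclusive ID range handed to one worker. */
    static final class IdRange {
        final long minId;
        final long maxId;

        IdRange(long minId, long maxId) {
            this.minId = minId;
            this.maxId = maxId;
        }

        @Override
        public String toString() {
            return "[" + minId + ", " + maxId + "]";
        }
    }

    /**
     * Splits [minId, maxId] into at most gridSize contiguous, non-overlapping ranges.
     * Fewer ranges are produced when there are fewer IDs than gridSize.
     */
    static Map<String, IdRange> split(long minId, long maxId, int gridSize) {
        if (gridSize <= 0 || maxId < minId) {
            throw new IllegalArgumentException("gridSize must be positive and maxId >= minId");
        }
        long targetSize = (maxId - minId) / gridSize + 1;
        Map<String, IdRange> partitions = new LinkedHashMap<>();
        int number = 0;
        long start = minId;
        while (start <= maxId) {
            long end = Math.min(start + targetSize - 1, maxId);
            partitions.put("partition" + number, new IdRange(start, end));
            start = end + 1;
            number++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        split(1, 100, 10).forEach((name, range) -> System.out.println(name + " -> " + range));
    }
}
```

Each worker would then read only the rows whose ID falls inside its range. Note that equal-width ID ranges give equal workloads only when IDs are evenly distributed; gaps in the ID sequence leave some partitions with far fewer rows than others.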
| Strategy | Characteristics | Best For |
|---|---|---|
| Multi-threaded Step | Parallel chunk processing | Single data source, fast to implement |
| Parallel Steps | Simultaneous independent steps | Independent tasks like email+SMS |
| Partitioning | Split data range then parallelize | Hundreds of millions of records |