
15.1 Spring Batch Architecture: Tasklet and Chunk-Oriented Processing

Spring Batch provides a structured, controlled way to run large-scale data processing jobs such as month-end settlements, large CSV parsing, or mass email sending.

1. Core Architecture Overview

Spring Batch's overall flow follows: Job → Step → Chunk (Read → Process → Write).

Job
├── Step 1: Chunk-oriented processing
│   └── [chunk=1000]
│       ├── ItemReader    : Read 1000 items at a time from a DB/file
│       ├── ItemProcessor : Transform/filter each item
│       └── ItemWriter    : Write 1000 items to a DB/file
└── Step 2: Tasklet-based processing (simple single-execution logic)
  • Job: The entire batch work unit, uniquely identified by name.
  • Step: The stage within a Job that performs actual processing.
  • Chunk-Oriented Processing: Read, process, and write items in batches of n. Each chunk commits in its own transaction, so if a million-record job fails partway through, the chunks that already committed are preserved.
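The Job → Step → Chunk structure above maps directly to bean definitions. Below is a minimal sketch using the Spring Batch 5 builder API; the `Order`/`Settlement` item types, bean names, and the assumption that reader/processor/writer beans exist elsewhere are all illustrative:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class SettlementJobConfig {

    // Job: the entire batch unit, uniquely identified by its name
    @Bean
    public Job settlementJob(JobRepository jobRepository, Step settlementStep) {
        return new JobBuilder("settlementJob", jobRepository)
                .start(settlementStep)
                .build();
    }

    // Step: reads, processes, and writes 1000 items per chunk,
    // committing one transaction per chunk
    @Bean
    public Step settlementStep(JobRepository jobRepository,
                               PlatformTransactionManager transactionManager,
                               ItemReader<Order> reader,
                               ItemProcessor<Order, Settlement> processor,
                               ItemWriter<Settlement> writer) {
        return new StepBuilder("settlementStep", jobRepository)
                .<Order, Settlement>chunk(1000, transactionManager)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}
```

Note that the chunk size (1000 here) is a tuning knob: larger chunks mean fewer commits but more memory held per transaction and more work lost on a chunk failure.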

2. Basic Setup and JobRepository

implementation 'org.springframework.boot:spring-boot-starter-batch'

Spring Batch automatically records job execution history (success/failure/restart information) in dedicated metadata tables, accessed through the JobRepository. Spring Boot can create these tables on startup (spring.batch.jdbc.initialize-schema: always).

spring:
  batch:
    jdbc:
      initialize-schema: always # Auto-create metadata tables on first run
    job:
      enabled: false # Prevent auto-run on startup (trigger via scheduler or API instead)
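With auto-run disabled, something must launch the job explicitly. A sketch of a scheduler-driven trigger via JobLauncher follows; the job bean name, the cron expression, and the presence of @EnableScheduling elsewhere in the application are assumptions:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class SettlementJobScheduler {

    private final JobLauncher jobLauncher;
    private final Job settlementJob; // hypothetical job bean defined elsewhere

    public SettlementJobScheduler(JobLauncher jobLauncher, Job settlementJob) {
        this.jobLauncher = jobLauncher;
        this.settlementJob = settlementJob;
    }

    @Scheduled(cron = "0 0 2 * * *") // every day at 02:00
    public void runSettlementJob() throws Exception {
        // A fresh parameter value makes each run a new JobInstance
        // in the metadata tables (re-running identical parameters is rejected)
        JobParameters params = new JobParametersBuilder()
                .addLong("runAt", System.currentTimeMillis())
                .toJobParameters();
        jobLauncher.run(settlementJob, params);
    }
}
```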

3. Simple Step with Tasklet

For single-result logic (e.g., deleting temp files, sending a notification) rather than chunk processing, implement the Tasklet interface.

@Slf4j // Lombok generates the 'log' field used below
@Component
public class CleanupTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        log.info("Running temporary file cleanup...");
        // File deletion logic goes here
        return RepeatStatus.FINISHED; // Done (returning CONTINUABLE would run the tasklet again)
    }
}
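To use the tasklet, it still has to be wrapped in a Step. A minimal sketch with the Spring Batch 5 StepBuilder (the configuration class and step name are illustrative):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class CleanupStepConfig {

    // Wraps CleanupTasklet in a Step so it can be added to a Job,
    // e.g. as a cleanup stage after a chunk-oriented step
    @Bean
    public Step cleanupStep(JobRepository jobRepository,
                            PlatformTransactionManager transactionManager,
                            CleanupTasklet cleanupTasklet) {
        return new StepBuilder("cleanupStep", jobRepository)
                .tasklet(cleanupTasklet, transactionManager)
                .build();
    }
}
```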