
15.1 Spring Batch Architecture: Tasklet and Chunk-Oriented Processing

Spring Batch provides a structured, controlled way to run large-scale data processing jobs such as month-end settlements, large CSV parsing, or mass email sending.

1. Core Architecture Overview

Spring Batch's overall flow follows: Job → Step → Chunk (Read → Process → Write).

Job
├── Step 1: Chunk-oriented processing
│   └── [chunk=1000]
│       ├── ItemReader    : Read 1000 items at a time from a DB/file
│       ├── ItemProcessor : Transform/filter each item
│       └── ItemWriter    : Write 1000 items to a DB/file
└── Step 2: Tasklet-based processing (simple single-execution logic)
  • Job: The entire batch work unit, uniquely identified by name.
  • Step: The stage within a Job that performs actual processing.
  • Chunk-Oriented Processing: Read, process, and write items in batches of n. Each chunk commits in its own transaction, so if a million-record job fails partway through, the chunks that already committed are preserved.
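The Job → Step → Chunk structure above maps directly to bean definitions. Below is a minimal sketch using the Spring Batch 5 builder API; the `Order`/`Settlement` item types, bean names, and the assumption that reader/processor/writer beans exist elsewhere are all illustrative:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class SettlementJobConfig {

    // Job: the entire batch unit, uniquely identified by its name
    @Bean
    public Job settlementJob(JobRepository jobRepository, Step settlementStep) {
        return new JobBuilder("settlementJob", jobRepository)
                .start(settlementStep)
                .build();
    }

    // Step: reads, processes, and writes 1000 items per chunk,
    // committing one transaction per chunk
    @Bean
    public Step settlementStep(JobRepository jobRepository,
                               PlatformTransactionManager transactionManager,
                               ItemReader<Order> reader,
                               ItemProcessor<Order, Settlement> processor,
                               ItemWriter<Settlement> writer) {
        return new StepBuilder("settlementStep", jobRepository)
                .<Order, Settlement>chunk(1000, transactionManager)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}
```

Note that the chunk size (1000 here) is a tuning knob: larger chunks mean fewer commits but more memory held per transaction and more work lost on a chunk failure.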

2. Basic Setup and JobRepository

implementation 'org.springframework.boot:spring-boot-starter-batch'

Spring Batch automatically records job execution history (success/failure/restart information) in dedicated metadata tables, accessed through the JobRepository. Spring Boot can create these tables on startup (spring.batch.jdbc.initialize-schema: always).

spring:
  batch:
    jdbc:
      initialize-schema: always # Auto-create metadata tables on first run
    job:
      enabled: false # Prevent auto-run on startup (trigger via scheduler or API instead)
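With auto-run disabled, something must launch the job explicitly. A sketch of a scheduler-driven trigger via JobLauncher follows; the job bean name, the cron expression, and the presence of @EnableScheduling elsewhere in the application are assumptions:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class SettlementJobScheduler {

    private final JobLauncher jobLauncher;
    private final Job settlementJob; // hypothetical job bean defined elsewhere

    public SettlementJobScheduler(JobLauncher jobLauncher, Job settlementJob) {
        this.jobLauncher = jobLauncher;
        this.settlementJob = settlementJob;
    }

    @Scheduled(cron = "0 0 2 * * *") // every day at 02:00
    public void runSettlementJob() throws Exception {
        // A fresh parameter value makes each run a new JobInstance
        // in the metadata tables (re-running identical parameters is rejected)
        JobParameters params = new JobParametersBuilder()
                .addLong("runAt", System.currentTimeMillis())
                .toJobParameters();
        jobLauncher.run(settlementJob, params);
    }
}
```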

3. Simple Step with Tasklet

For single-result logic (e.g., deleting temp files, sending a notification) rather than chunk processing, implement the Tasklet interface.

@Slf4j // Lombok generates the 'log' field used below
@Component
public class CleanupTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        log.info("Running temporary file cleanup...");
        // File deletion logic goes here
        return RepeatStatus.FINISHED; // Done (returning CONTINUABLE would run the tasklet again)
    }
}
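To use the tasklet, it still has to be wrapped in a Step. A minimal sketch with the Spring Batch 5 StepBuilder (the configuration class and step name are illustrative):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class CleanupStepConfig {

    // Wraps CleanupTasklet in a Step so it can be added to a Job,
    // e.g. as a cleanup stage after a chunk-oriented step
    @Bean
    public Step cleanupStep(JobRepository jobRepository,
                            PlatformTransactionManager transactionManager,
                            CleanupTasklet cleanupTasklet) {
        return new StepBuilder("cleanupStep", jobRepository)
                .tasklet(cleanupTasklet, transactionManager)
                .build();
    }
}
```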