Spring Batch is an open source framework for batch processing, that is, the execution of a series of jobs. It provides classes and APIs for reading and writing resources, transaction management, job processing statistics, job restart, and partitioning techniques for processing high volumes of data.
In Spring Batch, a job consists of many steps, and each step consists of either a READ-PROCESS-WRITE task or a single-operation task (tasklet).
- For a “READ-PROCESS-WRITE” step, the step “reads” data from a resource (CSV, XML or database), “processes” it and “writes” it to another resource (CSV, XML or database). For example, a step may read data from a CSV file, process it and write it into the database. Spring Batch provides many ready-made classes to read and write CSV, XML and database data.
- For a “single” operation task (tasklet), the step does one thing only, such as cleaning up resources before a step starts or after it completes.
- Steps can be chained together to run as a job.
1 Job = Many Steps. 1 Step = 1 READ-PROCESS-WRITE or 1 Tasklet. Job = {Step 1 -> Step 2 -> Step 3} (Chained together)
Spring Batch Examples
Consider the following batch job, made up of five steps:
- Step 1 – Read CSV files from folder A, process, write it to folder B. “READ-PROCESS-WRITE”
- Step 2 – Read CSV files from folder B, process, write it to the database. “READ-PROCESS-WRITE”
- Step 3 – Delete the CSV files from folder B. “Tasklet”
- Step 4 – Read data from a database, process it and generate a statistics report in XML format, write it to folder C. “READ-PROCESS-WRITE”
- Step 5 – Read the report and send it to the manager by email. “Tasklet”
In Spring Batch, we can declare the job as follows:
<job id="abcJob" xmlns="http://www.springframework.org/schema/batch">
  <step id="step1" next="step2">
    <tasklet>
      <chunk reader="cvsItemReader" writer="cvsItemWriter"
             processor="itemProcesser" commit-interval="1" />
    </tasklet>
  </step>
  <step id="step2" next="step3">
    <tasklet>
      <chunk reader="cvsItemReader" writer="databaseItemWriter"
             processor="itemProcesser" commit-interval="1" />
    </tasklet>
  </step>
  <step id="step3" next="step4">
    <tasklet ref="fileDeletingTasklet" />
  </step>
  <step id="step4" next="step5">
    <tasklet>
      <chunk reader="databaseItemReader" writer="xmlItemWriter"
             processor="itemProcesser" commit-interval="1" />
    </tasklet>
  </step>
  <step id="step5">
    <tasklet ref="sendingEmailTasklet" />
  </step>
</job>
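The job above only refers to its readers, writers and tasklets by bean id; each of them still has to be defined as a Spring bean. As a rough sketch, the "cvsItemReader" bean could be built on Spring Batch's FlatFileItemReader. The file path and the column names below are assumptions made up for illustration, and the fragment is meant to sit inside a regular Spring beans configuration file.

```xml
<!-- Hypothetical definition for the "cvsItemReader" bean used by the job above -->
<bean id="cvsItemReader"
      class="org.springframework.batch.item.file.FlatFileItemReader">
  <!-- Assumed input file; point this at a real CSV file in folder A -->
  <property name="resource" value="file:folderA/input.csv" />
  <property name="lineMapper">
    <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
      <!-- Splits each line on commas and names the resulting fields -->
      <property name="lineTokenizer">
        <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
          <!-- Assumed column names -->
          <property name="names" value="id,name,amount" />
        </bean>
      </property>
      <!-- Hands the parsed FieldSet to the processor unchanged -->
      <property name="fieldSetMapper">
        <bean class="org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper" />
      </property>
    </bean>
  </property>
</bean>
```

With this mapper, the processor receives a FieldSet for each line. Mapping each line to a domain object instead would mean swapping in a different FieldSetMapper.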
The execution details of every job and step are stored in a database, so a failed job can be restarted from the step where it failed, with no need to run the entire job again.
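The database that stores this execution data is exposed to the job through a job repository. A minimal sketch of a database-backed repository is shown here, assuming that beans named "dataSource" and "transactionManager" are defined elsewhere and that the Spring Batch metadata tables already exist in that database.

```xml
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/batch
         http://www.springframework.org/schema/batch/spring-batch.xsd">

  <!-- "dataSource" and "transactionManager" are assumed bean ids, defined elsewhere -->
  <batch:job-repository id="jobRepository"
                        data-source="dataSource"
                        transaction-manager="transactionManager" />

</beans>
```

A job declared with the batch namespace looks up a bean named "jobRepository" by default, so a job such as "abcJob" would use this repository without further wiring.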