Introduction to Spring Batch
Today we increasingly rely on automated systems made up of several components designed to run scheduled tasks. They simplify and secure repetitive operational tasks. These processes are often complex and handle large volumes of information without human interaction (reading a file, processing data, exporting data to a database, etc.).
Imagine our application receives, every hour, a file containing a set of user records. For each user in the file, it must generate a password and insert the result into a database.
My first thought was to write a shell script or develop a program from scratch to handle the whole process (fetch, loop, read, process, transform, insert, …). But that approach carries a lot of risk: it can take longer to build, I would have to handle errors and manage rollbacks myself, and it may not be reliable given the possibility of an OutOfMemoryError, … I was not ready to reinvent the wheel (Don’t Repeat Yourself, they said). So I started looking for solutions on the Internet and came across a framework that can repeat a treatment over large volumes of data without human intervention. It was Spring Batch, and for me it was also an opportunity to try Spring and its ecosystem. Let’s go!
What is Spring Batch?
Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems. Spring Batch is not a scheduling framework. It provides reusable functions that are essential in processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management.
Key Concepts
See the Batch Domain for more details.
- Job : A batch job composed of a sequence of one or more Steps.
- JobInstance : An instance of a Job configured with parameters (requested for a specific date, for example). It groups a collection of JobExecutions.
- Step : An independent, discrete unit of work in a Job. A simple Step might load data from a file into the database, but may be more complex.
- JobExecution : A single attempt of a Job run (may be in progress, succeeded, or failed)
- StepExecution : A single attempt of a Step, associated with a JobExecution (may be in progress, succeeded or failed)
- JobRepository : Persistent storage for maintaining the Batch Domain model and associated state
- JobExecutionListener : Extension point for customizing JobExecution events.
- StepExecutionListener : Extension point for customizing StepExecution events.
- Remote Chunking, Partitioning : Patterns for scalable distributed batch processing, see the docs for details.
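To make the difference between a JobInstance and a JobExecution more concrete, here is a minimal plain-Java sketch. It is not the Spring Batch API: the class `BatchDomainSketch` and its methods are hypothetical names I made up for illustration. It only shows the idea that a job name plus its parameters identifies one instance, and that every attempt to run that instance is recorded as a separate execution.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the batch domain; the names are illustrative, not the Spring Batch API. */
final class BatchDomainSketch {

    // One entry per "JobInstance" (job name + parameters); the value lists its execution statuses.
    private final Map<String, List<String>> executionsByInstance = new LinkedHashMap<>();

    /** Records one attempt (a "JobExecution") for the instance identified by name + parameters. */
    void recordExecution(String jobName, Map<String, String> parameters, String status) {
        executionsByInstance
                .computeIfAbsent(key(jobName, parameters), k -> new ArrayList<>())
                .add(status);
    }

    /** Number of distinct instances seen so far. */
    int instanceCount() {
        return executionsByInstance.size();
    }

    /** Statuses of all attempts for one instance, in the order they were recorded. */
    List<String> executionsOf(String jobName, Map<String, String> parameters) {
        return executionsByInstance.getOrDefault(key(jobName, parameters), List.of());
    }

    private static String key(String jobName, Map<String, String> parameters) {
        // TreeMap sorts the parameters, so their order does not change the identity.
        return jobName + new TreeMap<>(parameters);
    }
}
```

With this model, running the same job twice with the same date (a failure, then a successful second attempt) gives one instance with two executions, while running it with another date creates a second instance.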
Among its advantages:
- Less code to write
- More unit and integration tests
- It handles increasing load
- And many others (see HERE)
Spring Batch use cases
A batch program usually performs some number of recurring actions:
- Read a large number of records in a file or database,
- Sometimes do specific processing on the data (modification, call web service, etc …),
- Write / Insert processed data into a database or file.
Another example of use: paying salaries in a company
1- At the end of the month, the company must send salaries to the respective accounts of its employees.
2- Process the pay slips.
3- Send emails to staff for mass communication.
4- Generate automated reports on a daily, weekly or monthly basis.
5- Run the whole flow automatically, without human intervention.
Technical Objectives
As developers, we also see considerable benefits in our own work from using Spring Batch.
- Batch developers use the Spring programming model: concentrate on business logic and let the framework take care of infrastructure.
- Clear separation of concerns between the infrastructure, the batch execution environment, and the batch application.
- Easy to configure, customize, and extend services, by leveraging the Spring Framework in all layers.
- All existing core services should be easy to replace or extend, without any impact to the infrastructure layer.
- Provide a simple deployment model, with the architecture JARs completely separate from the application.
Spring Batch architecture
This layered architecture highlights three major high-level components: Application, Core, and Infrastructure.
Application
The application contains all batch jobs and custom code written by developers using Spring Batch.
Core
The Batch Core contains the core runtime classes necessary to launch and control a batch job. It includes implementations for JobLauncher, Job, and Step.
Infrastructure
The infrastructure contains common readers, writers, and services (such as the RetryTemplate), which are used both by application developers (readers and writers, such as ItemReader and ItemWriter) and by the core framework itself (retry, which is its own library).
Example
Here is a quick overview of how we plan to process the files we receive.
1- First, the data is read from a file using an ItemReader, which is part of Spring Batch.
2- The data is then passed to the processor (ItemProcessor), which transforms it according to business needs. In our case, we have to format the date to match the database.
3- The processed data is then passed to the ItemWriter, which writes it to the database.
NB : the data source can be a database, file, queue, and so on.
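The three steps above can be sketched in plain Java. This is only an illustration of the read, process, write flow done in chunks, not the Spring Batch API: `ChunkSketch`, `run`, and `formatDate` are hypothetical names. The `name,dd/MM/yyyy` input layout and the chunk size are also assumptions, since the real file format is not shown here.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Plain-Java sketch of read -> process -> write in chunks (not the Spring Batch API). */
final class ChunkSketch {

    // Assumed input date format; the real file may use another one.
    private static final DateTimeFormatter INPUT = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    /** "Processor": turns "name,dd/MM/yyyy" into "name,yyyy-MM-dd". */
    static String formatDate(String line) {
        String[] fields = line.split(",");
        LocalDate date = LocalDate.parse(fields[1].trim(), INPUT);
        return fields[0].trim() + "," + date; // LocalDate.toString() is ISO yyyy-MM-dd
    }

    /**
     * Reads items one by one, processes each of them, and hands them to the
     * writer chunkSize items at a time. Returns the number of chunks written.
     */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // one write call per full chunk
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk)); // last, possibly smaller chunk
            chunks++;
        }
        return chunks;
    }
}
```

Writing in chunks rather than item by item means only one chunk is held in memory at a time, which helps when the input file is large.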
Error management in Spring Batch
Spring Batch also lets us handle errors during processing:
- Skip non-blocking errors (skip). Sometimes we do not need to stop the whole job when a few records fail.
- Retry a failed operation (retry). The database may be temporarily unavailable, so sometimes we need to try the operation again.
- Restart a batch (restart). After a failure, we can restart the job.
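To illustrate the skip and retry ideas, here is a minimal plain-Java sketch. It is not how Spring Batch implements them, and `FaultToleranceSketch`, `retry`, and `processWithSkip` are hypothetical names. It only shows the logic behind a maximum number of attempts and a skip limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Plain-Java sketch of retry and skip (not the Spring Batch API). */
final class FaultToleranceSketch {

    /** Retry: call the action again until it succeeds or maxAttempts is reached. */
    static <T> T retry(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // e.g. the database is temporarily unavailable
            }
        }
        throw last; // every attempt failed: rethrow the last error
    }

    /** Skip: drop items whose processing fails, but give up after more than skipLimit failures. */
    static <I, O> List<O> processWithSkip(List<I> items, Function<I, O> processor, int skipLimit) {
        List<O> processed = new ArrayList<>();
        int skipped = 0;
        for (I item : items) {
            try {
                processed.add(processor.apply(item));
            } catch (RuntimeException e) {
                skipped++;
                if (skipped > skipLimit) {
                    throw new IllegalStateException("Skip limit exceeded: " + skipLimit, e);
                }
            }
        }
        return processed;
    }
}
```

The skip limit matters: tolerating a few bad records is fine, but if most of the file is invalid it is usually better to fail the job than to silently drop everything.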
Demo
As an example, I set up a small application that reads data from a file and applies some processing before inserting it into a database. It uses the Spring Batch FlatFileItemReader to read a CSV file and the JdbcBatchItemWriter to write to a MySQL database. After processing, we can take other actions depending on the result of the job: if the job succeeds, we can send a notification email; if not, we can try to launch the job again or do something else.
INPUT
OUTPUT
See the repo for more details : https://github.com/tonux/file-processor
Baay Fall Takk Djok.