Intro
Although SOA and real-time integration have evolved, many interfaces are still based on flat files and are therefore best processed in batch mode. Nevertheless, there is no de facto or industry-standard approach to Java-based batch architectures. Batch processing remains a critical architectural style and capability that is missing from the marketplace.
Spring Batch was started to address this gap: an open source project offered in the hope that it becomes a batch processing standard within the Java community. One thing that sets Spring Batch apart from other projects in the Spring Portfolio is that it is the result of a partnership between SpringSource and Accenture.
Spring Batch is a framework that manages batch and offline processing concerns so that an application developer can focus on business logic. What it brings to the batch processing space is the ability to write lightweight application code that can be tested in isolation, together with a powerful framework to execute, manage and monitor the results of offline processing.
Reasons for development
The main reasons why Spring Batch was developed:
- the missing enterprise capability in the market -> the lack of a standard architecture often leads to custom architectures, which in turn lead to significant development and maintenance costs in the future.
- existing solutions are too close to the application server (i.e. they rely on an EJB/J2EE approach).
Features & Concerns
The framework provides extensive bulk processing, time-based events, periodic application of complex business rules and support for reusable functions such as: logging/tracing, transaction management, job processing statistics, job restart or skip, resource management etc.
Regarding the framework concerns, the following can be mentioned:
- concurrent batch processing (that is, parallel processing of jobs)
- manual/scheduled restart after failure (this relates to transient failure, patterns of robustness, restarting a job etc.)
- partial processing (skipping certain records, mostly when considering rollback)
- sequential processing of dependent steps (batch jobs dependent on each other, sequences especially in set-up wizards/processes)
- staged, enterprise message-driven processing (e.g. optimization by combining it with Spring Integration to support asynchronous processing, or when the transaction needs to be widened)
- whole batch transaction for simple data models and small batch size
- massive batch processing
- scalability scenario: business logic can be deployed in different ways declaratively (because this is not a developer concern, but a framework concern) -> application code changes are not required to use different deployment approaches
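The "partial processing" concern above (skipping certain records instead of failing the whole job) can be sketched in plain Java. This is only an illustration of the idea, not the Spring Batch API: the class SkipSketch, the Result record, the process method and the skipLimit parameter are all hypothetical names chosen for this example, and Java 16 or later is assumed for the record syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: NOT the Spring Batch API, just the skip idea in plain Java.
public class SkipSketch {

    /** Outcome of one run: the processed items and the number of skipped records. */
    public record Result<O>(List<O> processed, int skipped) {}

    /**
     * Applies the processor to every record. A record whose processing throws
     * is skipped; once more than skipLimit records have failed, the run aborts.
     */
    public static <I, O> Result<O> process(List<I> records,
                                           Function<I, O> processor,
                                           int skipLimit) {
        List<O> processed = new ArrayList<>();
        int skipped = 0;
        for (I record : records) {
            try {
                processed.add(processor.apply(record));
            } catch (RuntimeException e) {
                skipped++;
                if (skipped > skipLimit) {
                    throw new IllegalStateException(
                            "Skip limit of " + skipLimit + " exceeded", e);
                }
            }
        }
        return new Result<>(processed, skipped);
    }

    public static void main(String[] args) {
        // "oops" cannot be parsed, so it is skipped rather than failing the run.
        List<String> lines = List.of("10", "oops", "30");
        Result<Integer> result = process(lines, Integer::parseInt, 1);
        System.out.println(result.processed() + " skipped=" + result.skipped());
    }
}
```

The skip limit keeps a tolerant job from silently swallowing a systematically broken input: a few bad records are accepted, but too many abort the run.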
Framework Architecture
The Spring Batch architecture is layered to provide a great deal of freedom to application architectures, as well as to provide batch execution environments.

Spring Batch provides an Infrastructure layer in the form of low level tools. There is also a simple execution environment, using the infrastructure in its implementation. The execution environment provides robust features for traceability and management of the batch lifecycle.
The Batch Application Style is organized into four logical tiers, which include Run, Job, Application, and Data. The primary goal for organizing an application according to the tiers is to embed what is known as "separation of concerns" within the system. These tiers can be conceptual but may prove effective in mapping the deployment of the artifacts onto physical components like Java runtimes and integration with data sources and targets. Effective separation of concerns results in reducing the impact of change to the system. 
As a consequence, the conceptual tiers are:
- Run tier - responsible for scheduling and launching the application; used in conjunction with a scheduling product to allow time-based and interdependent scheduling of batch jobs, as well as to provide parallel processing capabilities
- Job tier - responsible for the overall execution of a batch job
- Application tier - contains specific tasks that address required batch functionality and enforces policies around execution (e.g., commit intervals, capture of statistics, etc.)
- Data tier - provides integration with the physical data sources that might include databases, files, or queues
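The commit interval mentioned for the Application tier can be illustrated with a small plain-Java sketch. It is not Spring Batch code: the class CommitIntervalSketch and its run method are hypothetical, an Iterator stands in for the data source, and each call to the writer stands in for one transaction commit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: NOT the Spring Batch API, just the commit-interval idea.
public class CommitIntervalSketch {

    /**
     * Reads items one by one and hands them to the writer in chunks of
     * commitInterval items. Each writer call stands in for one commit.
     * Returns the number of commits performed.
     */
    public static <T> int run(Iterator<T> reader,
                              Consumer<List<T>> writer,
                              int commitInterval) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be at least 1");
        }
        int commits = 0;
        List<T> chunk = new ArrayList<>(commitInterval);
        while (reader.hasNext()) {
            chunk.add(reader.next());
            if (chunk.size() == commitInterval) {
                writer.accept(List.copyOf(chunk)); // "commit" a full chunk
                chunk.clear();
                commits++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(List.copyOf(chunk)); // "commit" the final partial chunk
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5);
        int commits = run(items.iterator(),
                chunk -> System.out.println("commit " + chunk), 2);
        System.out.println("commits=" + commits);
    }
}
```

A larger interval means fewer commits but more work lost on a failure; that trade-off is a policy the tier can tune without touching the business logic.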
The conceptual relationships between the batch stereotypes can be depicted as follows:

To better understand these concepts, consider a nightly synchronization of the data received by a certain system. A batch job is run at midnight: the 'MidnightSync' job. There is only one such Job, but each individual run of it must be tracked separately.
Job = 'MidnightSync' job
JobInstance = the 'MidnightSync' job for 2008/11/06
JobExecution = the first attempt at 'MidnightSync' job for 2008/11/06
JobParameters = e.g. the date 2008/11/06
Step = e.g. 'loadXMLData'
StepExecution = a single attempt to execute the 'loadXMLData' Step
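The relationship between Job, JobInstance and JobExecution in the 'MidnightSync' example can be modelled with a few plain Java types. This is a sketch under stated assumptions rather than the framework's own classes: the records below only borrow the concept names, and ExecutionLog with its track method is a hypothetical in-memory stand-in for whatever the framework uses to persist executions. Java 16 or later is assumed for the record syntax.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these types only mirror the concepts, they are NOT
// the Spring Batch classes of the same name.
public class JobModelSketch {

    public enum Status { COMPLETED, FAILED }

    /** The job definition: there is only one 'MidnightSync' job. */
    public record Job(String name) {}

    /** One logical run: the job plus the parameter that identifies it. */
    public record JobInstance(Job job, LocalDate scheduleDate) {}

    /** One physical attempt to run a JobInstance. */
    public record JobExecution(JobInstance instance, int attempt, Status status) {}

    /** In-memory stand-in for a store that tracks every attempt. */
    public static class ExecutionLog {
        private final List<JobExecution> executions = new ArrayList<>();

        /** Records a new attempt, numbering attempts per JobInstance. */
        public JobExecution track(JobInstance instance, Status status) {
            int attempt = (int) executions.stream()
                    .filter(e -> e.instance().equals(instance))
                    .count() + 1;
            JobExecution execution = new JobExecution(instance, attempt, status);
            executions.add(execution);
            return execution;
        }
    }

    public static void main(String[] args) {
        Job midnightSync = new Job("MidnightSync");
        JobInstance nov6 = new JobInstance(midnightSync, LocalDate.of(2008, 11, 6));

        ExecutionLog log = new ExecutionLog();
        // The first attempt fails; the restart is a second execution of the SAME instance.
        System.out.println(log.track(nov6, Status.FAILED));
        System.out.println(log.track(nov6, Status.COMPLETED));
    }
}
```

Keeping the instance (what should run) separate from the execution (one attempt at running it) is what makes a restart after failure traceable: the second attempt belongs to the same 2008/11/06 instance rather than creating a new one.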
Summary
Spring Batch is a new implementation of some very old ideas. It brings one of the oldest programming models in IT organizations into the mainstream through the use of Spring, a popular open source framework. It enables a batch project to enjoy the same clean architecture and lightweight programming model as any Spring project, supported by industry-proven patterns, operations, templates, callbacks and other idioms. Spring Batch is an exciting initiative that offers the potential of standardizing Batch Architectures within the growing Spring Community and beyond.
References
For further details, you may consult the resources at the following location:
Spring Batch Documentation