Monday, July 16, 2012

[DevWorks] Modernized Java-based batch processing in WebSphere Application Server, Part 2: Transaction batch programming model




Introduction
Part 1 of this article series described the Modern Batch feature of IBM WebSphere Application Server through the development of a sample batch application using the compute intensive programming model. Part 2 looks at the transaction batch programming model, which provides a powerful job failover model, based on checkpoint and restart semantics.
Figure 1 depicts the various components of a batch application.

Figure 1. Components of a Batch Application
You might notice that this diagram is an extension of Figure 1 from Part 1. This enhanced model introduces these new concepts:
  • Batch data stream (BDS): BDS provides an abstraction for the data stream processed by a batch step. The WebSphere Application Server Modern Batch feature provides a BDS framework which includes pre-built code that manages the opening, closing, externalizing, and internalizing of a checkpoint. Available BDS framework patterns are shown in Table 1.

    Table 1. Batch Data Stream Framework Patterns

    BDS framework pattern   Description
    JDBC                    Retrieves/writes data from a database using a JDBC connection.
    Byte                    Reads/writes byte data from a file.
    Text file               Reads/writes to a text file.
    JPA                     Retrieves/writes data to a database using a Java Persistence API (JPA) connection.

  • Checkpoint algorithm: The batch container calls the checkpoint algorithm periodically to determine if it is time to take a checkpoint. The Modern Batch feature provides two pre-built checkpoint algorithms, one that supports a time-based checkpoint interval, and another that supports a checkpoint interval based on record count. Custom checkpoint algorithms can also be plugged in by writing the implementation.
  • Results algorithm: Each batch job step supplies a return code when it is done. The results algorithm can see the return codes from all steps in a batch job and returns a final, overall return code for the job as a whole. Modern Batch provides a pre-built results algorithm that returns the numerically highest step return code as the overall job return code. Custom results algorithms can also be plugged in by writing the implementation.
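To make the checkpoint and results policies described above more concrete, here is a minimal Java sketch of a record-based and a time-based checkpoint decision, together with a results algorithm that returns the numerically highest step return code. All names in it (BatchPolicySketch, CheckpointPolicy, recordBased, timeBased, overallReturnCode) are hypothetical and are not the Modern Batch interfaces. Returning 0 for a job with no steps is also an assumption made for the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not the WebSphere Modern Batch API.
class BatchPolicySketch {

    /** Asked periodically by a container: is it time to take a checkpoint? */
    interface CheckpointPolicy {
        boolean shouldCheckpoint(long recordsSinceCheckpoint, long millisSinceCheckpoint);
    }

    /** Checkpoint once a fixed number of records has been processed. */
    static CheckpointPolicy recordBased(long recordCount) {
        return (records, millis) -> records >= recordCount;
    }

    /** Checkpoint once a fixed amount of time has elapsed. */
    static CheckpointPolicy timeBased(long intervalMillis) {
        return (records, millis) -> millis >= intervalMillis;
    }

    /** Overall job return code: the numerically highest step return code. */
    static int overallReturnCode(List<Integer> stepReturnCodes) {
        // Assumption: a job with no steps reports 0.
        return stepReturnCodes.stream().mapToInt(Integer::intValue).max().orElse(0);
    }

    public static void main(String[] args) {
        CheckpointPolicy everyTenRecords = recordBased(10);
        System.out.println("checkpoint after 9 records?  " + everyTenRecords.shouldCheckpoint(9, 0));
        System.out.println("checkpoint after 10 records? " + everyTenRecords.shouldCheckpoint(10, 0));
        System.out.println("job return code: " + overallReturnCode(Arrays.asList(0, 8, 4)));
    }
}
```

A custom policy in this sketch would simply be another CheckpointPolicy implementation, which mirrors the idea of plugging in your own algorithm.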
Transaction batch model
The controller bean, com.ibm.ws.batch.BatchJobControllerBean, controls the lifecycle of the batch application and is responsible for reading the xJCL file to find and execute these implementation classes:
  • Job Step implementation class: contains the business logic for each step.
  • Batch Data Stream implementation class: holds the data exchange logic.
  • Checkpoint implementation class: determines how often to commit global transactions under which batch steps are invoked.
  • Results implementation class: is used to manipulate the return codes of batch jobs.
With this high level understanding of the programming model, let’s develop a sample batch application that follows the transaction batch model using IBM Rational Application Developer V8.0.

Sample business scenario

Related products and versions

The example presented here was developed and tested using Rational Application Developer V8.0 and deployed on WebSphere Application Server V8.0. Tooling support for developing batch applications is also available with IBM Rational Software Architect for WebSphere V8 and later. Runtime support for Modern Batch is available in WebSphere Application Server V7.0.0.11 and later with the Modern Batch feature pack, and is available in WebSphere Application Server V8 as an integrated component.
For this example, consider the case where you need a batch program that scans a file containing records. For each record, you need to perform some processing and finally insert the record into a database. The batch program should also support checkpoint and restart: if processing is interrupted, it should resume from the last saved state rather than from the beginning.
For example, say there are 100 records to be processed and a checkpoint is taken after every 10 records. If processing fails after 23 records, the batch program should have committed 20 records to the database, and processing should resume from the 21st record.
The transaction batch programming model lets you take a checkpoint after every specified number of records without any additional programming. If processing is interrupted after the 23rd record, the checkpoint and restart capability ensures that processing resumes from the 21st record. Without this capability, you would need to either manually restart the job from the 21st record or start over from the beginning.
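The 100-record scenario can be simulated in a few lines of plain Java. This is a sketch of checkpoint and restart semantics only, not of how the batch container implements them: the class CheckpointRestartSketch, its fields, and the failAfter parameter are hypothetical, and an in-memory list stands in for the database.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation only: hypothetical names, in-memory "database".
class CheckpointRestartSketch {
    final List<Integer> database = new ArrayList<>(); // committed records
    int checkpointPosition = 0;   // last record covered by a checkpoint
    int firstRecordOfLastRun = 0; // where the most recent run started

    /**
     * Processes records starting after the last checkpoint.
     * failAfter is the record number after which a failure is simulated
     * (0 means no failure). Returns true if the job ran to completion.
     */
    boolean run(int totalRecords, int checkpointInterval, int failAfter) {
        List<Integer> pending = new ArrayList<>(); // uncommitted work
        firstRecordOfLastRun = checkpointPosition + 1;
        for (int record = firstRecordOfLastRun; record <= totalRecords; record++) {
            pending.add(record); // "process" the record inside the open transaction
            if (pending.size() == checkpointInterval || record == totalRecords) {
                database.addAll(pending);    // commit the transaction
                checkpointPosition = record; // save the restart position
                pending.clear();
            }
            if (record == failAfter) {
                return false; // failure: uncommitted records are rolled back
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CheckpointRestartSketch job = new CheckpointRestartSketch();
        boolean finished = job.run(100, 10, 23); // interrupted after record 23
        System.out.println("finished=" + finished + ", committed=" + job.database.size());
        finished = job.run(100, 10, 0);          // restart, no failure this time
        System.out.println("resumed at " + job.firstRecordOfLastRun
                + ", finished=" + finished + ", committed=" + job.database.size());
    }
}
```

In this simulation the first run leaves 20 records committed, and the restarted run begins at record 21 and finishes with all 100 records committed, with records 21 to 23 processed again.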
In the next sections, you will develop a transaction batch job for this sample scenario using Rational Application Developer V8.0. Later, we’ll discuss the different interfaces that can be used to submit the WebSphere Application Server batch jobs.


..................... skip ......................
