WAS reference: [DevWorks] Modernized Java-based batch processing in WebSphere Application Server, Part 2: Transaction batch programming model

Introduction
Part 1 of this article series described the Modern Batch feature of IBM WebSphere Application Server through the development of a sample batch application using the compute intensive programming model. Part 2 looks at the transaction batch programming model, which provides a powerful job failover model, based on checkpoint and restart semantics.
Figure 1 depicts the various components of a batch application.

Figure 1. Components of a Batch Application

You might notice that this diagram is an extension of Figure 1 from Part 1. This enhanced model introduces these new concepts:

Batch data stream (BDS): BDS provides an abstraction for the data stream processed by a batch step. The WebSphere Application Server Modern Batch feature provides a BDS framework which includes pre-built code that manages the opening, closing, externalizing, and internalizing of a checkpoint. Available BDS framework patterns are shown in Table 1.

Table 1. Batch Data Stream Framework Patterns

BDS framework patterns	Description
JDBC	Retrieves/writes data from a database using a JDBC connection.
Byte	Reads/writes byte data from a file.
Text file	Reads/writes to a text file.
JPA	Retrieves/writes data to a database using a Java Persistence API (JPA) connection.

Checkpoint algorithm: The batch container calls the checkpoint algorithm periodically to determine if it is time to take a checkpoint. The Modern Batch feature provides two pre-built checkpoint algorithms, one that supports a time-based checkpoint interval, and another that supports a checkpoint interval based on record count. Custom checkpoint algorithms can also be plugged in by writing the implementation.
Result algorithm: Each batch job step supplies a return code when it is done. The results algorithm has visibility to the return codes from all steps in a batch job and returns a final, overall return code for the job as a whole. Modern Batch provides a pre-built results algorithm that returns the numerically highest step return code as the overall job return code. Custom result algorithms can also be plugged in by writing the implementation.
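
The decisions made by the two pre-built checkpoint algorithms and by the pre-built results algorithm can be sketched in a few lines of plain Java. This is only an illustration of the logic described above, not the Modern Batch API: the names CheckpointPolicy, AlgorithmSketch, everyNRecords, everyNMillis, and highestReturnCode are hypothetical, and returning 0 for a job with no steps is an assumption made for this sketch.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical contract: the container asks periodically whether to checkpoint now. */
interface CheckpointPolicy {
    boolean shouldCheckpoint(int recordsSinceLastCheckpoint, long millisSinceLastCheckpoint);
}

final class AlgorithmSketch {
    /** Record-based policy: checkpoint once `interval` records have been processed. */
    static CheckpointPolicy everyNRecords(int interval) {
        return (records, millis) -> records >= interval;
    }

    /** Time-based policy: checkpoint once `intervalMillis` milliseconds have elapsed. */
    static CheckpointPolicy everyNMillis(long intervalMillis) {
        return (records, millis) -> millis >= intervalMillis;
    }

    /** Results policy: the overall job return code is the numerically highest step code. */
    static int highestReturnCode(List<Integer> stepReturnCodes) {
        // Assumption for this sketch: a job with no steps yields 0.
        return stepReturnCodes.stream().mapToInt(Integer::intValue).max().orElse(0);
    }

    public static void main(String[] args) {
        CheckpointPolicy byCount = everyNRecords(10);
        System.out.println(byCount.shouldCheckpoint(9, 0L));   // false: only 9 records so far
        System.out.println(byCount.shouldCheckpoint(10, 0L));  // true: interval reached
        System.out.println(highestReturnCode(Arrays.asList(0, 8, 4))); // 8
    }
}
```

A custom algorithm plugged into the batch container would make the same kind of decision, but through the interfaces that Modern Batch defines rather than the hypothetical ones shown here.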

Transaction batch model
The controller bean, com.ibm.ws.batch.BatchJobControllerBean, controls the lifecycle of the batch application and is responsible for reading the xJCL file to find and execute these implementation classes:

Job Step implementation class: contains the business logic for each step.
Batch Data Stream implementation class: holds the data exchange logic.
Checkpoint implementation class: determines how often to commit global transactions under which batch steps are invoked.
Results implementation class: is used to manipulate the return codes of batch jobs.

With this high level understanding of the programming model, let’s develop a sample batch application that follows the transaction batch model using IBM Rational Application Developer V8.0.

Sample business scenario

Related products and versions

The example presented here was developed and tested using Rational Application Developer V8.0 and deployed on WebSphere Application Server V8.0. Tooling support for developing batch programming is also available with IBM Rational Software Architect for WebSphere V8 and later. Run time support for Modern Batch is available in WebSphere Application Server V7.0.0.11 and later with the Modern Batch feature pack, and is available in WebSphere Application Server V8 as an integrated component.

For this example, consider the case where you need a batch program that scans a file containing records. For each record, you need to perform some processing and finally insert the record into a database. The batch program should also allow for checkpoint and restart capability, which implies that in case the processing was somehow interrupted, further processing should resume from the last saved state and not from the beginning.
For example, say there are 100 records to be processed and a checkpoint is taken after every 10 records. If the processing fails after 23 records, the batch program should have committed 20 records to the database, and processing should continue from the 21st record.
The transaction batch programming model provides this functionality, taking a checkpoint after every specified number of records without any additional programming. In the case of an interruption after the 23rd record, the checkpoint and restart capability ensures that processing restarts from the 21st record. Without the checkpoint and restart capability, you would need to either manually restart the job from the 21st record or start afresh from the beginning.
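
The arithmetic of this scenario can be checked with a small simulation. The sketch below is plain Java and does not use the batch container; the class CheckpointRestartSketch, its nested Store class, and the run method are hypothetical names introduced for illustration. The in-memory list stands in for the database, and discarding the pending list models the rollback of the uncommitted transaction.

```java
import java.util.ArrayList;
import java.util.List;

final class CheckpointRestartSketch {
    /** Stand-in for the database plus the saved checkpoint position. */
    static final class Store {
        final List<Integer> committed = new ArrayList<>();
        int checkpoint = 0; // number of the last record covered by a checkpoint
    }

    /**
     * Processes records (checkpoint + 1)..total, committing after every `interval` records.
     * Fails once `failAfter` records have been processed; work since the last
     * checkpoint is then lost, as it would be in a rolled-back transaction.
     */
    static void run(Store store, int total, int interval, int failAfter) {
        List<Integer> pending = new ArrayList<>();
        for (int record = store.checkpoint + 1; record <= total; record++) {
            if (record > failAfter) {
                // `pending` is simply dropped here: this models the rollback.
                throw new IllegalStateException("failed after record " + failAfter);
            }
            pending.add(record);
            if (pending.size() == interval || record == total) {
                store.committed.addAll(pending); // commit the transaction
                store.checkpoint = record;       // save the restart position
                pending.clear();
            }
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        try {
            run(store, 100, 10, 23); // first run fails after 23 records
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + ", committed=" + store.committed.size());
        }
        run(store, 100, 10, Integer.MAX_VALUE); // restart resumes at record 21
        System.out.println("after restart, committed=" + store.committed.size());
    }
}
```

After the first run, 20 records are committed and records 21 through 23 are lost with the rolled-back transaction. The restart begins at record 21 and brings the total to 100 without reprocessing the first 20.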
In the next sections, you will develop a transaction batch job for this sample scenario using Rational Application Developer V8.0. Later, we’ll discuss the different interfaces that can be used to submit the WebSphere Application Server batch jobs.

Sample development
As you can see from Figure 1, developing a transactional batch job requires developing:

Configuration xJCL file
Job step implementation class
Batch data stream implementation class
Checkpoint algorithm implementation class
Result algorithm implementation class

Modern Batch provides built-in patterns for the various components that can be utilized for rapid development. In this example, you will be using these patterns:

Generic pattern as the job step pattern for implementing the job. This pattern can be used where you have exactly one input and one output stream.
Record based pattern as the checkpoint algorithm for specifying the number of iterations of the process job step method before committing the transaction.
Job sum pattern as the result algorithm for result verification. Job sum returns the highest return code of all job steps.
Text file reader pattern as the input stream, because the input data is read from a file, and JDBC writer pattern as the output stream, for committing the records to the database. Both are part of the BDS framework.
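
The shape of the Generic pattern, with exactly one input stream and one output stream, can be sketched as a read, process, write loop. This is a plain Java illustration under stated assumptions, not the class that the tooling generates: GenericStepSketch and runStep are hypothetical names, an Iterator stands in for the text file reader, and a Consumer stands in for the JDBC writer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

final class GenericStepSketch {
    /**
     * Reads every record from the single input, applies the business logic,
     * and hands the result to the single output. Returns the record count.
     */
    static <I, O> int runStep(Iterator<I> input, Function<I, O> processor, Consumer<O> output) {
        int processed = 0;
        while (input.hasNext()) {
            output.accept(processor.apply(input.next()));
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        // Stand-ins: a list of lines for the input file, a list of rows for the table.
        Iterator<String> lines = Arrays.asList("alice,10", "bob,20").iterator();
        List<String> rows = new ArrayList<>();
        Function<String, String> toUpper = String::toUpperCase; // placeholder business logic

        int count = runStep(lines, toUpper, rows::add);
        System.out.println(count + " records written: " + rows);
    }
}
```

In the real application the container also consults the checkpoint algorithm between records, which this sketch leaves out to keep the data flow visible.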

See Resources to learn more about the various available built-in patterns.
To develop this application:

Create the project. Start by creating the required projects in Rational Application Developer. Navigate to File > New > Batch Project and create a new batch project named TransactionBatch. Click Finish. This will generate the three required projects.
Create Job Control file. Create a new xJCL file named TransferRecordsToDB by right-clicking on the xJCL folder under the TransactionBatch project and selecting New > Batch Job, as shown in Figure 2. Click Next.

Figure 2. Creating a new batch job
Create batch job step. Here, you create the batch job step with its implementation class. Name the job InsertRecords and select Generic as the Pattern. Click Create in the Required Properties Section of the dialog (Figure 3).

Figure 3. Create job step
Create implementation class. Name the class InsertRecordsToDB and name the package com.ibm.dw.batch.transaction (Figure 4). Click Finish to create the generic pattern implementation class.

Figure 4. Create pattern implementation class
Add checkpoint algorithm. In the Algorithm section of the Batch Step Creation dialog (Figure 3), click the Add button next to Checkpoint Algorithm. The Checkpoint Algorithm dialog will display (Figure 5). Name the class RecordCountCheck. Select Record Based as the Pattern, and enter the other field values shown in Figure 5. Click Finish to create the checkpoint algorithm implementation based on the Record Based pattern. This ensures that a checkpoint is taken every 10 records.

Figure 5. Create checkpoint algorithm implementation class
Add result algorithm. In the Algorithm section of the Batch Step Creation dialog (Figure 3), click the Add button next to Result Algorithm. The Result Algorithm dialog will display (Figure 6). Name the class JobSumResult and select Job Sum for the Pattern. Click Finish to create the result algorithm implementation based on the Job Sum pattern.

Figure 6. Creating Result Algorithm Implementation Class

When all the required implementation classes have been created, the Batch Step Creation dialog should look as it does in Figure 7. Click on the Next button.

Figure 7. Batch Step Creation completion
Specify input stream. On the Step Stream dialog (Figure 8), name the input stream TextFileInputStream. Select Text File Reader as the Pattern and provide the location of the input file (in this example, C:\\InputFile.txt) for FILENAME under the Required Properties section. Click Create.

Figure 8. Create text file reader input stream
Create implementation class for Input stream. The Create class dialog for creating the implementation class for the Text File Reader Pattern for the input stream displays. Name the class TextFileReader and name the package com.ibm.dw.batch.transaction (Figure 9). Click Finish and then Next.

Figure 9. Create text file reader implementation class
Specify output stream. Name the Output Stream JDBCOutputStream and select the JDBC Writer for the Pattern. Click Create (Figure 10).

Figure 10. Create JDBC writer output stream
Create implementation for output stream. The Create Class dialog displays for creating the implementation class for the output stream JDBC Writer. Name the class JDBCWriter and name the Package com.ibm.dw.batch.transaction (Figure 11). Click Finish twice to complete all the required steps of the transaction batch creation process.

Figure 11. Create JDBC writer implementation class

After completing the batch job creation steps, the xJCL Editor for the TransferRecordsToDB should look like that shown in Figure 12.

Figure 12. Project structure
Replace the implementation classes with the ones included with this article for download, completing the exercise of implementing the sample transaction batch application.

You now have a transaction batch program created using Rational tooling. The sections that follow address other configuration elements required and various interfaces available for running this sample.

Submitting batch jobs
The WebSphere Application Server Modern Batch feature provides several interfaces for submitting jobs:

Job Management Console, discussed in Part 1.

EJB API. The EJB interface can be used from within a Java EE container in an enterprise setting, which is beyond the scope of this article, or from a standalone client, which is the approach addressed here. To submit the job via the standalone EJB interface, see Listing 1.

Listing 1

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.InitialContext;
import com.ibm.websphere.longrun.JobScheduler;
import com.ibm.websphere.longrun.JobSchedulerHome;

// Obtain the naming context
Hashtable<String, String> env = new Hashtable<String, String>();

env.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.ibm.websphere.naming.WsnInitialContextFactory");

env.put(Context.PROVIDER_URL, "corbaloc:iiop:" + <HOST_NAME> + ":" + <BOOTSTRAP_PORT>
        + "/NameServiceCellRoot");

InitialContext ctxt = new InitialContext(env);

// Look up the job scheduler EJB home
JobSchedulerHome jobSchedulerHome = (JobSchedulerHome) ctxt.lookup("nodes/"
        + <NODE_NAME> + "/servers/" + <SERVER_NAME>
        + "/ejb/com/ibm/websphere/longrun/JobSchedulerHome");

// Create the job scheduler
JobScheduler jobScheduler = jobSchedulerHome.create();

// Submit the job, passing the content of the xJCL file
jobScheduler.submitJob( <xJCL_FILE_CONTENT> );

You can also refer to the sample EJB batch client project included with this article for download.

Web service interface, available at http://<HOST_NAME>:<DEFAULT_HOST_PORT>/LongRunningJobSchedulerWebSvcRouter/services/JobScheduler. A client can be generated from the WSDL using Rational tooling, and jobs can be submitted using the submitJob() method.
Command line utility lrcmd is available in the <WAS_INSTALLATION_DIR>\bin directory.

Running the sample
To run the sample from Rational Application Developer:

Configure. Before running the sample application, place the InputFile.txt file in a folder (for example, <PROFILE_HOME>) and update the input location accordingly in the xJCL file. You need tables into which the records are inserted after processing. Use TableSetup.sql to create the required tables in a Derby database; you might need to make changes if you are using another database. Also, create a data source with the JNDI name jdbc/ds_jndi in the WebSphere Application Server administrative console, pointing to the database where the records will be inserted. When you have completed the configuration, deploy the TransactionBatchEAR to the server and submit the job using whichever of the interfaces described above you prefer.
Verify the checkpoint and restart capability. You'll notice that the submitted job doesn't complete successfully on the first run, having thrown an exception as shown in Figure 13. If you investigate further, you'll find that InputFile.txt has an empty line at line 23, which causes the failure and moves the job from the Executing state to the Restartable state. You'll see that the job successfully committed the first 20 records (that is, two iterations of 10 records each, as per your job configuration) to the database and rolled back the insertion of the 21st and 22nd records, since the exception occurs while processing the empty line in the third iteration (Figure 13).

Figure 13. Running the sample application

If you restart the job after removing the empty line from the InputFile.txt file, it completes processing the remaining records, beginning with the 21st record, as shown in Figure 14.

Figure 14. Restarting the job
Integrate with schedulers. For illustrative purposes, these articles have explained the batch programming models along with their client interfaces, but a batch application would generally be triggered at some predetermined time by an enterprise scheduler such as IBM Tivoli® Workload Scheduler. The EJB or the web services interface can be used to integrate Tivoli Workload Scheduler with the batch application. If you use another scheduler that runs like a cron job, you can use the lrcmd utility.

Conclusion
The Modern Batch feature of WebSphere Application Server provides a robust batch programming model that enables you to develop batch programs with minimum effort. Because Modern Batch is a part of WebSphere Application Server, reliability is built into the solution.
This article explained the transaction batch programming model and completed a sample application using the same, finishing up our discussion of the different batch programming models. Subsequent articles will look at more advanced features of Modern Batch and show how it can be used in an enterprise setting.

Acknowledgements
The authors thank Sajan Sankaran and Edward McCarthy for reviewing this article and providing invaluable input.

Download

Description	Name	Size	Download method
Code sample	1205_narain_attachment.zip	5 KB	HTTP


Posted Thursday, July 18, 2013
