How I reduced my Java batch application’s code by 80% using Easy Batch!

In this post, I will show how Easy Batch can tremendously simplify your batch applications by considerably reducing plumbing code. This will make your application code more readable, understandable and maintainable!

The use case is a typical production application that loads data from an input CSV flat file to a database.

Consider the following CSV input file containing product data:

Could not embed GitHub Gist 8547399: Not Found

Suppose we have a JPA EntityManager that will be used to persist Product objects to the database. We would like to map each record of this file to an instance of the following Product POJO:

Could not embed GitHub Gist 8548282: Not Found

Before persisting products to the database, we should validate data to ensure that:

  • product id and name are specified
  • product price is not negative
  • product last update date is in the past
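To make these three rules concrete, here is a minimal hand-written sketch of what checking them could look like. The Product class, its field names and types, and the ProductValidator helper are assumptions made for illustration, since the original Product listing is not shown here. Treating a missing date as a violation is also a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical Product POJO: only the four fields named in the rules are assumed.
class Product {
    final Long id;
    final String name;
    final double price;
    final Date lastUpdate;

    Product(Long id, String name, double price, Date lastUpdate) {
        this.id = id;
        this.name = name;
        this.price = price;
        this.lastUpdate = lastUpdate;
    }
}

// Hypothetical helper that checks the three rules by hand.
class ProductValidator {

    /** Returns the violated rules; an empty list means the product is valid. */
    static List<String> validate(Product p, Date now) {
        List<String> errors = new ArrayList<>();
        if (p.id == null) {
            errors.add("id is missing");
        }
        if (p.name == null || p.name.trim().isEmpty()) {
            errors.add("name is missing");
        }
        if (p.price < 0) {
            errors.add("price is negative");
        }
        // A missing date is reported as a violation in this sketch.
        if (p.lastUpdate == null || !p.lastUpdate.before(now)) {
            errors.add("last update date is not in the past");
        }
        return errors;
    }
}
```

Passing the current time as a parameter keeps the check deterministic, which makes it easier to test.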

Finally, we should skip records starting with #, mainly the header record (there could be other records starting with #, such as a trailer record that marks the end of data in the file).
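As a rough illustration of this filtering rule, the sketch below keeps only the lines that do not start with #. The class and method names are hypothetical, and the lines are assumed to have already been read into memory.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper illustrating the "skip records starting with #" rule.
class RecordFilterSketch {

    /** Returns the input lines without those starting with '#', in the same order. */
    static List<String> dataRecords(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.startsWith("#")) // drops header, trailer and other marker records
                .collect(Collectors.toList());
    }
}
```

In a real program the input list could come from something like Files.readAllLines(Paths.get("products.csv")).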

To keep the example simple, we will write products data to the standard output and not to a database.

So let’s get started!

The following listing is a possible solution that I have seen a hundred times in production systems:

Could not embed GitHub Gist 8547412: Not Found

This solution actually works perfectly and implements the requirements above. But it is evident that it is a maintenance nightmare! It could be worse if the Product POJO contained dozens of fields, which is often the case in production.

In this 95-line solution, there is only one line that represents the batch business logic. Did you guess it?

It is line 78: System.out.println("product = " + product); // or, in production, persisting the object to the database

ALL the rest is plumbing: reading, filtering, parsing and validating data, mapping records to Product instances and reporting some statistics at the end of execution.

This is where Easy Batch comes into play to handle ALL of this plumbing for you! With Easy Batch, you concentrate only on your batch business logic. So let’s see what the solution looks like with Easy Batch.

First, we will create a RecordProcessor that will implement our batch business logic:

Could not embed GitHub Gist 8547436: Not Found

Then, we will DECLARE (not implement like in the above solution!) data validation constraints on our Product POJO with the elegant Bean Validation API as follows:

Could not embed GitHub Gist 8544434: Not Found

Finally, we should just configure Easy Batch to:

  • Read data from the flat file products.csv
  • Filter records starting with #
  • Map each CSV record to an instance of the Product POJO
  • Validate products data
  • and process each record using our ProductProcessor implementation

This can be done with the following snippet:

Could not embed GitHub Gist 8547426: Not Found
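The configuration snippet itself is not available here, and Easy Batch's builder API is not reproduced below. As a library-free illustration of the idea behind the steps listed above, this sketch wires a filter, a mapper, a validator and a processor into one small pipeline. Every class and method name in it is hypothetical and is not part of Easy Batch. Reading the file is left to the caller.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical mini-pipeline: the stages are plugged in, the loop is written once.
class BatchPipelineSketch<T> {
    private final Predicate<String> skipFilter; // true for records to skip
    private final Function<String, T> mapper;   // raw line -> domain object
    private final Predicate<T> validator;       // true for valid objects
    private final Consumer<T> processor;        // the batch business logic

    BatchPipelineSketch(Predicate<String> skipFilter, Function<String, T> mapper,
                        Predicate<T> validator, Consumer<T> processor) {
        this.skipFilter = skipFilter;
        this.mapper = mapper;
        this.validator = validator;
        this.processor = processor;
    }

    /** Runs all stages over the given lines and returns how many records were processed. */
    int run(List<String> lines) {
        int processed = 0;
        for (String line : lines) {
            if (skipFilter.test(line)) {
                continue; // filtered record
            }
            T record = mapper.apply(line);
            if (!validator.test(record)) {
                continue; // rejected record
            }
            processor.accept(record);
            processed++;
        }
        return processed;
    }
}

class BatchPipelineDemo {
    public static void main(String[] args) {
        // Assumed two-column sample data (name,price), for illustration only.
        List<String> lines = Arrays.asList("#name,price", "pen,2.5", "ink,-1", "cap,0");
        BatchPipelineSketch<String[]> pipeline = new BatchPipelineSketch<String[]>(
                line -> line.startsWith("#"),
                line -> line.split(","),
                fields -> fields.length == 2 && Double.parseDouble(fields[1]) >= 0,
                fields -> System.out.println("product = " + fields[0]));
        System.out.println("processed records: " + pipeline.run(lines));
    }
}
```

The point of the design is that only the last lambda is business logic. The loop and the other stages are generic plumbing, which is the part a framework can take over.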

That’s all! Apart from implementing the core batch business logic, all we have done is provide configuration metadata that Easy Batch cannot guess! The framework will handle all the plumbing of reading, filtering, parsing, validating and mapping data to the domain object Product.

Now let’s count the lines of code. Both solutions use the Product POJO, and their respective Main classes each have 6 lines of imports, so we will count neither these imports nor the lines of the Product POJO.

  • The first solution, WithoutEasyBatchLauncher, has 82 lines of code (empty lines ignored). Note that I inlined all variables in this solution and tried to make it as compact as possible, for those who might think I deliberately maximized its number of lines 🙂
  • For the second solution, we have:
    1. 5 lines for the ProductProcessor class (empty lines ignored)
    2. 4 lines for the Bean Validation API annotations we added to the Product POJO
    3. and 9 lines for the Main class WithEasyBatchLauncher

In sum, 82 LOC vs 18 LOC (5 + 4 + 9), which is 78% less than the first solution!

I am pretty sure some readers will say that it is not 80% like in the title of the post 😉

Actually, I should have put 90% or even 95%, because this is not all that Easy Batch has done for us. In addition to everything we have seen above, Easy Batch:

  • provides a more detailed report with the processing time for each record, the average record processing time and the percentage of filtered, ignored, rejected and processed records
  • allows you to monitor the batch execution with JMX at runtime

So, in the end, this is what Easy Batch is all about: making your life easier when you have to deal with batch applications in Java, by letting you concentrate on your batch business logic and handling ALL the rest for you.

Even though the sample I used in this post is about processing a flat file, Easy Batch can also handle the plumbing of processing data from a database or an XML file, which can be more complex than the plumbing code for processing data from a flat file.

You can check out Easy Batch tutorials here!

Published at Codingpedia.org with permission of Mahmoud Ben Hassine – source “How I reduced my Java batch application’s code by 80% using Easy Batch!” from http://www.mahmoudbenhassine.com/

Mahmoud Ben Hassine

Passionate Software Engineer who loves to learn and teach. As an open source advocate, I love to create and contribute to open source projects and to share my experience with other developers around the globe. If you are a chess junkie like me, it will be a pleasure to challenge you on a chess board!