
Understanding JVM COBOL Performance - It is Fast - Very Fast!


Both Native And JVM COBOL Are Amazingly Fast For Batch Processing. Here is a comparison.


JVM code can be just as fast as the equivalent native code. This might come as a shock to some people. Yes - JVM code (like Java or COBOL) can run at the same speed as native code (like C or COBOL).

Scenario:
This is generally true. However, in this case I am looking at batch processing, a type of programming which is very common in commercial settings. In this programming model many discrete data processing programs are run in succession, traditionally controlled by a script of some sort. On the mainframe that script is usually JCL (Job Control Language); on distributed computers it is either a shell or cmd (.bat) script.

The nature of traditional batch runs on distributed (non-mainframe) hardware means that lots of short run processes are utilised. This is exactly what JVM programs are bad at. However, with good batch system architecture this can be overcome with amazing results.

To demonstrate just how fast JVM programs can be requires overcoming a number of challenges:

The Challenges:
The first challenge in proving something like this has been the lack of directly comparable languages. Compiling Java to native works with gcj; however, even though the result is often slower than running the same Java byte code on the JVM, this proves little because gcj is not a highly trusted, highly optimized, commercial grade compiler. No offence intended; it has never been developed that way.

As a senior principal developer in the JVM COBOL team at Micro Focus, I am in an almost unique position to compare our battle hardened native COBOL compiler with our JVM COBOL compiler, which will soon be generally available. The compiler front ends are the same! The only difference is the code generators. The native compiler has an extremely effective optimising native code (machine code) generator and the JVM compiler produces JVM Byte Code in class file format directly.

Note: the Micro Focus JVM COBOL compiler generates Byte Code directly, it does not go through an intermediate step.


The second challenge comes from the way JVM code runs. I explained this in detail in Tuning The JVM For Unusual Uses - Have Some Tricks Under Your Hat, which covers the interpretation/profiling, compilation and native phases of JVM execution.

JVMs are designed to function very fast for long running processes. The longer they run the faster they get.

Because of this we need a way to run COBOL programs over and over again, or many separate COBOL programs, using the same JVM process.

The third challenge is how do we recreate all the advantages of running batch files, but in a way which permits very efficient use of JVM based COBOL?

The Approach:

Javascript comes to the rescue. Javascript is an amazingly simple yet powerful programming language. There is a pure JVM implementation of it called Rhino. Rhino is open source and is developed by the Firefox people - Mozilla.

You can do amazing things with Rhino and JVM COBOL. I will be writing an entire post on this in the near future. However, the key thing is that we can call JVM COBOL programs directly from scripts. For example, to run program cobol_2 (see source at bottom of this post) all that is required in javascript is Packages.cobol.cobol_2.main(null); yes it is that simple! This line looks for the program in a JVM namespace (Java people call these things packages). To put it there all I needed to do was compile it using the Micro Focus compiler like this:

cobol cobol_2.cbl jvmgen noanim ilnamespace(cobol);


I am able to compile exactly the same code to native by doing this:

cobol cobol_2.cbl opt;
cbllink cobol_2


By so doing I have created identical JVM and native COBOL programs. To execute the native version from javascript requires (runtime.exec("cobol_2")).waitFor(); which launches the native executable as a separate process and waits for it to finish. For more details, please look at the javascript source code below.

The Power Of Javascript
The power of the javascript approach comes from the way the script does not need re-compiling when it is changed. Wrapping compiled programs up in an interpreted scripting language is a very rapid way of developing. I first started doing things this way when I wrapped up FORTRAN quantum mechanical code in a scripting language called TCL.

Javascript is ubiquitous (you are probably running some right now in this web page), object oriented, fast, easy to use and powerful. It makes a great choice for the batch control logic running large operations in a high performance language like COBOL.

This project was very much quicker and easier to do in javascript than it would have been in any compiled language. You can think of this as 'lego' programming. The big strong COBOL blocks can be arranged in any order by the javascript to make many different and useful systems without even touching the COBOL.


The key is that javascript can run a JVM COBOL program in the same JVM as the script. This is because JVM COBOL programs are fully JVM compliant just like Java classes.
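
The 'lego' idea can be sketched as a list of steps which the script runs in order inside a single process. This is only an illustration: runBatch, the step names and the stand-in functions are hypothetical and are not part of the benchmark in this post. Under Rhino, each run function would be expected to wrap a call such as Packages.cobol.cobol_2.main(null).

```javascript
// Hypothetical batch runner: executes named steps in order and times each one.
// Under Rhino a step's run function would wrap a JVM COBOL call such as
// Packages.cobol.cobol_2.main(null); plain functions stand in for them here.
function runBatch(steps)
{
    var log = [];
    for(var i = 0; i < steps.length; ++i)
    {
        var start = (new Date()).getTime();
        steps[i].run();
        log.push({ name: steps[i].name, millis: (new Date()).getTime() - start });
    }
    return log;
}

// Stand-in steps (assumptions for illustration, not real COBOL programs).
var order = [];
var steps = [
    { name: "maths", run: function() { order.push("maths"); } },
    { name: "files", run: function() { order.push("files"); } }
];

// Re-ordering the array re-orders the batch; no COBOL is recompiled.
var log = runBatch(steps);
```

Because the steps are just entries in an array, changing the order or adding a step is a script edit only, which is the rapid development style described above.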

Making The Test Realistic

Please take a look at cobol_2 and cobol_3 below. The first is a pure mathematical processing program. The second creates a file of 10,000 indexed records and then reads them in by index. This latter program really highlights why COBOL batch processing is still so popular and essential. To create and read by index 10,000 records even in a powerful relational database takes a bit of time. In COBOL it takes a couple of seconds on a laptop!

One difference in the approach between the execution of JVM COBOL and native is the process launch overhead; this being the time taken for the operating system to launch the native COBOL processes. To measure this the benchmarking javascript was run with a COBOL program which consisted of just a single goback statement.

I performed the tests using the in development code for Visual COBOL 1.4. The release code in the general availability version later this year should be very similar in performance. I ran the code in 32 bit mode on a Dell E6400 laptop with Windows 7 Enterprise 64bit installed.

The Results: 

To ensure fairness, I ran the test three times. Each test performed 32 runs of both programs in JVM COBOL and 32 runs of both programs in native COBOL. The time for each execution of both programs was measured in milliseconds using the javascript Date object. The maximum, mean, minimum and total execution times were recorded and reported.

Run 1: 
In this run the JVM approach is slightly faster overall than the native.

Results:
=========
JVM
    Maximum Time: 1300
    Minimum Time: 381
    Mean    Time: 624.3125
    Total   Time: 19978
Native
    Maximum Time: 1404
    Minimum Time: 525
    Mean    Time: 661.1875
    Total   Time: 21158


Run 2: 
Again, in this run the JVM approach is slightly faster overall than the native.

Results:
=========
JVM
    Maximum Time: 1283
    Minimum Time: 385
    Mean    Time: 670.53125
    Total   Time: 21457
Native
    Maximum Time: 1559
    Minimum Time: 526
    Mean    Time: 705.03125
    Total   Time: 22561


Run 3: 
Here we see the native approach just pipping the JVM one. Really, there is nothing to choose between them.

Results:
=========
JVM
    Maximum Time: 1320
    Minimum Time: 390
    Mean    Time: 734.25
    Total   Time: 23496
Native
    Maximum Time: 1782
    Minimum Time: 528
    Mean    Time: 698.75
    Total   Time: 22360
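
To put a number on "nothing to choose between them", the totals from the three runs can be added up and compared. The following is a small worked calculation using only the totals reported above; the variable and function names are my own.

```javascript
// Total times in milliseconds, copied from Runs 1 to 3 above.
var jvmTotals    = [19978, 21457, 23496];
var nativeTotals = [21158, 22561, 22360];

function sum(values)
{
    var total = 0;
    for(var i = 0; i < values.length; ++i)
    {
        total += values[i];
    }
    return total;
}

var jvmSum    = sum(jvmTotals);      // 64931 ms
var nativeSum = sum(nativeTotals);   // 66079 ms

// A ratio above 1 means native took longer than JVM over the three runs.
var ratio = nativeSum / jvmSum;      // roughly 1.018
```

Over the three runs combined the difference is under 2%, which is well inside the run to run variation seen above.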


CPU Bound:
I did one run where the file handling program was not called. This means that the only code running was the mathematical code. The results here were shockingly in favour of the JVM approach. If I had made a choice between native and JVM based on the first iteration, I would have thought native was faster. Based on the group of 32 iterations JVM proves to be twice as fast as native. However, this is also misleading if we include process launch overhead.

Results:
=========
JVM
    Maximum Time: 354
    Minimum Time: 59
    Mean    Time: 101.09375
    Total   Time: 3235
Native
    Maximum Time: 272
    Minimum Time: 169
    Mean    Time: 203.53125
    Total   Time: 6513

First iteration:
JVM   = 354
Native= 272
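
The effect of the slow first iteration can be seen by taking it out of the totals and averaging the remaining 31 iterations. This is a worked calculation on the figures reported above; the helper name meanAfterFirst is my own.

```javascript
// Figures in milliseconds, copied from the CPU bound results above.
var its = 32;
var jvm = { total: 3235, first: 354 };
var nat = { total: 6513, first: 272 };

// Mean time per iteration once the first iteration is excluded.
function meanAfterFirst(result, iterations)
{
    return (result.total - result.first) / (iterations - 1);
}

var jvmWarm = meanAfterFirst(jvm, its);   // about 92.9 ms
var natWarm = meanAfterFirst(nat, its);   // about 201.3 ms
```

The native mean barely moves when the first iteration is dropped, whereas the JVM mean falls from about 101 to about 93 milliseconds, which is consistent with the JVM getting faster the longer it runs.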


Process Launch Overhead:

Results:
=========
JVM
    Maximum Time: 313
    Minimum Time: 0
    Mean    Time: 9.84375
    Total   Time: 315
Native
    Maximum Time: 72
    Minimum Time: 54
    Mean    Time: 57.84375
    Total   Time: 1851


The process launch overhead is around 1.5 seconds over a 32 process launch cycle. This is insufficient to qualitatively change any of the results above. For example, if we take the 1.5 seconds off the 6.5 second native time for the CPU bound test, JVM COBOL is still about 1.8 seconds (55%) faster than native.
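
The arithmetic behind that adjustment can be written out as follows. It uses only the totals reported above, and it rounds the overhead to 1500 milliseconds as the text does; the variable names are my own.

```javascript
// Totals in milliseconds, copied from the results above.
var launchNative = 1851;   // 32 native launches of the goback-only program
var launchJvm    = 315;    // 32 in-process JVM calls of the same program
var cpuNative    = 6513;   // CPU bound test, native
var cpuJvm       = 3235;   // CPU bound test, JVM

// Extra cost of launching 32 native processes compared with in-process calls.
var measuredOverhead = launchNative - launchJvm;     // 1536 ms

// The text rounds this to 1.5 seconds before subtracting it.
var overhead       = 1500;
var adjustedNative = cpuNative - overhead;           // 5013 ms
var savedMillis    = adjustedNative - cpuJvm;        // 1778 ms
var percentFaster  = 100 * savedMillis / cpuJvm;     // about 55
```

Using the unrounded 1536 milliseconds instead changes the result only slightly, to about 54%.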

The Conclusions:

1) Running a small program in JVM and native COBOL in no way acts as a benchmark for real world performance.

2) Running JVM COBOL from javascript is a jaw droppingly fast and easy way to implement batch processing.

3) JVM Managed COBOL has comparable performance in typical batch applications to native Micro Focus COBOL on 32 bit Intel architecture. (other platforms not tested).

4) Process launch overhead is significant. This approach is better than traditional batch processing because it overcomes the process launch overhead. However, even when factoring out process launch overhead, JVM COBOL performance is still no slower than native across the ranges of tests performed here.

The Appendix:

Javascript

function bench()
{
    this.min     = 10000000;
    this.max     = 0;
    this.count   = 0;
    this.current = 0;
    this.total   = 0;
    this.update  = function()
    {
        if(this.current > this.max)
        {
            this.max = this.current;
        }
        
        if(this.current < this.min)
        {
            this.min = this.current;
        }
        
        ++this.count;
        this.total += this.current;
    }

    this.mean    = function()
    {
        return this.total / this.count;
    }    
    
    this.display  = function()
    {
        display("    Maximum Time: " + this.max);
        display("    Minimum Time: " + this.min);
        display("    Mean    Time: " + this.mean());
        display("    Total   Time: " + this.total);
    }    
}

var jvm = new bench();
var nat = new bench();
var its = 32;
var runtime=java.lang.Runtime.getRuntime();

display("JVM    COBOL Benchmark");
for(var i=0;i<its;++i)
{
    var start = (new Date()).getTime();
    Packages.cobol.cobol_2.main(null);
    Packages.cobol.cobol_4.main(null);
    jvm.current = ((new Date()).getTime())-start;
    display("" + jvm.current);
    jvm.update();
}

display("Native COBOL Benchmark");
for(var i=0;i<its;++i)
{
    var start = (new Date()).getTime();
    (runtime.exec("cobol_2")).waitFor();
    (runtime.exec("cobol_3")).waitFor();
    nat.current = ((new Date()).getTime())-start;
    display("" + nat.current);
    nat.update();
}

display("");
display("Results: ");
display("=========");

display("JVM");
jvm.display();
display("Native");
nat.display();


function display(what)
{
    java.lang.System.out.println(what);
}


cobol_2.cbl 

123456$set sourceformat(variable)
 
       01 my-group.
           03 counter pic s9(9) comp-5.
           03 a       pic s9(9) comp-5.
           03 b       pic s9(9) comp-5.
           03 r       pic s9(9) comp-5.
 
       move 123456789 to a b r
       perform varying counter from 1 by 1 until counter = 1000000
            compute r = (a + b) / (a - b)
            compute r = (r + b) / (a - b)
            compute r = (r + b) / (a - b)
            compute r = (r + b) / (a - b)
            compute r = (r + b) / (a - b)
       end-perform
     
       .

 cobol_3.cbl 

123456$set sourceformat(variable)
        input-output section.
        file-control.
            select source-file
            assign to disk "count.idx"
            organization indexed
            access dynamic
            record key is r-key
            status is source-status.
             
        data division.
        file section.
            fd  source-file.
            01  source-record.
            03  raw-line  pic x(256).
            03  source-line redefines raw-line.
                05 filler      pic x(7).
                05 r-key       pic x(10).
                05 source-body pic x(249).
                
        working-storage section.
            01 counter binary-long.
            01 check   binary-long.
            01 source-status pic 99.
        
        procedure division.
            open output source-file
            perform varying counter from 1 by 1 until counter = 10000
                move counter to source-body r-key
                write source-record
            end-perform
            close source-file
            open input source-file
            perform varying counter from 1 by 1 until counter = 10000
                move counter to r-key
                read source-file
                move source-body(1:9) to check
                if check not = counter
                    display "woops"
                end-if
            end-perform
            close source-file



Launching Javascript
To launch javascript easily I created a batch file with this line in it: 

 "\Program Files (x86)\Java\jdk1.6.0_21\bin\java" -server org.mozilla.javascript.tools.shell.Main %1 %2 %3 %4 %5

A Final Note:

I have tried very hard to compare like with like in this post. In the mathematical COBOL I have compared the use of comp-5 group items. For native and JVM COBOL this means that for each calculation the program should load and store the calculated value from a block of memory allocated to working storage. If the COBOL is changed like this (remove the my-group label)...

       01.
           03 counter pic s9(9) comp-5.
           03 a       pic s9(9) comp-5.
           03 b       pic s9(9) comp-5.
           03 r       pic s9(9) comp-5.


... the compiler is able to treat the working storage items in a much more efficient way in JVM COBOL. The result for the calculation done this way is:

Results:
=========
JVM
    Maximum Time: 342
    Minimum Time: 0
    Mean    Time: 11
    Total   Time: 352
Native
    Maximum Time: 223
    Minimum Time: 183
    Mean    Time: 195.09375
    Total   Time: 6243


Yes, here JVM COBOL is running nearly 18 times faster than native (6243 against 352 milliseconds in total). However, this is a somewhat artificial situation and so I am not including it in the results used for this post. Again, factoring in the process launch overhead only reduces this difference to 13.5 times; JVM COBOL remains very much faster.

 
