RM/COBOL File [ISAM] Performance issue

I am trying to compare two ISAM [RM/COBOL] files, as follows:

a. Two versions of the same RM/COBOL [ISAM] file [say file1 and file2].

b. I am comparing them using a COBOL program.

c. Each file has around 41M records, and the record length is about 500 bytes [FB].

d. The compare program takes about 1 hour and 40 minutes.

e. I recovered each file and then tried the compare again. Still about the same result.

f. I copied file1 to file1new and file2 to file2new [RM/COBOL file - ISAM]. 

g. I then compared file1new and file2new. The compare program took 7 minutes.

Is it possible to rebuild file1 and file2 so that the 7-minute performance can be obtained? [Since I cannot copy the files in Prod.]

How can I identify this file problem [so that we can check our entire file repository for such issues]?

Please excuse me if this topic has already been discussed. [I tried but was not able to find it.]

    I copied file1 to file1new and file2 to file2new [RM/COBOL file - ISAM]. 

    Did you do this with an RM/COBOL program?  If yes, then the rest of this response applies.  If not, please ignore.

    The performance improvement is the result of writing the records into the new files in the same order as you are reading them in the compare program (presumably by the prime key).  There are two beneficial effects:

    • In the new files, the prime key's B-tree will be built optimally, filling the disk blocks with B-tree nodes.  In the original file, which probably was not created in prime key order, the tree will be built less compactly.  This means that more I/O is required to traverse the prime key's tree in the original file.
    • The data records will be clustered into blocks in the order that your compare program reads them.  Therefore the file manager cache will be highly efficient.  Again, less I/O to the OS (which probably also has a cache) or to the actual device.
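
    To make the second effect concrete, here is a rough Python sketch (a toy model, not the RM/COBOL file manager).  It counts how many block reads an LRU cache needs when records are read in prime key order, first with the data clustered in key order and then with the same records scattered across blocks.  The record count, records per block, cache size, and the helper name count_block_reads are all assumptions chosen for illustration.

```python
import random
from collections import OrderedDict

def count_block_reads(block_of_record, cache_blocks):
    """Count cache misses (block reads) for an LRU cache of the given size.

    block_of_record lists, in prime key read order, the data block that
    holds each record.
    """
    cache = OrderedDict()
    misses = 0
    for block in block_of_record:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1                    # block must be read from the OS/device
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used block
    return misses

# Illustrative sizes only (assumptions, far smaller than a 41M-record file).
RECORDS = 20_000
RECORDS_PER_BLOCK = 8
CACHE_BLOCKS = 64

# Clustered: the i-th record in key order lives in block i // RECORDS_PER_BLOCK.
clustered = [i // RECORDS_PER_BLOCK for i in range(RECORDS)]

# Scattered: same blocks, but record placement is unrelated to key order.
rng = random.Random(0)
scattered = clustered[:]
rng.shuffle(scattered)

clustered_reads = count_block_reads(clustered, CACHE_BLOCKS)
scattered_reads = count_block_reads(scattered, CACHE_BLOCKS)
print("clustered block reads:", clustered_reads)
print("scattered block reads:", scattered_reads)
```

    In the clustered layout each block is read exactly once, while in the scattered layout most record reads miss the cache.  Real files sit somewhere between these two extremes, so treat the ratio as a direction, not a prediction.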

    Is it possible to rebuild file1 and file2 so that the 7-minute performance can be obtained?

    It is possible to obtain some of the efficiencies described above.  The recovery utility (also called recover1) will sort all the B-trees as it reads through the file's data blocks.  You can use the M option to increase the memory size used by the sort algorithm, to further optimize the B-tree(s).  This achieves the efficiency described in the first point above (and for all keys, not just the prime key).

    If you specify the T option, recovery will move data blocks from near the physical end-of-file to available blocks nearer the beginning of the file.  The end-of-file is then moved to return unused blocks at the end to the operating system.

    recovery does not reorder data blocks to obtain the cache efficiency of the second point.  If reading through a large file in prime key order is a significant requirement, then you must use a COBOL program to achieve the data block caching efficiency.

    My expectation is that recovery will improve your compare program performance, but not to the extent that you see after a bespoke COBOL program does the copy.

    EDIT:  I forgot to mention one thing that probably does not matter to most folks.  The COBOL standard requires, for reading on alternate keys that have duplicates, that multiple records containing the same key value be returned in temporal order, oldest first.  The bespoke COBOL program will not preserve the original temporal order, unless the prime key happens to order records in temporal order (such as date-time or a sequence number being the leftmost characters in the prime key).  recovery does preserve the order for alternate keys with duplicates.
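
    A small Python sketch of the ordering point above (again a toy model, not RM/COBOL code).  The record values and the helper alt_key_read_order are hypothetical.  The only behavior assumed is the one just described: records sharing an alternate key value come back in the order they were written.

```python
# Each record: (prime_key, alternate_key, original_write_sequence).
# Values are made up for illustration.
records = [
    ("K30", "SMITH", 1),
    ("K10", "SMITH", 2),
    ("K20", "JONES", 3),
    ("K05", "SMITH", 4),
]

def alt_key_read_order(written_in_order, alt_value):
    """Return prime keys of records with the given alternate key value,
    in the order the records were written (oldest first)."""
    return [rec[0] for rec in written_in_order if rec[1] == alt_value]

# Original file: records were written in the sequence listed above.
original = alt_key_read_order(records, "SMITH")

# Copy made by reading in prime key order and writing to a new file:
# the write order of the new file is the prime key order.
copied = alt_key_read_order(sorted(records, key=lambda rec: rec[0]), "SMITH")

print(original)
print(copied)
```

    The duplicates for SMITH come back in a different order from the copied file, which matters only if an application depends on the oldest-first order of duplicates.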
