ZMF SYSMDUMPs – the why’s, how’s, do’s and do not’s...

over 5 years ago

As much as we may hate to admit it, problems do sometimes arise in the ZMF Server started task that may lead to an abend (and I hasten to stress that these are not always Serena's fault!).  Also, back in the dim and distant past that coincided with the release of ZMF 7.1, the recommended started task dump file contained in the SERVER sample CNTL member changed from the tried and trusted:

//SYSUDUMP  DD SYSOUT=*                             *Abend list        

to the new (and slightly mysterious for some):

//SYSMDUMP  DD DISP=(MOD,CATLG,CATLG),             * SYSMDUMP         
//             DSN=somnode.SERCOMC.SYSMDUMP(+1),
//             UNIT=SYSDA,SPACE=(CYL,(200,100),RLSE),                 
//             DCB=(DSORG=PS,RECFM=FBS,LRECL=4160,BLKSIZE=4160)       

Many of our customers are comfortable with SYSMDUMPs.  However, if you are unfamiliar with this format, have never made the change to your started tasks’ JCL, are wondering why this happened, how these should be handled or are interested in some of the common mistakes we see customers making with them, then please read on.

1. So, firstly, why the change?

It’s very simple really: SYSMDUMPs contain significantly more useful information in diagnosing problems than their SYSUDUMP alternative.  

The ZMF Server started task is a complex piece of software that manages multiple subtasks within a single address space.  As additional functionality is added to both z/OS and ZMF over time, SYSMDUMPs are the best option in trying to ensure that all information that could be relevant to any given problem within the started task is captured at the first time of asking.  

In the past, if a problem was particularly complex we would sometimes ask customers to set a SLIP trap to capture an SVC dump if/when a problem reoccurs.  Even with a SYSMDUMP specified, this may still be appropriate in certain situations.  However, if we start with a SYSMDUMP much of the information that will be captured by the SVC SLIP will already be available, maximising the chances of diagnosing any problems with the fewest number of iterations. 

2. Must I make this change?  Can I stick with my SYSUDUMP files? 

The simple answers are “no, you do not have to make this change” and “yes, you can stick with your SYSUDUMP files”.  However, it is very much in everyone’s best interests if you do switch over to SYSMDUMPs in the started task.  

Serena will always research and attempt to diagnose any ZMF problem from a SYSUDUMP if that is all that is available.  However, and as previously stated, both the operating system and ZMF itself are continually evolving and becoming more complex in the process.  SYSUDUMPs often do not capture sufficient diagnostic information to allow us to resolve a problem upon first occurrence, thus lengthening the time required to resolve it.  So, to minimise this diagnosis period, if you have not already switched over to using a SYSMDUMP in your started task JCL it is highly recommended that you do so at the earliest opportunity.

3. What do I do with a SYSMDUMP?

If a problem occurs and a SYSMDUMP is captured you will typically see the following type of messages in your started task’s JESMSGLG output file:

06.13.44 S0919451  SER0953E Task abnormally terminated: Comp=S0C4 Function=SCAN/CONTINUE NSI=15836EE0  
06.14.21 S0919451  IEA995I SYMPTOM DUMP OUTPUT  310                                                    
   310             SYSTEM COMPLETION CODE=0C4  REASON CODE=00000010                                    
   310              TIME=06.13.44  SEQ=07487  CPU=0000  ASID=0167                                      
   310              PSW AT TIME OF ERROR  078D0000   95836EE0  ILC 4  INTC 10                          
   310                ACTIVE LOAD MODULE           ADDRESS=15834B08  OFFSET=000023D8                   
   310                NAME=SERBSAM                                                                     
   310                DATA AT PSW  15836EDA - 9024A0B8  18415830  400C1813                             
   310                AR/GR 0: 00000000/11000000   1: 00000000/5CDDC010                                
   310                      2: 00000000/0000001B   3: 00000000/00000044                                
   310                      4: 00000000/5CDDC010   5: 00000000/00000004                                
   310                      6: 00000000/23CAA000   7: 00000000/23CD8000                                
   310                      8: 00000000/9586C09C   9: 00000000/95837272                                
   310                      A: 00000000/00000000   B: 00000000/15838B18                                
   310                      C: 00000000/23CA9000   D: 00000000/23CB8170                                
   310                      E: 00000000/9586A970   F: 01000002/95834CA2                                
   310              END OF SYMPTOM DUMP                                                                
06.14.21 S0919451  IEA993I SYSMDUMP TAKEN TO CMNSUP.INTL.CMN4.SYSMDUMP.G0111V00                        

Obviously you should start investigation into any such problem by following standard diagnostic procedures:  

  • Check the messages in the SYSLOG and started task output (e.g. the JESMSGLG, SERPRINT, SYSPRINT, etc.).  Could any be related to or even explain the abend?  
  • Were any jobs running at the time that may have triggered the abend?  
  • Consider what has changed in the system or ZMF instance that may explain the abend.  
  • Look up the dump information and any unusual or unexpected messages in the appropriate manuals or web sites, including the Serena Support web site Knowledgebase.

If the issue cannot be satisfactorily diagnosed and resolved after this investigation then open a case with Serena Support.  At this point simply supply a description of the problem, its impact (if known) and the SYMPTOM DUMP information above (i.e. everything between the ‘IEA995I SYMPTOM DUMP OUTPUT’ and the ‘END OF SYMPTOM DUMP’ messages above).  If the issue can be tracked back to a triggering event, for example the execution of a specific batch job, then send this output or information in at this time, too.   

Serena Support will assess the supplied information to see if this is a known issue and provide feedback accordingly.  If the problem cannot be diagnosed at this stage then Serena Support will request that you send us the started task output, the SYSMDUMP dataset and any related output that has not already been supplied (e.g. triggering batch job output), all of which will be required to progress the investigation.

Getting the SYSMDUMP to Serena

To send us the information required to diagnose the problem, please use the following process:

a. Compress the dump file using the PACK option of IBM's TRSMAIN utility.  Sample JCL based on the previous dump example follows:

//(insert jobcard)
//*                                                           
//STEP     EXEC PGM=TRSMAIN,PARM=PACK                         
//SYSPRINT DD   SYSOUT=*                                      
//INFILE   DD   DISP=SHR,DSN=CMNSUP.INTL.CMN4.SYSMDUMP(-1)     
//OUTFILE  DD   DISP=(NEW,CATLG),UNIT=SYSDA,                  
//       DSN=CMNSUP.INTL.CMN4.SYSMDUMP.TRS,                   
//       SPACE=(CYL,(50,50),RLSE)                             

Note: The started task will need to be stopped before the dump file is released.  In this example, the task has already been restarted so we are PACKing the -1 generation of the dataset. If the task had not yet been restarted we would have PACKed the 0 generation of the dump dataset.

This job will create a new dataset named ‘CMNSUP.INTL.CMN4.SYSMDUMP.TRS’ which contains the packed dump records with a fixed length of 1024 bytes.  
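
If you want some reassurance that the PACKed file is usable before sending it, one option is to UNPACK it to a scratch dataset at your own site.  The following is only a sketch: the output dataset name 'CMNSUP.INTL.CMN4.SYSMDUMP.VERIFY' is hypothetical and the SPACE values are simply copied from the sample SYSMDUMP DD statement, so adjust both to your site's standards:

//(insert jobcard)
//*
//* Optional check: UNPACK the .TRS file to a scratch dataset.
//* The .VERIFY dataset name is hypothetical; delete it afterwards.
//UNPACK   EXEC PGM=TRSMAIN,PARM=UNPACK
//SYSPRINT DD   SYSOUT=*
//INFILE   DD   DISP=SHR,DSN=CMNSUP.INTL.CMN4.SYSMDUMP.TRS
//OUTFILE  DD   DISP=(NEW,CATLG),UNIT=SYSDA,
//       DSN=CMNSUP.INTL.CMN4.SYSMDUMP.VERIFY,
//       SPACE=(CYL,(200,100),RLSE)

If this step ends cleanly then the PACKed file was at least readable at your end.  It says nothing about what happens to the file during transmission, which is why the binary transfer rules in the next step still matter.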

b. When the dump was requested, Serena Support will also have supplied credentials to connect to an FTP site.  The PACKed dump file will need to be sent to this site using the supplied credentials.  There are several ways in which this can be achieved.  Some sites may allow transmission to the Serena FTP site directly from their existing mainframe environment.  Others may require that the file be downloaded to the user’s distributed environment and sent via web browser or FTP client software.  However, the important factors here are that:

  • the dataset should always be treated as a binary file.
  • no EBCDIC/ASCII data conversion processing should be executed against it.
  • no Carriage Return or Line Feed (CR/LF) options should be specified during any transmission process.
  • the file should never be renamed to specify a ‘.txt’ or any similar, common desktop file-type suffix.  
  • the data contains fixed, 1024 byte records, if this information is required.

Sample JCL to send 'CMNSUP.INTL.CMN4.SYSMDUMP.TRS' directly from the mainframe to the FTP site and rename it to a file named CMNSUP.INTL.CMN4.SYSMDUMP.TRS (no quote marks) follows:

//(insert jobcard)
//*                                                    
//FTPIBM   EXEC PGM=FTP,REGION=4M                      
//SYSPRINT DD   SYSOUT=*                               
//OUTPUT   DD   SYSOUT=*                               
//INPUT    DD   *                                      
ftp.serena.com                                         
(supplied FTP site userid)                                              
(supplied FTP site password)
binary                                                 
put 'CMNSUP.INTL.CMN4.SYSMDUMP.TRS' +
    CMNSUP.INTL.CMN4.SYSMDUMP.TRS
quit                                                   

Note: Many sites have varying rules and restrictions on the FTP sites that can be reached from both their mainframe and distributed environments.  If in any doubt about your site’s restrictions or requirements then please check with your Systems Programming team who will probably have the most experience in this area.

c. Also send us ALL of the accompanying job or started task output either to the FTP site or by attaching it to the case, if you have not already done so.

d. Let us know once transmission is complete.  Our FTP sites are not actively monitored in the same manner as case updates and attachments.  If you do not tell us you have sent a file to the FTP site then we may never actually know.

e. And please supply the dump size information covered by KB article S133277 on the Serena Support website.  Several times in the past we have been supplied with incomplete dump files from customer sites.  A trait of SYSMDUMPs is that it is not easy to identify when files have been truncated somewhere during the compression or transmission processes, and the size information lets us confirm that the dump we receive is complete.  To obtain it:

  • invoke the Interactive Problem Control System (IPCS) application in your ISPF environment (the option and location will vary from site to site – again, check with your Systems Programmers if in doubt).
  • in IPCS Option 0 (Defaults) enter the Source DSNAME in the following manner:
BLSPSETD ---------------- IPCS Default Values ---------------------------------
Command ===>                                                                   
                                                                               
  You may change any of the defaults listed below.  The defaults shown before  
  any changes are LOCAL.  Change scope to GLOBAL to display global defaults.   
                                                                               
  Scope   ==> LOCAL   (LOCAL, GLOBAL, or BOTH)                                 
                                                                               
  If you change the Source default, IPCS will display the current default      
  Address Space for the new source and will ignore any data entered in         
  the Address Space field.                                                     
                                                                               
  Source  ==> DSNAME('CMNSUP.INTL.CMN4.SYSMDUMP.G0111V00')                     
  Address Space   ==>                                                          
  Message Routing ==>                                                          
  Message Control ==>                                                          
  Display Content ==>                                                          
                                                                               
Press ENTER to update defaults.                                                
                                                                               
Use the END command to exit without an update.                                 
  • press the < enter > key.
  • press the < PF3 > key to return to the IPCS PRIMARY OPTION MENU and select option 1 (Browse).
  • in panel BLSPOPT press < enter > to browse the dump.  The following type of messages will be issued:
 IKJ56650I TIME-03:14:17 AM. CPU-00:00:12 SERVICE-19546281 SESSION-02:46:48 DECEMBER 16,2015   
 BLS18122I Initialization in progress for DSNAME('CMNSUP.INTL.CMN4.SYSMDUMP.G0111V00')     
 BLS18124I TITLE=JOBNAME SERSUPI4 STEPNAME SER4             SYSTEM 0C4                         
 BLS18223I Dump written by z/OS 01.13.00-0 SYSMDUMP - level same as IPCS level                 
 BLS18222I z/Architecture mode system                                                          
 BLS18160D May summary dump data be used by dump access?  Enter Y to use, N to bypass.         
  • type ‘Y’ and press < enter >.
  • once complete, press < PF3 > twice to return to the IPCS PRIMARY OPTION MENU.
  • select option 4 (Inventory).
  • next to your current dump file type the ‘lz’ line command:
BLSPDUIN NTORY - SNEVIN.DDIR -----------------------------------------------------------------------
Command ===>                                                                       SCROLL ===> CSR  
                                                                                                                                   
AC Dump Source                                                       Status                                                        
lz DSNAME('CMNSUP.INTL.CMN4.SYSMDUMP.G0111V00')  . . . . . . . . . . OPEN                                                          
   Title=JOBNAME SERSUPI4 STEPNAME SER4             SYSTEM 0C4                                                                     
   Psym=RIDS/SERBSAM#L RIDS/#UNKNOWN AB/S00C4 VALU/H400C1813 REGS/B1C38 PRCS/00000010                                              
  • press < enter > and the following type of information will be displayed:
BLSPNTRC UT STREAM ------------------------------------------------------------------ Line 0 Cols 1 130 
Command ===>                                                                           SCROLL ===> CSR  
 ******************************************* TOP OF DATA ************************************************
                                                                                                                                   
 Source of Dump                                                            Blocks               Bytes                              
 DSNAME('CMNSUP.INTL.CMN4.SYSMDUMP.G0111V00')  . .  . . . . . . . . . . .  24,858  . . .  103,409,280                              
                                                                                                                                   
   ABSOLUTE                                                                                                                        
     0221D000.:0221DFFF. RECORD(15984) POSITIONS(64:4159) ABSOLUTE                                                                 
     03E04000.:03E05FFF. RECORD(2977:2978) POSITIONS(64:4159) ABSOLUTE                                                             
...

  • send us the first 3 lines of data displayed on this panel, BLSPNTRC.
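
As a rough cross-check on those figures, note that in this example the byte count is simply the block count multiplied by the 4160-byte record length coded on the SYSMDUMP DD statement:

  24,858 blocks x 4,160 bytes per block = 103,409,280 bytes

The same arithmetic can be applied to the copy of the dump that we receive, so a file that was truncated in transit would be expected to show a lower block count at our end than the one that you report.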

4. Common mistakes?

As with any changes to existing functionality supplied in sample JCL, there is always room for the changes to be incorrectly applied to the components that are running in the live environment.  In the context of this particular article, this is the started task JCL running in your system procedure library.  Obviously many different problems are possible when converting your tasks from SYSUDUMPs to SYSMDUMPs.  However, a few seem to have affected several different customers, so we thought it useful to point them out here.

a) Writing SYSMDUMP data to SYSOUT classes

SYSMDUMP data is only useful when written to a dataset.  Writing this data to SYSOUT=* (or any specific SYSOUT class) is of no use to anyone and the data cannot be used for diagnostic purposes.  In fact, it is only liable to get your Operations teams complaining to you when you start filling their spool datasets and output management software with data that is of absolutely no use to anyone.  So, under no circumstances write SYSMDUMP data to a SYSOUT class.  If you do not wish to convert to a dump dataset for any reason then you would be better advised to stick with a SYSUDUMP file in your started task JCL.  

b) Ensure that you code the SYSMDUMP DD statement with DISP=MOD 

As stated at the start of this article, the sample SYSMDUMP file is coded with ‘DISP=(MOD,CATLG,CATLG)’:

//SYSMDUMP  DD DISP=(MOD,CATLG,CATLG),             * SYSMDUMP
//             DSN=somnode.SERCOMC.SYSMDUMP(+1),
//             UNIT=SYSDA,SPACE=(CYL,(200,100),RLSE),                 
//             DCB=(DSORG=PS,RECFM=FBS,LRECL=4160,BLKSIZE=4160)       

However, in several customer sites this has mutated to a ‘DISP=(NEW,CATLG,CATLG)’ or ‘DISP=(,CATLG,CATLG)’ when applied to the started task JCL.  This is not good.  

If the SYSMDUMP dataset is coded with a NEW disposition, only the very last abend in the started task will be captured as the dump processing will continually overwrite the data already resident in the file.  When any abend occurs in the started task it is liable to leave threads or subtasks hanging.  So, when the task is subsequently shut down these threads are particularly prone to other, related abends, such as S33E or S0C1, when they are terminated.  If DISP=NEW has been coded or defaulted, this shutdown abend will overwrite the existing contents of the dump file, obliterating any earlier dumps in the process.  We will therefore have no dump data available covering the earlier abend(s) to assist in diagnosis.

c) Allocate the dump file as a Generation Data Group (GDG) dataset 

We have also seen some customers define the SYSMDUMP dataset as a fixed-name dataset rather than a GDG.  For basically the same reasons as those already covered in the DISP=MOD discussion, doing this is not ideal.  In the activity (a.k.a. chaos) that can ensue after a ZMF abend it would be easy to forget to copy the dump file out to another dataset before restarting the ZMF task.  Obviously making the system available to users again will be the highest priority.  However, at best this could result in the dump file being unavailable to be sent to us for investigation until the task is next stopped.  At worst, you will again risk overwriting the data that it contains before a copy can be taken.  Therefore, it is much safer and will ultimately ease the management of the dataset if it is allocated as a GDG file.
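
If the GDG base does not yet exist, it must be defined before the started task can allocate a new generation of the dump dataset.  A minimal sketch using IDCAMS follows.  The base name is taken from the dump example used in this article, while the LIMIT(5) value and the NOEMPTY and SCRATCH options are assumptions for illustration only; check your site's standards (including any model DSCB or SMS requirements) with your Systems Programming team:

//(insert jobcard)
//*
//* Define the GDG base for the SYSMDUMP dataset.
//* LIMIT(5), NOEMPTY and SCRATCH are illustrative assumptions.
//DEFGDG   EXEC PGM=IDCAMS
//SYSPRINT DD   SYSOUT=*
//SYSIN    DD   *
  DEFINE GDG (NAME(CMNSUP.INTL.CMN4.SYSMDUMP) -
              LIMIT(5)                        -
              NOEMPTY                         -
              SCRATCH)
/*

With a definition like this, the oldest generation is rolled off and deleted once a sixth generation is created.  Bear in mind that a new generation is likely to be created each time the task starts, whether or not a dump is taken, so choose a limit that leaves enough time to copy or send a dump before it is rolled off.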

5. In conclusion

Hopefully this article sheds some light on the reasons for the ZMF Server started task JCL conversion to SYSMDUMPs and/or the procedures required to handle them.  Remember, this change applies to the ZMF Server started task JCL only.  For most other ZMF-related JCL, and unless stated otherwise, SYSUDUMP or SYSABEND dump files are entirely sufficient.  And if you have inadvertently implemented your SYSMDUMPs with any of the common errors documented above, please correct them before they complicate progression of any issues that you may encounter in the future.
