Absent Member..

Got unexpected close from RMA

We have been getting the major error "Got unexpected close from RMA" for a while now, and it is becoming really frustrating. It happens randomly: sometimes at 5%, sometimes at 17%, sometimes at 50%. The error appears when we try to write the data from disk to tape. What could be wrong? We did not change anything in the backup specification.

We are using HP Data Protector 6.20 on Red Hat Linux.

 

Thanks in advance.

5 Replies
Micro Focus Expert

Hi Andrej,

 

Checking the messages file (or the Windows event logs) on the server where the RMA is running could help explain why the Media Agent is closing the connection.

 

You say this is a disk to tape copy job. Are the source and target media agents running locally in the same server? Or is the network involved in this copy?

If the network is involved, you could try running the copy job locally and see if it still fails.

 

If none of this helps, I suggest you open a Data Protector support case, so that debugs from all the agents involved can be analyzed to understand why this is happening.

 

Thanks

Regards

Juanjo

Vice Admiral

Hello Andrej,

 

If you have an xMA unpredictably dropping its socket connection, you could be facing a system resource or network stability issue.

 

I gather from the first response that your issue crops up during a D2T copy. That means that you have RMAs reading the source data and passing it to BMAs writing to the destination media. All the while, there is control communication between the xMAs and the CSM on the cell manager, most of it catalog updates from the BMAs.

 

Are you running all of the agents on the cell manager?  Or do you have a dedicated media server?  Are the RMAs and BMAs running on the same server?  Or are you passing all of that data across a network connection between machines?  Check /var/log/messages for entries that correlate with the times that you've had RMA problems.
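As a rough sketch of that log check, the Python below filters syslog-style lines for entries that might relate to a media agent dying. The keyword pattern and the function name `find_suspect_lines` are illustrative assumptions, not an official list of Data Protector messages; adjust them to whatever your system actually logs.

```python
import re

# Typical syslog prefix: "Nov 9 10:46:14 host rest-of-message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+(?P<msg>.*)$"
)

# Illustrative pattern (an assumption): agent names as whole words,
# kernel segfault reports, processes killed by a signal, memory exhaustion.
SUSPECT_RE = re.compile(
    r"\b(rma|bma)\b|segfault|signal=\d+|out of memory", re.IGNORECASE
)


def find_suspect_lines(lines):
    """Return (timestamp, message) pairs for lines matching SUSPECT_RE."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and SUSPECT_RE.search(m.group("msg")):
            hits.append((m.group("ts"), m.group("msg")))
    return hits


if __name__ == "__main__":
    # Reading /var/log/messages usually requires root.
    with open("/var/log/messages", errors="replace") as fh:
        for ts, msg in find_suspect_lines(fh):
            print(ts, msg)
```

Compare the timestamps it prints with the times shown in the failed session's messages.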

 

Posting a complete set of session messages from a failed session would be helpful.

 

Thanks,

Mr_T

DPTIPS


Absent Member..

Mr_T:

Here is the current error from the log. Please enlighten me as to what is going on.

[root@ret-rh1p log]# tail messages
Nov 9 10:46:13 ret-rh1p xinetd[6008]: START: omni pid=55085 from=::ffff:10.21.3.206
Nov 9 10:46:13 ret-rh1p OB2DBG_StoreOnceSoftware_Debug.txt[64970]: -0800 2016-11-09 10:46:13 StoreOnceSoftware is within the memory capacity limit
Nov 9 10:46:14 ret-rh1p xinetd[6008]: EXIT: omni status=0 pid=55081 duration=2(sec)
Nov 9 10:46:14 ret-rh1p kernel: rma[55085]: segfault at 7fc84ef7c000 ip 00007fc84f4343dc sp 00007ffd98f9de30 error 4 in libserializer_64bit.so[7fc84f408000+52000]
Nov 9 10:46:14 ret-rh1p abrtd: Directory 'ccpp-2016-11-09-10:46:14-55085' creation detected
Nov 9 10:46:14 ret-rh1p abrt[55100]: Saved core dump of pid 55085 (/opt/omni/lbin/rma) to /var/spool/abrt/ccpp-2016-11-09-10:46:14-55085 (80351232 bytes)
Nov 9 10:46:14 ret-rh1p xinetd[6008]: EXIT: omni signal=11 pid=55085 duration=1(sec)
Nov 9 10:46:14 ret-rh1p abrtd: Package 'OB2-MA' isn't signed with proper key
Nov 9 10:46:14 ret-rh1p abrtd: 'post-create' on '/var/spool/abrt/ccpp-2016-11-09-10:46:14-55085' exited with 1
Nov 9 10:46:14 ret-rh1p abrtd: Deleting problem directory '/var/spool/abrt/ccpp-2016-11-09-10:46:14-55085'
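For anyone reading the kernel line above, here is a minimal sketch that pulls the fields out of a Linux `segfault at ... ip ... error ...` message. It assumes the usual x86 page-fault error bits (bit 0: page was present, bit 1: write access, bit 2: user mode) and that the numbers are printed in hex; the function name `decode_segfault` is made up for illustration.

```python
import re

SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Decode a kernel segfault log line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    info = {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "page_present": bool(err & 1),            # bit 0
        "access": "write" if err & 2 else "read",  # bit 1
        "mode": "user" if err & 4 else "kernel",   # bit 2
        "library": m.group("lib"),
        "offset_in_library": None,
    }
    if m.group("base") is not None:
        # Instruction pointer relative to where the library was mapped.
        info["offset_in_library"] = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return info


line = ("Nov 9 10:46:14 ret-rh1p kernel: rma[55085]: segfault at 7fc84ef7c000 "
        "ip 00007fc84f4343dc sp 00007ffd98f9de30 error 4 "
        "in libserializer_64bit.so[7fc84f408000+52000]")
info = decode_segfault(line)
print(info["process"], info["access"], info["mode"], hex(info["offset_in_library"]))
# prints: rma read user 0x2c3dc
```

Read this way, the line says that rma (pid 55085) did a user-mode read of an unmapped address while executing at offset 0x2c3dc inside libserializer_64bit.so. Turning that offset into a function name needs the library's symbols, which is a job for the vendor.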

Absent Member..

You are getting an RMA crash, so I would configure Red Hat to capture the core file and then send it to HP via a support case.
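The messages log in this thread shows abrtd saving the core and then deleting it because the 'OB2-MA' package "isn't signed with proper key". As far as I know, on RHEL 6 that check is controlled by the `OpenGPGCheck` option in `/etc/abrt/abrt-action-save-package-data.conf`; treat both the option and the path as assumptions and verify them against the ABRT documentation for your release. A minimal sketch to see what the file currently says (the helper names are hypothetical):

```python
def read_abrt_option(conf_text, option="OpenGPGCheck"):
    """Return the last value assigned to `option` in conf_text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, val = (part.strip() for part in line.split("=", 1))
        if key.lower() == option.lower():
            value = val
    return value


def keeps_unsigned_dumps(conf_text):
    """True only if the signature check is explicitly switched off."""
    value = read_abrt_option(conf_text)
    # Assumption: when the option is unset, ABRT behaves as if it were "yes".
    return value is not None and value.lower() == "no"


if __name__ == "__main__":
    # Path is an assumption for RHEL 6; adjust if your system differs.
    path = "/etc/abrt/abrt-action-save-package-data.conf"
    with open(path) as fh:
        print("unsigned-package dumps kept:", keeps_unsigned_dumps(fh.read()))
```

If it reports False, cores from third-party packages such as OB2-MA are likely to be discarded again, so change the setting (and restart abrtd) before reproducing the crash.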

Cadet 2nd Class


@GrassHopper wrote:
Nov 9 10:46:14 ret-rh1p kernel: rma[55085]: segfault at 7fc84ef7c000 ip 00007fc84f4343dc sp 00007ffd98f9de30 error 4 in libserializer_64bit.so[7fc84f408000+52000]

Having the RMA segfault in the serializer library (during either copies or restores) for StoreOnceSoftware objects that otherwise pass a Verify is exactly what I had here after upgrading to 9.06, but on Windows. I worked with HPE on a case for that and was finally told the bug in question should be fixed in 9.08. I just upgraded and, indeed, the very same objects that made the RMA crash before are now copied without any fuss. So if you are on 9.06 or 9.06_108 (I don't know about 9.07), upgrading to 9.08 might be a good way to get rid of that crash (provided we are talking about the same root cause).

HTH,
Andre.

The opinions expressed above are the personal opinions of the authors, not of Micro Focus. By using this site, you accept the Terms of Use and Rules of Participation. Certain versions of content ("Material") accessible here may contain branding from Hewlett-Packard Company (now HP Inc.) and Hewlett Packard Enterprise Company. As of September 1, 2017, the Material is now offered by Micro Focus, a separately owned and operated company. Any reference to the HP and Hewlett Packard Enterprise/HPE marks is historical in nature, and the HP and Hewlett Packard Enterprise/HPE marks are the property of their respective owners.