PolarnPer

StoreOnce 5100 goes offline / disabled when running a certain backup job.

Two different jobs on my daily schedule have now started to fail with the messages below.

This is during the night when other backups are also running.

When I try the same job during the daytime, manually or on a schedule, the backup runs through fine. All the settings are the same, of course.

The StoreOnce device works OK for other jobs during the same time...

I also noticed the Load Balancing message, but the setup is similar to other jobs which are running fine. I have also tried to change the number of devices, but without any success.

The number of concurrent Copy jobs / inbound / outbound on the StoreOnce 5100 is set to 320, and the number of jobs I'm running at any time should not be anywhere near that limit.

I guess my question is: Why does it fail sometimes, but not when I run it at other hours while trying to troubleshoot?

 

Error session details:

[Major] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:08 PM
 Got error: "StoreOnce error: StoreOnce device offline, network error occurred or secure communication failed while contacting the StoreOnce device" when contacting "5100_CoFC:Catalyst_01" B2D device!

[Major] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:45 PM
 Got error: "StoreOnce error: StoreOnce device offline, network error occurred or secure communication failed while contacting the StoreOnce device" when contacting "5100_CoFC:Catalyst_01" B2D device!

[Warning] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:45 PM
 Device cellserver:5100_CoFC:Catalyst_01_gw is disabled and will not be used.

[Warning] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:45 PM
 Number of devices used is smaller than
 MIN value of load balancing.

[Critical] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:45 PM
 None of the Disk Agents completed successfully.
 Session has failed.
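
Since the failures only show up at night, it can help to pull the timestamps and severities out of the session messages and line them up against whatever else was running. The following is a minimal sketch, assuming the messages always follow the header layout quoted above (`[Severity] From: ... "session name"  Time: ...`); the function name `parse_messages` and the `SAMPLE` constant are illustrative, not part of any Data Protector tool.

```python
import re
from datetime import datetime

# Header layout as seen in the session messages quoted above:
# [Severity] From: AGENT@host "session name"  Time: M/D/YYYY H:MM:SS AM|PM
HEADER = re.compile(
    r'^\[(?P<severity>\w+)\] From: (?P<source>\S+) "(?P<spec>[^"]*)"\s+'
    r'Time: (?P<time>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)$'
)

def parse_messages(text):
    """Return one dict per message: severity, source, spec, time, body."""
    messages = []
    for line in text.splitlines():
        stripped = line.strip()
        match = HEADER.match(stripped)
        if match:
            entry = match.groupdict()
            entry["time"] = datetime.strptime(entry["time"], "%m/%d/%Y %I:%M:%S %p")
            entry["body"] = ""
            messages.append(entry)
        elif messages and stripped:
            # Continuation lines belong to the most recent header.
            messages[-1]["body"] = (messages[-1]["body"] + " " + stripped).strip()
    return messages

# Three of the messages from the failed session, copied from the post.
SAMPLE = '''[Warning] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:45 PM
 Device cellserver:5100_CoFC:Catalyst_01_gw is disabled and will not be used.

[Warning] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:45 PM
 Number of devices used is smaller than
 MIN value of load balancing.

[Critical] From: BSM@cellserver.domain.net "Client_File_Daily_Incr_Weekly_Incr9_Monthly_Full_5_Weeks"  Time: 2/22/2017 10:30:45 PM
 None of the Disk Agents completed successfully.
 Session has failed.
'''

for message in parse_messages(SAMPLE):
    print(message["time"].isoformat(), message["severity"], message["body"])
```

Running the same parser over the messages of every session from the backup window would give a single timeline of when the device was first reported offline.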

 

Success session message:


COMPLETED Media Agent "cellserver:5100_CoFC:Catalyst_01_gw [GW 23932:3:7110645425682409830]"

 

antaln

Re: StoreOnce 5100 goes offline / disabled when running a certain backup job.

It's hard to tell from this output, but it's clearly a transient issue.

Did you make sure that the StoreOnce connectivity from the gateway host is not disrupted during the night?
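
One way to check this is to leave a small probe running on the gateway host overnight and compare its log with the failure times. The sketch below only tests whether a TCP connection can be opened; it says nothing about Catalyst-level health. The host name is hypothetical, and port 9387 is an assumption (commonly cited as the Catalyst command port) that should be verified against the appliance documentation.

```python
import socket
import time
from datetime import datetime

def probe(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False

def watch(host, port, interval=60.0, rounds=None):
    """Probe repeatedly and print a timestamped line for every attempt.

    rounds=None means run until interrupted.
    """
    count = 0
    while rounds is None or count < rounds:
        status = "ok" if probe(host, port) else "FAILED"
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {status}", flush=True)
        count += 1
        if rounds is None or count < rounds:
            time.sleep(interval)

# Example call (hypothetical host name, assumed port):
# watch("storeonce.example.net", 9387, interval=60.0)
```

If the probe keeps reporting "ok" at the moment a session logs the device as offline, plain network reachability is less likely to be the cause.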

Luc Minnaert

Re: StoreOnce 5100 goes offline / disabled when running a certain backup job.

Hello,

You must be hitting an appliance limit.

So start by limiting some jobs by introducing limits at the store/gateway/... level.

This might negatively impact your backup performance, but it will prevent the device from "going offline". Afterwards, once you have identified your "big consumers" from the session reports, you can raise the limits again,

and via this detour you might find out what's causing the issue.

Luc Minnaert

PolarnPer

Re: StoreOnce 5100 goes offline / disabled when running a certain backup job.

I don't think there are any network issues, although I'm not saying it can't have anything to do with it.

The Cell and Client are on the same VLAN, but the StoreOnce is on another. There's only a 1 Gbit NIC on the Client.

Will check the logs in the switches and have a look at the load there.

Thanks for the advice.

Luc Minnaert

Re: StoreOnce 5100 goes offline / disabled when running a certain backup job.

Sorry, I was not talking about a network issue.

Please re-read my previous post.

If you go beyond the maximum values of your appliance (in general, per store, per gateway, maximum number of logins per node, etc.),

you might see this behaviour, which DP sees as a lost connection.

Luc

PolarnPer

Re: StoreOnce 5100 goes offline / disabled when running a certain backup job.

Hi Luc,

I have checked the total number of concurrent devices used against this StoreOnce, and I have 59 possible connections in total. However, these don't occur simultaneously.

The limit on the StoreOnce seems to be 320 concurrent jobs and Copy jobs, but I don't know if there are some other limitations I'm not aware of. The limit for Copy jobs is set to 32, and there might be something there?

I will try to decrease the number of devices used around the time this job runs and see if that helps.
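
To see how many devices are actually in use at the same moment, rather than the 59 that are possible in total, the start and end times of the nightly sessions can be swept in time order. This is a rough sketch only: the session times and device counts below are invented for illustration and would have to be replaced with figures taken from the real session reports.

```python
from datetime import datetime

def peak_concurrency(sessions):
    """sessions: iterable of (start, end, devices).

    Returns (peak, time_of_peak), where peak is the highest number of
    devices in use at the same time.
    """
    events = []
    for start, end, devices in sessions:
        events.append((start, devices))   # devices taken at session start
        events.append((end, -devices))    # devices released at session end
    # Sort by time; at equal times, releases (negative) come before new use.
    events.sort(key=lambda event: (event[0], event[1]))
    current = peak = 0
    peak_time = None
    for when, delta in events:
        current += delta
        if current > peak:
            peak, peak_time = current, when
    return peak, peak_time

def at(hour, minute):
    """Helper for the made-up example night below."""
    return datetime(2017, 2, 22, hour, minute)

# Hypothetical sessions: (start, end, number of devices used).
SESSIONS = [
    (at(21, 0), at(23, 0), 8),
    (at(22, 0), at(22, 45), 12),
    (at(22, 30), at(23, 30), 6),
    (at(22, 45), at(23, 59), 4),
]

peak, when = peak_concurrency(SESSIONS)
print(f"peak of {peak} devices at {when:%H:%M}")
```

The resulting peak can then be compared against each configured limit separately (for example the 320 and 32 mentioned above), since any one of them could be the one being hit.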

Thank you for your contribution.

Kind regards

/ Per

 

Luc Minnaert

Re: StoreOnce 5100 goes offline / disabled when running a certain backup job.

There are lots of limits you might hit (check the release notes of the FW your appliance is running),

but another thing which might lead to similar behaviour is quotas.

Did you put quotas on some stores?

Luc

PolarnPer

Re: StoreOnce 5100 goes offline / disabled when running a certain backup job.

Hi Luc,

There are no limits related to this that I could find in the release notes of FW 3.15.1 that I'm running.

There are also no quotas set in DP.

Yesterday I decreased the number of Load Balancing (Min / Max) devices a bit, both for this specific job and for other jobs running during the same time period, and last night the job actually went through!

What I did find though, when looking through all the gateways, was that under this gateway - Settings - Advanced - Sizes, the Segment Size (MB) was different on this gateway compared to the others used towards this store.

This has however not affected any other backup job that has used the same gateway.

I just set the same Segment Size on the gateway, so all gateways now use the same size (10 000 MB).
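
With more than a handful of gateways, a mismatch like this is easier to spot if the settings are written down and compared mechanically. A small sketch, assuming the values have been noted by hand from each gateway's settings: the gateway names and the 2048 value are hypothetical, and only the 10 000 MB figure comes from this thread.

```python
from collections import Counter

def find_outliers(settings):
    """settings: {gateway_name: value}.

    Returns the gateways whose value differs from the most common one.
    """
    if not settings:
        return {}
    common, _count = Counter(settings.values()).most_common(1)[0]
    return {name: value for name, value in settings.items() if value != common}

# Segment Size (MB) per gateway; names and the odd value are made up.
SEGMENT_SIZE_MB = {
    "gw_a": 10000,
    "gw_b": 2048,
    "gw_c": 10000,
}

print(find_outliers(SEGMENT_SIZE_MB))
```

The same comparison works for any other per-gateway setting that is expected to be identical across gateways pointing at the same store.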

Even though I can't really find any reason why it hit this kind of limit (unspecified so far), it seems to be there and causing issues. The job has only gone through once so far, so I hope it will continue to do so. Otherwise I know where to look to troubleshoot further now! 🙂

Thank you both for your inputs!

Kind regards

/ Per

 

 

 
