The problem appears mostly with incremental backups:
[Critical] From: VEPALIB_VMWARE@backuphost1 "" Time: 04/15/2020 03:49:32 AM
Backup of object failed.
[Major] From: BSM@cellmanager "backup1" Time: 4/15/2020 3:49:33 AM
[61:1005] Got unexpected close from OB2BAR Backup DA on backuphost1
Whenever we get an “unexpected close”, we should suspect that a process has crashed. In this case, vepa_bar is the one that crashed.
A crash can have various causes. Here, however, we focus on the resources available on the backup host.
I noticed that the failed session in the report is an incremental backup, which can sometimes stress the CPU and other resources while calculating the CBT (Changed Block Tracking) data.
However, in some special cases we cannot fully tell where the problem comes from, DP or not DP.
So we need to isolate the possible causes step by step.
On the DP side, there are several tuning parameters for this module. Probably the most relevant one in omnirc is:
# OB2_VEAGENT_DISK_CONCURRENCY=<vm disk concurrency>
# Default: 10
# This setting is used to calculate a safe number of disk threads to be started
# in parallel. This is the total number of disks that can be run in parallel at any
# given time. The disks selected are from any of the running VM threads.
# Number of disk threads = VM disk concurrency - 10% (but at least one thread)
# Each of the disk thread creates the connection to the vCenter apart from the running
# VM thread. Also take into account that each vSphere Client opens a connection too.
After lowering the disk concurrency (for example, OB2_VEAGENT_DISK_CONCURRENCY=5) on both backup hosts, the problem went away.
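To get a feel for what the formula in the omnirc comment means for the values discussed here, the following is a minimal sketch of the calculation "VM disk concurrency - 10% (but at least one thread)". The function name disk_threads is hypothetical, and rounding down is an assumption: the omnirc comment does not say how fractional results are handled, so the real agent may round differently.

```python
def disk_threads(vm_disk_concurrency: int) -> int:
    """Estimate the number of parallel disk threads.

    Follows the omnirc comment: concurrency minus 10%, but at least one thread.
    Rounding down is an assumption; the comment does not specify it.
    """
    if vm_disk_concurrency < 1:
        raise ValueError("OB2_VEAGENT_DISK_CONCURRENCY must be at least 1")
    # 90% of the configured value, using integer arithmetic (rounds down).
    reduced = (vm_disk_concurrency * 9) // 10
    # The comment guarantees at least one disk thread.
    return max(1, reduced)


if __name__ == "__main__":
    for value in (10, 5, 1):
        print(f"OB2_VEAGENT_DISK_CONCURRENCY={value} -> {disk_threads(value)} disk thread(s)")
```

Under these assumptions, the default of 10 gives 9 disk threads and a value of 5 gives 4. Since each disk thread opens its own connection to the vCenter, this roughly halves the number of parallel connections and the load on the backup host.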