Valued Contributor.

Changing from 256 KB to 512 KB block size, optimizing backups, lots of Poor tapes showing up

I recently upgraded from DP 7 to DP 9, as well as from LTO-4 to LTO-7 tape.

After getting everything migrated and running, I started making changes to optimize the backup speeds. Changing the block size and concurrency significantly increased throughput (from an average of about 50 MB/s to 150 MB/s), but I started having other issues. I immediately began noticing tapes flagged as Poor quality, which had been a rare occurrence before. I currently have two tape libraries, and at any given time about 20% of my tapes show as Poor. These are almost all part of the media pool that is used exclusively for database (SAP) backups.

I'm also starting to have some backups fail with a "writing to block" error. I assume this is due to having some 256 KB backup data still on a tape and then trying to switch to 512 KB. What method should I use to make this transition seamless?

Also, does anyone have a good explanation of concurrency? What should it be set to for, say, a DB backup (a few very large files) versus a FS backup (a lot of very small files)?

Any other advice would be helpful.

Outstanding Contributor.

I'm also starting to have some backups fail with a "writing to block" error. I assume this is due to having some 256 KB backup data still on a tape and then trying to switch to 512 KB. What method should I use to make this transition seamless?

Use object copy to copy all object versions on the media written with the old block size to new media (through a device configured with the new block size), then recycle the old media.
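
In rough outline it looks like this. The object copy itself is simplest to run from the GUI; the omnimm names below are the standard CLI from memory and the pool/medium names are placeholders, so double-check with omnimm -help on your DP 9 Cell Manager:

#!/bin/sh
# Rough sketch only, run on the Cell Manager. Pool and medium names are
# placeholders; omnimm option names are from memory -- verify with omnimm -help.

# 1. See what is sitting in the pool that was written with the 256 KB devices.
omnimm -list_pool "SAP_DB_Pool"

# 2. Run an object copy of all object versions on those media, writing
#    through a device configured with the new 512 KB block size
#    (interactive object copy or a copy specification in the GUI).

# 3. After the copies are verified, recycle the old media so they can be
#    overwritten with the new block size.
omnimm -recycle "SAP_DB_Pool_0001"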

Also, does anyone have a good explanation of concurrency? What should it be set to for, say, a DB backup (a few very large files) versus a FS backup (a lot of very small files)?

Concurrency determines the number of objects that can be written to a device at the same time. You generally set concurrency > 1 when the device cannot reach its full throughput with a single object and assigning additional objects to the device would increase throughput. At a certain point you hit diminishing returns.
There is no specific value I can suggest - you have to benchmark and tune the settings to get the best result. This applies to both databases and filesystems.
Some database integrations always set concurrency = 1 to ensure that each database stream is independently seekable during restore. This is because data records for multiple objects backed up with concurrency > 1 are interleaved on the medium, which precludes independent seeks (the Media Agent reads all of the interleaved streams at once, not each one independently).
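
If it helps to picture the interleaving, here is a trivial, purely conceptual shell sketch (nothing Data Protector specific; in practice the order depends on which agent delivers data first) of how blocks from three objects end up mixed in one write stream when concurrency is 3:

#!/bin/sh
# Conceptual only: with concurrency 3, blocks from three objects are
# interleaved into a single stream on the tape, so restoring "obj2" alone
# still means reading past the obj1 and obj3 blocks around it.
for blk in 1 2 3 4; do
  for obj in obj1 obj2 obj3; do
    printf '%s-block%s ' "$obj" "$blk"
  done
done
echo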

Valued Contributor.

Thank you for that information.

Could too high a concurrency setting potentially be a source of Poor tapes?

Outstanding Contributor.

Could too high a concurrency setting potentially be a source of Poor tapes?

I don't think so. Concurrency is just a construct made possible by Data Protector's format of data on the medium, not a property of the (physical) device.

The Poor medium state is set in response to the Media Agent reporting a medium error while interacting with the device.

You said you observed this Poor state on the pool used for SAP backups. You should look at any SAP sessions that have errors reported by the BMA and see if those errors offer any more insight.
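
If you prefer to pull those messages from the CLI rather than the GUI, something along these lines should work. This is only a rough sketch; the option names are from memory and may differ between Data Protector versions, so check omnidb -help first:

#!/bin/sh
# Rough sketch, run on the Cell Manager. Option names are from memory --
# verify with omnidb -help on your version.
omnidb -session -last 7                        # list sessions from the last 7 days
omnidb -session 2019/05/20-12 -report warning  # messages of warning severity and above
                                               # for one session (ID is an example);
                                               # look for BMA medium/write errors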

Visitor.

Hi,


@JadedS wrote:
I'm also starting to have some backups fail with a "writing to block" error. I assume this is due to having some 256 KB backup data still on a tape and then trying to switch to 512 KB. What method should I use to make this transition seamless?

In my experience (and I collected some of it migrating all sorts of installations from 64 KiB to larger block sizes, including 512 KiB), the transition is seamless out of the box. When an expired medium is overwritten, the block size changes without a hitch. The small set of media in a pool that are still appendable will fail to be appended to, but that just means the tape is ejected, another tape from the pool is loaded - inevitably you will hit an expired, blank, or unprotected tape after a few attempts - and writing continues. The tape that could not be appended to loses its position in the write sequence and will most likely never be appended to again until it expires (it is not marked Poor, though; it just "falls out of fashion" for a while). After expiry it will be seamlessly overwritten like the others before it. It just takes one media retention cycle of normal operations and you are at the new block size.

Regarding the Poor media problem, your observations are atypical IMO. You may run into specific bugs, but they are usually obvious. Like, say, HBAs not allowing larger block sizes (the classic Zero Memory SmartArray glitch). Unix hosts (including Linux) need a special setting to consider themselves allowed to write blocks >256KiB. Larger blocks may stress the SCSI/FC infrastructure more intensely than before (even though the lowered block rate is a net win on most everything else). As to why it happens, you may have to dig further.
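
One quick way to rule out the OS/HBA/drive path is to test 512 KiB blocks outside Data Protector, on a Linux Media Agent host with a scratch tape loaded (the device name is just an example):

#!/bin/sh
# Plumbing test only, outside Data Protector, with a scratch tape loaded.
# /dev/nst0 is an example device name; adjust for your host.
mt -f /dev/nst0 rewind
mt -f /dev/nst0 setblk 0                         # variable block mode
dd if=/dev/zero of=/dev/nst0 bs=512k count=256   # write 256 x 512 KiB blocks
mt -f /dev/nst0 rewind
dd if=/dev/nst0 of=/dev/null bs=512k count=256   # read them back
# If the writes fail with an I/O error here, the limit is in the OS/HBA stack,
# not in Data Protector.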

HTH,
Andre.
