How to detect and avoid tape shoe-shining?

Hi,

 

I have DP 9.00 with an HP StoreEver MSL 2024 G3 library with one LTO-4 drive, connected via SCSI to a Windows 2008 R2 Server. According to the LTO-4 spec, a tape should be fully written (or read) in about two hours, but my backups take much longer. I don't know whether tape shoe-shining can actually be detected or whether it can only be guessed from the backup speed. My test sessions:

 

Session 1
Disk Agent 1 took 217 minutes to write 570GB data, 44 MB / s (disk should do 60 MB/s)
Disk Agent 2 took 66 minutes to write 180GB data, 46 MB /s (disk should do 100 MB/s)
Total time 217 minutes, 750GB

 

Session 2
Disk Agent 1 took 140 minutes to write 700GB, at 70 MB/s (disk should do 200MB/s)

 

Session 3 (combines sources of Session 1 and Session 2)
Omnispeed shows 123 MB/s
..running just now, waiting for result
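
As a rough cross-check of the session figures, throughput can be recomputed from size and duration and compared with the drive's native rate. This is only a minimal sketch, assuming 1 GB = 1000 MB and the published LTO-4 native figures of 800 GB capacity and 120 MB/s; the function names are made up for illustration.

```python
LTO4_NATIVE_MB_S = 120  # published LTO-4 native (uncompressed) transfer rate
LTO4_NATIVE_GB = 800    # published LTO-4 native capacity


def throughput_mb_s(gigabytes, minutes):
    """Average rate in MB/s, assuming 1 GB = 1000 MB."""
    return gigabytes * 1000 / (minutes * 60)


def full_tape_minutes(rate_mb_s, capacity_gb=LTO4_NATIVE_GB):
    """Minutes needed to fill one tape at a constant rate."""
    return capacity_gb * 1000 / rate_mb_s / 60


# (GB written, minutes taken) as reported for each disk agent
sessions = {
    "Session 1, Disk Agent 1": (570, 217),
    "Session 1, Disk Agent 2": (180, 66),
    "Session 2, Disk Agent 1": (700, 140),
}

for name, (gb, minutes) in sessions.items():
    rate = throughput_mb_s(gb, minutes)
    share = rate / LTO4_NATIVE_MB_S
    print(f"{name}: {rate:.0f} MB/s ({share:.0%} of native rate)")

print(f"Full tape at native rate: {full_tape_minutes(LTO4_NATIVE_MB_S):.0f} min")
```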

 

In all cases the disk agents and the media agent are on the same machine, the Cell Manager is on another machine, and CPU load is low.

Does this indicate shoe-shining? And is the solution to run more disk agents (if possible) to feed enough data to the LTO drive?

 

How does HW compression change the picture? Can I see statistics of the compression ratio? E.g. I can have 800GB of data on a tape, but I don't know whether the tape is full or only half filled. I could have a data rate of 60MB/s, but if the data is highly compressible, the actual rate at the tape could be 20MB/s or lower.
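
To make the compression question concrete: with hardware compression the drive writes roughly the host rate divided by the compression ratio to the media, and the amount of tape used shrinks by the same factor. A small sketch, assuming a uniform compression ratio (real data rarely compresses uniformly) and the LTO-4 native capacity of 800 GB; the helper names are hypothetical.

```python
def media_rate_mb_s(host_rate_mb_s, ratio):
    """Rate actually written to tape for a given host rate and compression ratio."""
    return host_rate_mb_s / ratio


def tape_fill_fraction(host_gb_written, ratio, native_gb=800):
    """Fraction of the tape consumed by host_gb_written at the given ratio."""
    return host_gb_written / ratio / native_gb


print(media_rate_mb_s(60, 3))      # 60 MB/s from the host at 3:1 -> 20.0
print(tape_fill_fraction(800, 1))  # incompressible data -> 1.0 (tape full)
print(tape_fill_fraction(800, 2))  # 2:1 compression -> 0.5 (half filled)
```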

 

I would be grateful for any information, I did spend some time googling but found nothing to answer my questions.

 

Kind regards, Jan

  • Verified Answer

    I'd say there is a very good chance that your tape drive is shoe-shining.  This is probably the main cause of performance problems.  I am attaching a PowerPoint presentation that one of my colleagues prepared that explains this in detail.

     

    But the basics are that today's tape drives will typically write much faster than data can be delivered.  Once the drive's buffer runs empty, the drive stops, waits for the buffer to refill, repositions the tape, and starts writing again
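
    The stop/refill/restart cycle can be illustrated with a toy model.  This is only a sketch: the 256 MB buffer size and the function name are assumptions for illustration, the drive is assumed to write at a fixed rate, and real LTO drives also slow the tape down to match the incoming data rate, which this model ignores

```python
def count_repositions(feed_mb_s, drive_mb_s, buffer_mb, seconds):
    """Count how often the drive runs its buffer dry and has to stop.

    Toy model in 1-second steps: the host fills the buffer at feed_mb_s,
    the drive drains it at drive_mb_s while writing, and writing only
    (re)starts once the buffer is full again.
    """
    level = 0.0
    writing = False
    stops = 0
    for _ in range(seconds):
        level = min(buffer_mb, level + feed_mb_s)  # host delivers data
        if writing:
            level -= drive_mb_s                    # drive writes to tape
            if level <= 0:                         # buffer empty: stop, reposition
                level = 0.0
                writing = False
                stops += 1
        elif level >= buffer_mb:                   # buffer refilled: resume
            writing = True
    return stops


# Hypothetical 256 MB buffer, drive writing at the LTO-4 native 120 MB/s
for feed in (45, 70, 120):
    stops = count_repositions(feed, 120, 256, 600)
    print(f"feed {feed} MB/s: {stops} stops in 10 minutes")
```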

     

    How do you overcome this?  By increasing the Concurrency

     

    Data Protector backs up objects, or, in other terms, Mounted File Systems.  Each object is a Data Stream.  You can have multiple, concurrent data streams being written to a device at the same time.  The more streams, the less chance of Shoe-Shining

     

    DP has a default Concurrency of 4.  For LTO-4 and anything newer, I usually recommend a concurrency of 10 or 12 or more.  However, this depends on whether you have 10 or 12 data streams or Objects that can be sent to the tape drive.  If, for example, you are backing up a Windows server containing 3 mount points /C, /D, and /E, plus the /CONFIGURATION object, that is 4 data streams, no matter how high you have concurrency set.  This will probably result in slow tape drive performance

     

    If you add another server, with 4 objects, you have a total of 8 data streams, and you lessen the chance of Shoe-Shining.  You will probably see the backup slow down at the end as the number of remaining data streams drops
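
    The relationship between the concurrency setting and the number of objects can be sketched as below.  The function names and the per-stream rates are made up for illustration and are not Data Protector settings or measurements

```python
def active_streams(concurrency, objects_available):
    """Streams actually feeding the drive: limited by whichever is smaller."""
    return min(concurrency, objects_available)


def aggregate_feed_mb_s(stream_rates, concurrency):
    """Combined feed rate of the streams that fit within the concurrency."""
    return sum(stream_rates[:concurrency])


# One Windows server: /C, /D, /E and /CONFIGURATION -> 4 objects
print(active_streams(12, 4))  # 4, however high the concurrency is set
# A second server with 4 more objects
print(active_streams(12, 8))  # 8

# Hypothetical rates: each object can deliver 30 MB/s
print(aggregate_feed_mb_s([30] * 4, 12))  # 120
print(aggregate_feed_mb_s([30] * 2, 12))  # 60, e.g. near the end of the session
```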

     

    Since, by default, Restore concurrency is set to 1, an increased backup concurrency will result in longer restore times

     

    If you had a Virtual Tape Library, where you write to disk, set the concurrency to 1, since there is no issue with tape drive streaming

    Backup Performance brownbag August 2008 20080828.zip
  • Thank you for the presentation, description and suggestions, it helped a lot.

    As for the restore concurrency and the longer restore times for backups made with increased concurrency, I assume the reason is that the data is scattered over a longer part of the tape.

    Also, during a restore, will the tape always be shoe-shining unless (roughly) a single backup stream is being restored?
  • When I wrote my comments about shoe-shining, I was referring to 'write' activity only.  I can't say whether the same is true for a Restore, which is a 'read' activity

     

    When you do a Restore, the 'mpos' information is read from the IDB.  This is the position on the tape of the file you are trying to restore.  This is sent to the tape drive which moves the tape to the proper segment header, and the file is located from here.  Because the entire tape may have to be read, this can take some time

     

    The higher the concurrency, the more data is interleaved onto the tape, and the longer it will take to find it
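
    As a rough illustration of why interleaving stretches a restore, the sketch below lays out blocks in strict round-robin order and measures how much tape one object spans.  Round-robin is an assumption made for simplicity; on a real tape the mix depends on how fast each stream delivered its data

```python
def interleave(streams, blocks_per_stream):
    """Round-robin tape layout: one block from each stream in turn."""
    return [s for _ in range(blocks_per_stream) for s in range(streams)]


def blocks_spanned(layout, stream):
    """Number of tape blocks between the first and last block of one stream."""
    positions = [i for i, s in enumerate(layout) if s == stream]
    return positions[-1] - positions[0] + 1


# One object of 100 blocks, written at different backup concurrencies
for concurrency in (1, 4, 12):
    layout = interleave(concurrency, 100)
    span = blocks_spanned(layout, 0)
    print(f"concurrency {concurrency}: object spans {span} blocks")
```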

     

    If doing a lot of Restores, a low concurrency, maybe even '1', can be used, as long as the effects on Backup performance are understood.  It all depends on which is more important

  • IMHO, shoe-shining during a restore depends on what you restore and how the data is interleaved on the tape. For large files, a low backup concurrency might in fact result in longer restore times: the restore location might not be able to accept data as fast as the tape drive delivers it, so the tape will stop, rewind and start again, the same way as during a tape write.

     

    Kind regards, Jan
