Buffered tape marks in CASTOR

Introduction

“Buffered” (non-synchronizing) tape marks have been identified as an alternative to introducing a new CASTOR tape format. Writing a buffered tape mark does not force the drive to flush its buffer and stop the tape (the tape equivalent of fsync on disk), which increases throughput and reduces tape wear (“shoe-shining”). Buffered tape marks have been part of the SCSI standard since SCSI-2 (via the ‘immed’ bit in the 0x10 WRITE FILEMARKS CDB) and are supported by standard drives, including those from Sun and IBM (both enterprise and LTO). The performance benefit is substantial: every file can still be delimited by tape marks, while the files are only committed (with a synchronizing tape mark) after, for example, N files and/or N GB of data have been written. A further advantage of buffered tape marks is that the format of the data written to tape does not change.
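
As an illustration of the ‘immed’ bit mentioned above, the following Python sketch builds the 6-byte CDB for opcode 0x10. The helper name is hypothetical, and the layout assumed is the standard 6-byte form: opcode, a flags byte whose bit 0 is the immediate bit, a 3-byte big-endian tape mark count, and a control byte. The code only builds the bytes; it does not talk to a drive.

```python
WRITE_FILEMARKS = 0x10  # SCSI opcode of the 6-byte WRITE FILEMARKS command
IMMED = 0x01            # bit 0 of byte 1: return before the data is on tape


def write_filemarks_cdb(count=1, immediate=False):
    """Build a WRITE FILEMARKS(6) CDB (hypothetical helper, illustration only)."""
    if not 0 <= count <= 0xFFFFFF:
        raise ValueError("filemark count must fit in 24 bits")
    flags = IMMED if immediate else 0
    # Bytes 2-4 carry the number of filemarks, most significant byte first.
    return bytes([WRITE_FILEMARKS, flags]) + count.to_bytes(3, "big") + b"\x00"


buffered = write_filemarks_cdb(1, immediate=True)
synchronizing = write_filemarks_cdb(1, immediate=False)
print(buffered.hex(), synchronizing.hex())
```

The two commands differ in a single bit, which is why no change to the on-tape format is involved.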

  • Support for buffered tape marks had to be added to the Linux kernel, as neither the default Linux (“st”) driver nor the IBM tape driver exposed them. DSS/TAB contacted the Linux tape driver maintainer (Kai Makisara), who at our request produced a patched scsi/st driver supporting buffered tape marks via ioctl(). The patch defines a new operation (MTWEOFI) that sets the immediate bit in the 0x10 SCSI CDB. It has been tested on the CERN tape drives and is distributed via Scientific Linux CERN 5.
  • This new operation has also been integrated into the mainline Linux kernel as of version 2.6.37. See the “kernel diff” entry under Links for details of the kernel patch.
  • We have modified and tested the CASTOR remote copy daemon (RTCPD) to take advantage of buffered tape marks via a non-default configuration option (see the “SVN” entry under Links). If this option (TAPE BUFFER_TAPEMARK) is set to YES, RTCPD buffers the two tape marks between the AUL header and the payload and between the payload and the trailer, and writes a single synchronizing tape mark after the file trailer. The changes required in RTCPD to use the new ioctl operation are very simple, and the CASTOR file format is not changed in any way, so backwards compatibility is fully preserved.
  • In January 2012, we extended the buffering to span multiple CASTOR files, so that a synchronizing tape mark is written only after a given threshold (in terms of volume, time, or number of files) has been reached. Files to be migrated remain buffered on the disk pools until a synchronizing tape mark has been successfully written.
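
The per-file sequence described above (two buffered tape marks, then one synchronizing tape mark) can be sketched in Python as follows. This is not the RTCPD code. The operation codes are those of the Linux <linux/mtio.h> header; the MTIOCTOP request number assumes the ioctl encoding used on x86 and x86-64, and the function names and the injectable write and ioctl parameters are hypothetical, added so that the sequence can be exercised without a tape drive.

```python
import fcntl
import os
import struct

MTIOCTOP = 0x40086D01  # _IOW('m', 1, struct mtop), assuming x86/x86-64 encoding
MTWEOF = 5             # write filemark(s) and wait until they are on tape
MTWEOFI = 35           # write filemark(s) in immediate (buffered) mode


def tape_mark(fd, immediate, ioctl=fcntl.ioctl):
    """Issue one tape mark on an open st device; returns the operation used."""
    op = MTWEOFI if immediate else MTWEOF
    # struct mtop { short mt_op; int mt_count; } in native layout (8 bytes).
    ioctl(fd, MTIOCTOP, struct.pack("hi", op, 1))
    return op


def write_aul_file(fd, header, payload, trailer,
                   write=os.write, ioctl=fcntl.ioctl):
    """Write header, payload and trailer, synchronizing only after the trailer.

    Simplification: each part is written with a single write() call, whereas
    a real tape writer would write fixed-size blocks.
    """
    ops = []
    write(fd, header)
    ops.append(tape_mark(fd, True, ioctl))    # header/payload: buffered
    write(fd, payload)
    ops.append(tape_mark(fd, True, ioctl))    # payload/trailer: buffered
    write(fd, trailer)
    ops.append(tape_mark(fd, False, ioctl))   # after trailer: synchronizing
    return ops
```

With a real drive, fd would be a file descriptor opened on a non-rewinding st device; passing stand-in write and ioctl callables instead allows the order of operations to be checked on any machine.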

Performance improvements

Drive performance (as measured on T10KB drives)

Figure (buftm-performance.png): drive speed (MB/s, vertical axis) as a function of file size (MB, horizontal axis).

  • 3 sync/file: all three tape marks are synchronizing (CASTOR up to 2.1.9-8).
  • 1 sync/file: only the third tape mark is synchronizing; the other two are buffered (CASTOR 2.1.9-9 and newer).
  • 1 sync/4GB: all tape marks are buffered, except for one synchronizing tape mark every N GB of data (e.g. N = 4, 6, ...) (CASTOR 2.1.11 as of January 2012).
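
To make the difference between the three modes concrete, the sketch below counts how many synchronizing tape marks each mode issues for a given list of file sizes. It is a simplified model, not CASTOR code: the function name is hypothetical, only a volume threshold is modelled (4 GB is taken as 4000 MB), and a final synchronizing tape mark is assumed at the end of the session so that no file is left uncommitted.

```python
def count_sync_marks(file_sizes_mb, mode, threshold_mb=4000):
    """Count synchronizing tape marks for a list of file sizes in MB."""
    if mode == "3 sync/file":
        return 3 * len(file_sizes_mb)
    if mode == "1 sync/file":
        return len(file_sizes_mb)
    if mode != "1 sync/N GB":
        raise ValueError("unknown mode: " + mode)
    syncs = 0
    pending_mb = 0      # data written since the last synchronizing tape mark
    for size in file_sizes_mb:
        pending_mb += size
        if pending_mb >= threshold_mb:
            syncs += 1
            pending_mb = 0
    if pending_mb > 0:
        syncs += 1      # assumed final sync so that the tail is committed
    return syncs


files = [100] * 1000    # hypothetical workload: 1000 files of 100 MB each
for mode in ("3 sync/file", "1 sync/file", "1 sync/N GB"):
    print(mode, count_sync_marks(files, mode))
```

For this hypothetical workload the model gives 3000, 1000 and 25 synchronizing tape marks respectively, which shows why small files benefit most from the threshold-based mode.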

Repack performance

Figure (image010.png): drive-days needed to repack the CASTOR file space.

  • AUL == 3 sync/file (see above)
  • ALC == 1 sync/file
  • ALB == 1 sync/N GB

Links

Link                                              Description
Tape_dev_performance.pptx                         Presentation on tape performance improvements for Repack
performance-buffered-filemarks-07032010-IBM.XLSX  Spreadsheet with overall performance for IBM and Sun drives
overall-performance-gains-SUN-vs-IBM-042010.xlsx  Another spreadsheet with detailed measurements
performance                                       Presentation to the CASTOR F2F meeting with complementary performance measurements
tape labels                                       AUL tape label specification used at CERN
SVN                                               Code modifications to RTCPD for supporting buffered tape marks
mtweofi.patch                                     Patch to scsi/st.c for supporting buffered tape marks via a new MTWEOFI operation
kernel diff                                       Actual diff for the new MTWEOFI operation as integrated into the official Linux kernel tree

-- GermanCancio

Topic revision: r17 - 2012-02-28 - GermanCancio
 