Data Operations: Data Transfers and Integrity


Task description

Data Transfers and Integrity is one of the five subtasks of the Data Operations project. It is responsible for the timely and correct distribution of all official CMS data. Because the CMS computing model is distributed by design, this task is of very high importance. The common large data transfers include:

  • from Tier0 to all Tier1
    • the raw detector data
    • the first pass reconstruction data (RECO and possibly AOD)

  • from Tier1 to Tier1
    • all AOD data created in any reprocessing effort at the Tier1 centers
    • RECO data for officially requested data duplication

  • from Tier1 to Tier2
    • AOD or RECO data on request of a Tier2 center

  • from Tier2 to Tier1
    • SIM or RECO data of official Monte Carlo production
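
The routes listed above can be sketched as a small lookup table, for example to check whether a requested transfer matches one of the common patterns. This is only an illustration: the names `COMMON_ROUTES` and `is_common_transfer`, the short tier labels, and the use of `RAW` as a label for the raw detector data are assumptions made for this sketch and do not come from any CMS tool.

```python
# Hypothetical encoding of the common transfer routes listed above.
# Keys are (source tier, destination tier); values are the data tiers
# usually moved along that route. "RAW" stands for the raw detector data.
COMMON_ROUTES = {
    ("T0", "T1"): {"RAW", "RECO", "AOD"},
    ("T1", "T1"): {"AOD", "RECO"},
    ("T1", "T2"): {"AOD", "RECO"},
    ("T2", "T1"): {"SIM", "RECO"},
}


def is_common_transfer(source_tier, dest_tier, data_tier):
    """Return True if the data tier is usually moved along this route."""
    return data_tier in COMMON_ROUTES.get((source_tier, dest_tier), set())


if __name__ == "__main__":
    print(is_common_transfer("T0", "T1", "RAW"))
    print(is_common_transfer("T2", "T1", "AOD"))
```

A transfer that falls outside the table is not necessarily wrong; the table only reflects the common cases named in the list.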

Apart from the pure distribution, it has to be ensured that the transfers complete successfully, meaning that the files are not only at the foreseen location but are also properly readable. Checksum tools can be used to verify the integrity of the data. Because those tools can consume a large amount of resources, it will be important to define the policy under which the integrity checks are run automatically. The complex nature of the data locations means that a large fraction of the work will also lie in proper bookkeeping and documentation.
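
As an illustration of a checksum-based integrity check, the following minimal sketch computes a checksum of a file in fixed-size chunks and compares it with an expected value. The choice of Adler-32, the function names `adler32_of_file` and `replica_is_intact`, and the hexadecimal format of the expected value are assumptions made for this example; the text above does not specify which checksum tools or catalogue formats are actually used.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Compute the Adler-32 checksum of a file, reading it in chunks.

    Reading in chunks keeps memory use bounded for large files.
    The result is returned as an 8-character lowercase hex string.
    """
    checksum = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")


def replica_is_intact(path, expected_hex):
    """Compare the file's checksum with an expected value (assumed hex)."""
    return adler32_of_file(path) == expected_hex.lower().zfill(8)
```

A check like this reads every byte of the file, which is why it can be expensive on large datasets and why the policy for when to run it matters.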

Before you start

Operators must meet certain requirements before they can begin, such as obtaining certificates and login privileges. Please follow the instructions on DataTransferOperatorStartingRequirements before starting.

Data Transfers and Integrity Operations

The main task for operations is to ensure that data is transferred to the desired sites in a timely fashion, and that the data remains intact while it resides at the sites. A set of instructions for completing this task is presented in the MonitoringGuide.

The dedicated Data Transfer Troubleshooting Guide lists possible transfer error symptoms, their possible causes, and solutions.

Here are a few related links:

Open Issues

Transfer Monitoring

Last 24 Hours Last 132 Hours
CERN → T1 Statistics Volume Queued
T1 → CERN
T1 [not CERN] → T1 [not CERN]

Links for the working expert

Web Tools Lemon SLS FTS
DBS Discovery PhEDEx SiteDB host user disk pool FTS monitor
TWiki TWiki TWiki vocms01 vocms20 cms001 phedex t0export t1transfer TWiki
  queries           default cmscaf quality map

CVS Monitors IT TWiki Other Links
PhEDEx SITECONF cacti GridView Service Status Board DDT FacilitiesOps Castor Savannah GGUS
status CCRC08

FTS monitoring

FTS server FTS Monitor
KIT FTS Monitor
RAL FTS Monitor

Data consistency

Data availability

Here you can find a brief description of the tools needed to mark blocks as inaccessible to CRAB.

Current Campaigns

Past Campaigns

Tools for Administrators

DataOps presentations

PADA Meeting
2008 April 17
WLCG Service Reliability Workshop
2007 November 26
Data Operations Planning
2007 August 10
Data Operations Meeting
2008 April 01 08 15 22 29
March 04 11 18 25  
February 05 12 19 26  
January   08 15 22 29
2007 December 04 11 18    
November 06 13 20 27  
October 02 09 16 23 30
September 04 11 18 25  
August 07 14 21 28  
July 03 10 19 24 31
June 06 12 19 26  
May   08 15 21 29


Castor AT, castor-operations AT
CERN Users' Office
FTS fts-support AT
Operations and Problem Tracking (OPT)
PhEDEx cmslcgco03, cmsdoc AT (gilles.raymond AT
Facilities Management AT (77777)
CCRC'08 Critical Services (CASTOR+LFC+FTS) cms-operator-alarm AT

For New Sites

New Sites
set up Site Admin
registration Data Operations

Primary Dataset Tier-1 Central Distribution


Weekly Reports on Consistency

Weekly Reports on Datasets with No Replica
Weekly Reports on Datasets with Replicas but No Custodial Subscription
Cleanup of test datasets
Weekly Reports on production Datasets

Data Deletion / Deprecation

Delete 1_5 and 1_6 List
Deletion List for CSA07 datasets
Deletion List for CSA08 datasets
Deletion List for CMSSW_1_X_X datasets
Deletion List for Summer08 datasets
Deletion List for Commissioning08 processed datasets
Deletion List for Craft09 processed datasets
Deletion List for BeamCommissioning09 processed datasets
Deletion List for Beam09 Monte Carlo datasets
Deletion List for 2009 pre-production datasets
Deletion List for 2010 pre-production datasets
Deletion List for 2009 Summer09 10TeV datasets
Deletion List for 2009 Summer09 7TeV RECO and AOD, Spring 10 and Summer 10 RAW datasets from S09 production
Deletion List for rereco and skimming samples
Deletion List for Fall 2010 Raw samples
Deletion List for Fall 2008, Summer 2008, Winter 2009 MC samples
Deletion List for Summer 2009, Spring 2010, Summer 2010 GEN-SIM-RAW
Deletion List for Spring 2011 GEN-SIM
Deletion obsolete RECO samples
Deletion obsolete 8 TeV
Deletion TestEnables
Deletion List for Summer11 v1 buggy datasets
Deletion RAW2010NC
Deletion Old Relval
Deletion old MC July 2011
Deletion FNAL NC samples
Deletion 02July ReReco
Deletion Campaign for Summer 2009, Spring 2010, Summer 2010, Fall10, Winter10 MC
Deletion Campaign for Spring11 MC
Deletion Campaign for 2010-2011 data samples
Deletion Campaign for Summer and Fall 2011 GEN-SIM-RECO and RAW samples




Topic revision: r179 - 2012-10-03 - MingmingYang
