NikhefDiskResources

This is the LHCb Nikhef group page describing locally available disk space


(Images: old tape banks from the dark ages ;) --> a modern tape storage solution.)

Where to store what?

| Where | What? | Size | Backed up? | Staging | Path |
| Nikhef home directory | Anything you like; it's your home directory | 1 GB | Yes | - | ~ |
| AFS home directory | Ganga job repository, critical code you're working on, all other small files | 10 GB | Yes | - | /afs/cern.ch/user/<a>/<another> |
| AFS scratch directory | Ganga job workspace, temporary ntuples, non-critical code | >1 GB | No | - | /afs/cern.ch/user/<a>/<another>/scratch* |
| AFS work directory | Ganga workspace, temporary files downloaded from the grid, large compiled binaries | 100 GB | No | - | /afs/cern.ch/work/<a>/<another> |
| Local disk | It is not advisable to keep anything on your local disk; you manage this yourself | >100 GB | No | - | /tmp/ |
| Group data | Ganga job workspace, temporary ntuples, temporary DSTs | 10 TB | No | - | /data/bfys/ |
| Group project | Ganga job repository, critical code, personal recompilations of software | 2 TB | Yes | - | /project/bfys/ |
| User castor | Backup tars, large ntuples to share, old ntuples | ?? TB | Yes | Tape | /castor/cern.ch/user/<a>/<another> |
| Grid castor | Ntuples from the grid, selected DSTs | 2 TB | Yes | Disk | /castor/cern.ch/grid/lhcb/user/<a>/<another> |

  • The non-backed-up spaces are volatile; do not use them for critical items.
  • The individual elements are described below.

Personal Nikhef Resources:

  • Home directory
    • The default quota for your home directory is 1 GB, for whatever you want to store there
    • To see your current usage, call quota
    • /user/<uname>

  • Local Disk
    • Different sizes for different machines.
    • In general we do not recommend using the local disk heavily: although you have a lot of space there, it is tough to collaborate with, and it is generally not backed up
    • Mostly this means using it only for temporary files, stored in /tmp/<uname>

Personal CERN Resources:

  • What is AFS?
    • "Andrew File System,"
    • a distributed, authenticated, globally accessible, POSIX-compliant file system
    • the chosen technology for most disk space at CERN
    • AFS is split into cells by site, and is authenticated using the Kerberos token system
    • to authenticate to AFS use kinit, see NikhefLocalSoftware

  • Afs Home directories:
    • All CERN users can get a 10GB AFS home directory without needing approval, see here
    • All CERN users can request additional scratch space (not backed up), softlinked from their home directory, by contacting Joel Closier.
    • Your AFS home directory is for code development, and for storing documents, ntuples and grid jobs that you want backed up but that are not massive files.
    • use fs listquota to get the quota/usage; run it separately in each scratch space
    • /afs/cern.ch/user/<u>/<uname>

  • Afs Work directories:
    • All CERN users can get a 100GB AFS work directory without needing approval, see here
    • Your AFS work directory is for code development, and for storing documents, ntuples and grid output that you do not need backed up; these can be massive files.
    • use fs listquota to get the quota/usage; run it in the work directory itself
    • /afs/cern.ch/work/<u>/<uname>

  • What is Castor?
    • CERN Advanced STORage manager.
    • A non-POSIX-compliant, "remote" file storage system
    • Authenticated through any of several possible authentication layers ("protocols"), including Kerberos for users at CERN
    • A long-term, mass file storage and management system
    • The chosen technology for mass storage at CERN and some other grid sites
    • Directly accessible in ROOT using the castor: or xrootd root: protocols, see the sketch below.
    • CASTOR homepage
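
As a rough illustration only (the file path below is a placeholder, and the exact protocol support depends on your ROOT build), a castor file can be opened directly from a PyROOT session like this:

  import ROOT
  # Placeholder path: substitute one of your own castor files.
  # "castor:" uses ROOT's castor plugin; an xrootd "root://" URL also works where an xrootd door is available.
  f = ROOT.TFile.Open("castor:/castor/cern.ch/user/a/another/myntuple.root")
  if f and not f.IsZombie():
      f.ls()  # list the file contents as a quick check that it opened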

  • Castor Space:
    • All CERN users have a castor home directory, where you can store almost anything you like. Put things here if they cannot fit in your AFS directory and you need them backed up for a long time.
    • After a while files will be migrated to tape, and will take some time to stage back.
    • /castor/cern.ch/user/<u>/<uname>

Grid Resources:

  • Grid-USER:
    • LHCb has a large amount of disk space shared between all collaborators at each site
    • your quota is around 2TB, and you should clean it regularly following the instructions here: GridStorageQuota
    • It is permanently staged to disk.
    • Use it to store large ntuples and DSTs that you got from the grid, which you want to share with others in the collaboration and which you don't want to disappear back to tape.
    • At CERN your grid user files will be stored at /castor/cern.ch/grid/lhcb/user/<u>/<uname>

Group Resources:

  • /data/bfys
    • The Nikhef BFys group has access to a large amount of disk space, shared between all BFys collaborators and mounted over NFS
    • There is no quota system, but it's a fair-use policy, so you'll probably be asked to cut down if you start eating it up.
    • use df -h /data/bfys to see the remaining and total space
    • Make your own subdirectory, normally named after your username, to store data under
    • Use this for ntuples, and copies of small amounts of data which cannot be analysed on the grid
    • If the data are already on the grid, you can replicate the files to SARA (see below); you do not have to copy them to /data/bfys unless the software you are using cannot handle the dcap protocol (all ROOT-based software can)
    • This is the ideal place for your ganga workspace, but not your ganga job repository. To achieve this you need to use softlinks, as sketched below.
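
A minimal sketch of the softlink trick for the workspace (paths are illustrative, and this assumes the default ~/gangadir layout with repository/ and workspace/ subdirectories):

  import os
  # Illustrative paths: replace <uname> with your own username.
  target = "/data/bfys/<uname>/gangadir-workspace"
  link = os.path.expanduser("~/gangadir/workspace")
  os.makedirs(target, exist_ok=True)
  if os.path.exists(link) and not os.path.islink(link):
      os.rename(link, link + ".old")   # keep any existing workspace out of the way
  if not os.path.exists(link):
      os.symlink(target, link)         # ~/gangadir/workspace now lives on /data/bfys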

  • /project/bfys
    • This is where we store copies of a small selection of software; this location is backed up
    • There is no quota system, but there is a fair-use policy, so you'll probably be asked to cut down if you start eating it up.
    • use df -h /project/bfys to see the remaining and total space
    • Make your own subdirectory, normally named after your username, to store data under
    • Use this for non-volatile files, or small volatile files. Since this resource is backed up, it is a problem if you start fast write/read/delete cycles with large files.
    • This resource is not really for storing large files, since the size is limited; we use the grid and /data/bfys for that.
    • It's the ideal place to store your ganga repository, but NOT your workspace. To achieve this you need to use softlinks, as sketched below.
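
The same softlink trick, mirrored for the repository (again illustrative paths, with the same assumption about the default ~/gangadir layout):

  import os
  target = "/project/bfys/<uname>/gangadir-repository"
  link = os.path.expanduser("~/gangadir/repository")
  os.makedirs(target, exist_ok=True)
  if os.path.exists(link) and not os.path.islink(link):
      os.rename(link, link + ".old")   # preserve any existing repository
  if not os.path.exists(link):
      os.symlink(target, link)         # ~/gangadir/repository now lives on /project/bfys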

Local SARA Grid Storage

  • LCG.SARA.nl has a reasonable amount of storage space for LHCb, it is accounted against your grid quota as given above.
  • SARA uses dCache for storage
  • Note that it is simple to copy grid data to SARA and simple to access it locally; you do not have to copy it to local disk unless you really want to.

Storing data at SARA

  • For datasets that already exist elsewhere on the grid, call dataset.replicate("SARA-USER") from within Ganga to get a copy at SARA
  • To automatically keep at least one copy of your job output at SARA, add the following to your .gangarc:
[LHCb]
DiracOutputDataSE = [ 'SARA-USER' ]

Copying data to SARA

  • For datasets replicated elsewhere, call dataset.replicate("SARA-USER") from within Ganga to get a copy
  • If the file is not yet on the grid, you can upload it to the grid with PFN(filename).upload(), which returns an LFN that you can replicate() (see the sketch below)
  • If the file is on Castor at CERN but not on the grid, you can declare it to the LFC, effectively putting it on the grid, by calling lhcb-giridfy-castor-file (or something like that) in the LHCbDirac environment.
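
Putting the two points above together, a hedged Ganga-session sketch (the file name is a placeholder, and the exact return types can differ between Ganga versions):

  # Run inside a Ganga session.
  # Upload a local file to grid storage; as noted above, upload() returns an LFN-based object.
  lfn = PFN('/data/bfys/<uname>/mytuple.root').upload()
  # Make sure there is a replica at SARA, so it can be read locally over dcap.
  lfn.replicate('SARA-USER')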

Accessing SARA storage directly

  • You can directly access files at SARA through dcap (the dCache access protocol, unauthenticated) here on the Nikhef network
  • To use this directly within Root, LHCb software or Ganga, you need to do two things
    1. load the dcache libraries
    2. convert LFNs to PFNs

  • Using Ganga
    • When you call SetupProject Ganga, add the libraries with: SetupProject Ganga ROOT --use-grid --use dcache_client\ v\*\ LCG_Interfaces
    • This is aliased to GangaNikhef when you use our local setup scripts!!
    • You could also add your own alias in your .bashrc
    • Conversion from LFN to PFN is not needed in Ganga; Ganga will do that for you if you submit LFNs (see the sketch below)
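
For illustration only (the application, dataset and backend names below are assumptions about a standard GangaLHCb setup, not taken from this page), submitting LFNs directly looks roughly like:

  # Inside a Ganga session started as above (e.g. via GangaNikhef).
  j = Job(application=DaVinci())
  # Give the input data as LFNs; Ganga resolves them to PFNs (e.g. dcap at SARA) for you.
  j.inputdata = LHCbDataset(['LFN:/lhcb/user/<u>/<uname>/some_tuple.dst'])
  j.backend = Dirac()
  j.submit()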

  • Using Root
    • Once you have the PFNs from the bookkeeping, you can open them directly inside ROOT
    • SetupProject Gaudi ROOT --use-grid --use dcache_client\ v\*\ LCG_Interfaces
    • TFile::Open("dcap://..."), as in the sketch below
    • This is internally using the same system as the rest of our software.
    • Putting all your OutputData at SARA can then save you disk space (though that is not very limited on the data drive anyway) and download time.
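
A hedged PyROOT version of the same thing (the dcap door, path and tree name below are placeholders; take the real PFN from the bookkeeping):

  import ROOT
  # Placeholder PFN: use the dcap PFN reported by the bookkeeping for the SARA replica.
  pfn = "dcap://<dcache-door>:<port>/pnfs/grid.sara.nl/data/lhcb/user/<u>/<uname>/mytuple.root"
  f = ROOT.TFile.Open(pfn)        # requires the dcache_client libraries set up as above
  if f and not f.IsZombie():
      tree = f.Get("DecayTree")   # "DecayTree" is only an example object name
      if tree:
          print(tree.GetEntries())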

-- RobLambert - 24-Oct-2011
