5.12 cmssh tutorial


Goals of this page:

This page is intended to provide you with an overview of the cmssh shell, its installation and its usage in CMS.


Introduction

CMS software is quite complex, and installing and running it properly requires some background knowledge. It consists of GRID middleware for grid tasks such as job submission and file transfer, the CMSSW software stack for running CMS software, and various web services that help users find their data. The cmssh project was developed to reduce the initial burden on end users of installing CMS software. It targets the following items:

  • users should be able to easily find their favorite data
  • users should be able to copy files transparently to/from SE/local disks without any knowledge of GRID middleware
  • users should be able to easily install and run CMSSW releases without any knowledge of CMS packaging and distribution tools
  • users should be able to perform all of these tasks, including analysis, under a simple user-friendly shell

The idea was to give users a shell, like the one you use on any UNIX platform, that can perform the aforementioned tasks.

What is it?

cmssh is a programmable shell written in python (more precisely, in IPython). This means that you can program in your shell using the python language and perform any python operation, e.g. assignments, functions, loops, conditions, etc. At the same time it works as a normal UNIX shell, e.g. bash, where commands like cp, mv, mkdir and rmdir just work. Moreover, those commands work with local files as well as with CMS objects such as LFNs. In other words, the cp command works transparently whether you give it a local file or an LFN, and it can copy either one to your destination (a local disk or a remote storage element). Are you excited that finding your data and copying LFNs to any place in the world is at your fingertips? If so, please read on.
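To make the idea of a "transparent" cp more concrete, here is a minimal sketch of how a shell could tell the three kinds of argument apart. The helper name classify_target and its rules (an LFN starts with /store/, a storage element path has the form SITE:/path) are assumptions drawn from the examples on this page; this is not how cmssh itself is implemented.

```python
def classify_target(arg):
    """Guess what a cp/ls/rm argument refers to (hypothetical helper)."""
    site, sep, path = arg.partition(":")
    # SITE:/path form, e.g. T3_US_Cornell:/store/user/foo
    if sep and site and "/" not in site and path.startswith("/"):
        return "se"
    # Logical file names on this page all start with /store/
    if arg.startswith("/store/"):
        return "lfn"
    # Everything else is treated as a path on the local disk
    return "local"


for arg in ("local.file",
            "/store/user/file.root",
            "T3_US_Cornell:/store/user/file.root"):
    print("%s -> %s" % (arg, classify_target(arg)))
```

A dispatcher built on such a function could then pick a plain file copy, a grid download or a grid upload depending on the source and destination kinds.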

Why should I use it?

Well, your daily tasks include (among others): finding CMS data, e.g. datasets, runs and LFNs (probably via DAS), getting an LFN via FileMover, and running CMSSW software. What if you could do all of these tasks in your shell, on your Mac laptop, without any limitations and using normal UNIX syntax, e.g.

find dataset=*Zee*
cp /store/data/CRUZET3/Cosmics/RAW/v1/000/050/832/186585EC-024D-DD11-B747-000423D94AA8.root .
cmsRun my.cfg

You probably need to do these tasks hundreds of times, right? If so, you may be curious to know that cmssh can come to the rescue. Give it a shot and you will not be disappointed.

Installation

The installation of cmssh is quite simple: download the installer from the web and run it in your environment. There are two routes: the stand-alone installation mode, e.g. on your laptop, and the multi-user and/or system-wide mode where CMSSW software is already available, e.g. on lxplus or your local cluster. We will discuss both in the following sections. Let's start by getting the installer. On your UNIX box (Linux, Mac OS X, etc.) you can get it via curl or wget, or simply download it from the web. Let's outline the curl and web approaches.

Download cmssh installer using curl tool

curl -k https://raw.github.com/dmwm/cmssh/master/cmssh_install.py > cmssh_install.py

Download installer script directly from the web

Point your browser at the URL used in the curl command above and save the script under the name cmssh_install.py.

Now go to the place where you'd like to install this tool, e.g. $HOME/workspace/public on lxplus or $HOME/work on your laptop. Feel free to create any directory you like to hold the installer and the cmssh installation. We're now ready to go, but first let's outline a few prerequisites.

Prerequisites

The cmssh installer requires python version 2.6 or above (but not 3.x yet). You can check your python version with

python -V
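The same check can be done from within python itself. The sketch below encodes the requirement stated above (2.6 or newer, but not 3.x); the function name installer_supports is made up for illustration and is not part of the installer.

```python
import sys


def installer_supports(version_info):
    """True for python 2.6 and 2.7, False for older 2.x and for 3.x."""
    return (2, 6) <= tuple(version_info[:2]) < (3, 0)


if installer_supports(sys.version_info):
    print("this python can run cmssh_install.py")
else:
    print("python %d.%d is outside the supported range (2.6 or above, not 3.x)"
          % tuple(sys.version_info[:2]))
```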

On lxplus the default version of python is the very old 2.4, so use python26 instead, or use one of the python installations shipped with CMSSW, e.g.

source /afs/cern.ch/cms/slc5_amd64_gcc462/external/python/2.6.4-cms/etc/profile.d/init.sh 

On Mac OS X you'll need to install Xcode. Please obtain it from the Apple App Store and install it on your system. It provides gcc and the other tools required to compile code.

You can install cmssh under Linux SLC5 or a compatible distribution. Other Linux distributions may or may not work (the limitation actually comes from the CMSSW software stack rather than cmssh itself); because of their broad variety I was unable to test cmssh on all of them. Feel free to try it out, and if it fails please submit a bug report with the full output as well as the name and version of your Linux distribution. This will allow us to set up a virtual machine with your Linux distribution and debug your problem.

Stand-alone installation (laptop mode)

For this scenario, run the cmssh installer as follows:

python cmssh_install.py --install --dir=$PWD

If you ever need to debug its output, add the -v 1 option. For more options just run

python cmssh_install.py --help

After the installation step is done you'll get a message telling you where the new cmssh tool is located, e.g.

...
Create vomses area
Create cmssh
Clean-up soft area
Congratulations, cmssh is available at /afs/cern.ch/user/v/valya/workspace/public/soft/bin/cmssh

At this point we're ready to go.
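If you drive the installer from a script, the location of the new tool can be picked out of its final message. This is only a sketch: it assumes the message keeps the wording "cmssh is available at" shown in the sample output above, and the function name cmssh_path_from_output is hypothetical.

```python
import re


def cmssh_path_from_output(text):
    """Return the path announced in the installer's final message, or None."""
    match = re.search(r"cmssh is available at (\S+)", text)
    return match.group(1) if match else None


# Sample text copied from the installer output shown on this page
sample = """Create vomses area
Create cmssh
Clean-up soft area
Congratulations, cmssh is available at /afs/cern.ch/user/v/valya/workspace/public/soft/bin/cmssh
"""
print(cmssh_path_from_output(sample))
```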

Multi-user mode with existing CMSSW install area

First you need to decide which CMSSW architecture you'll use. For the list of available architectures, please use the TagCollector service. Here I'll use slc5_amd64_gcc462 as an example, and /afs/cern.ch/cms as the top area where CMSSW software is located. You'll need to adjust these settings to the ones found on your system. Here is an example of how to install cmssh on lxplus:

python cmssh_install.py --install --dir=$PWD --arch=slc5_amd64_gcc462 --cmssw=/afs/cern.ch/cms --multi-user
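The two installation modes differ only in the extra options passed to the installer. The sketch below assembles the command line for either mode using only the options shown on this page. The helper build_install_cmd is hypothetical, and its rule that --arch and --cmssw must be given together is an assumption based on the example above.

```python
def build_install_cmd(install_dir, arch=None, cmssw=None):
    """Assemble the installer command line for laptop or multi-user mode."""
    cmd = ["python", "cmssh_install.py", "--install", "--dir=%s" % install_dir]
    if arch or cmssw:
        # Assumption: multi-user mode needs both the architecture and the CMSSW area
        if not (arch and cmssw):
            raise ValueError("multi-user mode needs both arch and cmssw")
        cmd += ["--arch=%s" % arch, "--cmssw=%s" % cmssw, "--multi-user"]
    return cmd


# Stand-alone (laptop) mode; /tmp/work is a placeholder directory
print(" ".join(build_install_cmd("/tmp/work")))
# Multi-user mode with the example architecture and CMSSW area from this page
print(" ".join(build_install_cmd("/tmp/work", "slc5_amd64_gcc462", "/afs/cern.ch/cms")))
```

The returned list can be handed directly to subprocess.call if you prefer to launch the installer from python.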

Usage

To use cmssh simply invoke it from your shell, e.g.

my-computer# /path/soft/bin/cmssh

Here my-computer# is a UNIX shell prompt and /path is the directory where you installed cmssh. In the example above the installer printed "Congratulations, cmssh is available at /afs/cern.ch/user/v/valya/workspace/public/soft/bin/cmssh", so /path was /afs/cern.ch/user/v/valya/workspace/public.

Upon start-up cmssh looks for your GRID certificate. If it finds one (under $HOME/.globus), it verifies the permissions of your userkey.pem and usercert.pem and asks for your GRID password (if you have one). If everything goes smoothly, it invokes the proper voms command to set up your proxy (at this point you'll see your normal proxy output with your DN etc.). Once cmssh has started you'll get the following screen:
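If the start-up check complains about permissions, you can inspect them yourself. The page does not say exactly what cmssh checks, so the sketch below assumes the common grid convention that the private key must not be accessible to group or others; the function name is_private is made up for illustration.

```python
import os
import stat


def is_private(path):
    """True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Location of the key as described above
key = os.path.join(os.path.expanduser("~"), ".globus", "userkey.pem")
if os.path.exists(key):
    print("%s private: %s" % (key, is_private(key)))
else:
    print("no key found at %s" % key)
```

If the check reports False, chmod 400 on the key file is the usual remedy.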

Available cmssh commands:
find         search CMS meta-data (query DBS/Phedex/SiteDB)
dbs_instance show/set DBS instance, default is DBS global instance
mkdir/rmdir  mkdir/rmdir command, e.g. mkdir /path/foo or rmdir T3_US_Cornell:/store/user/foo
ls           list file/LFN, e.g. ls local.file or ls /store/user/file.root
rm           remove file/LFN, e.g. rm local.file or rm T3_US_Cornell:/store/user/file.root
cp           copy file/LFN, e.g. cp local.file or cp /store/user/file.root .
info         provides detailed info about given CMS entity, e.g. info run=160915
das          query DAS
das_json     query DAS and return data in JSON format
dqueue       status of download queue, list files which are in progress.
root         invoke ROOT
du           display disk usage for given site, e.g. du T3_US_Cornell

Available CMSSW commands (once you install any CMSSW release):
releases     list available CMSSW releases, accepts <list|all> args
install      install CMSSW release, e.g. install CMSSW_5_0_0
cmsrel       switch to given CMSSW release and setup its environment
arch         show or switch to given CMSSW architecture, accept <list|all> args
scram        CMSSW scram command
cmsRun       cmsRun command for release in question

Available GRID commands: <cmd> either grid or voms
vomsinit     setup your proxy (aka voms-proxy-init)
vomsinfo     show your proxy info (aka voms-proxy-info)

Query results are accessible via results() function:
   find dataset=/*Zee*
   for r in results(): print r, type(r)

Help is accessible via cmshelp <command>

To install python software use pip <search|(un)install> <package>

cms-sh|1> 

I hope that output explains itself. You get a set of commands with their descriptions and examples, followed by the cmssh prompt. At the prompt you can type your normal commands, like ls, cp, mkdir, etc., as well as all the listed commands, e.g. find. The cms-sh|1> prompt shows that cmssh is ready for its first command. As you enter commands, the number is incremented to keep track of them, so they can be used later, e.g. for replay or reference. For example

cms-sh|1> ls
soft
stuff
tests

cms-sh|2> a=1

cms-sh|3> print a
1

cms-sh|4> 

Here I ran a simple ls command to list files in my local directory, then assigned a=1 and printed it. Remember that cmssh is a python shell, so all python commands will work, e.g.

cms-sh|4> import os

In the examples above, notice that the number in the cmssh prompt increments with each command you execute; it can be used later to replay your history. Without further ado, here is a simple set of commands which you can execute under cmssh to get a feeling for what it can do:

# search for some data
find dataset=*CRUZET3*RAW
for r in results(): print r, type(r)

# info about file/dataset/run
ls /Cosmics/CRUZET3-v1/RAW
info /Cosmics/CRUZET3-v1/RAW

find file dataset=/Cosmics/CRUZET3-v1/RAW
find site dataset=/Cosmics/CRUZET3-v1/RAW
find run=160915
info run=160915
for r in results(): print r.initLumi, type(r.initLumi), r.DeliveredLumi, type(r.DeliveredLumi)

# list/copy LFN to local disk
ls /store/data/CRUZET3/Cosmics/RAW/v1/000/050/832/186585EC-024D-DD11-B747-000423D94AA8.root
cp /store/data/CRUZET3/Cosmics/RAW/v1/000/050/832/186585EC-024D-DD11-B747-000423D94AA8.root .
ls -l

# SE operations, e.g. list its content, create/delete directory, etc.
du T3_US_Cornell
ls T3_US_Cornell
ls T3_US_Cornell:/store/user/valya
mkdir T3_US_Cornell:/store/user/valya/foo
ls T3_US_Cornell:/store/user/valya
rmdir T3_US_Cornell:/store/user/valya/foo
ls T3_US_Cornell:/store/user/valya

# copy local file to SE
cp 186585EC-024D-DD11-B747-000423D94AA8.root T3_US_Cornell:/store/user/valya
ls T3_US_Cornell:/store/user/valya
ls -l
rm 186585EC-024D-DD11-B747-000423D94AA8.root

# copy LFN from SE to local disk
cp T3_US_Cornell:/store/user/valya/186585EC-024D-DD11-B747-000423D94AA8.root .
ls -l

# delete file on SE
rm T3_US_Cornell:/xrootdfs/cms/store/user/valya/186585EC-024D-DD11-B747-000423D94AA8.root
ls T3_US_Cornell:/store/user/valya

# copy LFN to SE area
cp /store/data/CRUZET3/Cosmics/RAW/v1/000/050/832/186585EC-024D-DD11-B747-000423D94AA8.root T3_US_Cornell:/store/user/valya
ls T3_US_Cornell:/store/user/valya
rm T3_US_Cornell:/xrootdfs/cms/store/user/valya/186585EC-024D-DD11-B747-000423D94AA8.root
ls T3_US_Cornell:/store/user/valya

# copy multiple files
cp /store/data/CRUZET3/Cosmics/RAW/v1/000/050/832/186585EC-024D-DD11-B747-000423D94AA8.root . &
cp /store/data/CRUZET3/Cosmics/RAW/v1/000/050/796/4E1D3610-E64C-DD11-8629-001D09F251FE.root . &
dqueue

# copy user file from T1 tier
cp T1_US_FNAL_Buffer:/store/user/neggert/TT_TuneZ2_7TeV-mcatnlo/MCTSusy_Skim_Mar2012/7b5af1bfe3424f60f0db5b5f14cf327a/MCTSusySkimMar2012_591_1_cSX.root .

# copy lfn from SE to SE
cp T1_US_FNAL_Buffer:/store/user/neggert/TT_TuneZ2_7TeV-mcatnlo/MCTSusy_Skim_Mar2012/7b5af1bfe3424f60f0db5b5f14cf327a/MCTSusySkimMar2012_591_1_cSX.root T3_US_Cornell:/store/user/valya

# look-up available releases
releases

# install CMSSW release
install CMSSW_5_0_1

# switch to installed release
cmsrel CMSSW_5_0_1

# run cmsRun job
cmsRun runevt_cfg.py

# usage of magic functions
# show how to access docstrings
edit test.py

ip = get_ipython()
ip.magic_find("dataset=*Zee_M20*")
for r in results(): print r, type(r)
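Because results() hands back ordinary python objects, you can post-process them with plain python. The sketch below counts items per type. It runs outside cmssh on stand-in data (a dataset name and a run number taken from this page); inside cmssh you would pass results() instead. The function count_by_type is hypothetical and not part of cmssh.

```python
def count_by_type(rows):
    """Count items per python type name."""
    counts = {}
    for row in rows:
        name = type(row).__name__
        counts[name] = counts.get(name, 0) + 1
    return counts


# Stand-in for results(); the real objects returned by cmssh may differ
fake_results = ["/Cosmics/CRUZET3-v1/RAW", 160915]
print(count_by_type(fake_results))
```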

Commands

cmssh is a programmable shell written in python, which means you can program anything using the python language. For example, let's perform some simple tasks

cms-sh|5> import os

cms-sh|6> for k, v in os.environ.items(): print k, v
_ /Users/vk/CMS/test_cmssh/soft/install/bin/ipython
....

Here we imported the os python module and wrote a simple loop to list environment variables. Pretty neat. But cmssh can help you more with python, e.g. if you type

cms-sh|7> os.walk?

and hit return, it will show the documentation of the os.walk function from the os python module. If you'd like to see its code, you can do the following:

os.walk??
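The ? and ?? shortcuts are IPython conveniences. In a plain python script, the standard inspect module gives roughly the same information, as this small sketch shows:

```python
import inspect
import os

doc = inspect.getdoc(os.walk)        # roughly what os.walk? displays
source = inspect.getsource(os.walk)  # roughly what os.walk?? displays

# Print only the first line of each to keep the output short
print(doc.splitlines()[0])
print(source.splitlines()[0])
```

Note that inspect.getsource only works for functions written in python; for built-ins implemented in C it raises an error.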

But what if you don't know which functions/methods are available in a python module? In this case just use tab completion. For example, type os. and hit the tab key to get a full list of what is available in the os python module. Here is what I did

cms-sh|7> os.
Display all 202 possibilities? (y or n)
os.EX_CANTCREAT      os.WNOHANG           os.geteuid           os.sep
os.EX_CONFIG         os.WSTOPSIG          os.getgid            os.setegid
os.EX_DATAERR        os.WTERMSIG          os.getgroups         os.seteuid
...

I hope you'll enjoy this feature. It works with system modules as well as your own, once you import your python code.
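Tab completion is interactive, but a similar listing can be produced in code with the built-in dir function. The helper name complete below is made up for illustration; it is a rough stand-in for what the tab key shows, not the completer IPython actually uses.

```python
import os


def complete(obj, prefix=""):
    """List public attribute names of obj that start with prefix."""
    return sorted(name for name in dir(obj)
                  if name.startswith(prefix) and not name.startswith("_"))


# A few of the names you would see after typing os.get and hitting tab
print(complete(os, "get")[:5])
```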

There are two types of help you can get from cmssh. The python help is available as

help(os)

where you provide the python module you want help with, in this case the os module. The second is cmssh-specific help, which you get with the cmshelp command, e.g.

cmshelp find

In this example, we invoked cmssh help for the find command. There is much more power under cmssh than you can imagine. The list of available commands is shown if you type lsmagic. All the commands that start with a percent sign can be used directly in your shell, e.g. find, grep, mkdir, etc. The cell magic commands are the ones which allow you to place code snippets underneath the command. Let's explore how to execute a series of commands under your usual UNIX shell

lsmagic
.... # here you'll see list of all magic commands
# for demonstration I'll use %%! command

cms-sh|4> %%!
     ...: hostname -f
     ...: pwd
     ...: ls | wc
     ...: 
  Out[4]: 
['mr46.lns.cornell.edu',
 '/Users/vk/CMS/test_cmssh',
 '       6       7      56']

So what happened here? I invoked the cell magic command which runs a series of commands under my UNIX shell. Then I typed three commands, hostname -f, pwd and ls | wc, and hit enter. The output is a python list which holds the output lines of my UNIX shell commands.
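For comparison, here is a rough plain-python equivalent of that cell magic, using the standard subprocess module. The function name run_cell is hypothetical, and subprocess.check_output needs python 2.7 or newer.

```python
import subprocess


def run_cell(script):
    """Run a multi-line shell script and return its output as a list of lines."""
    out = subprocess.check_output(script, shell=True, universal_newlines=True)
    return out.splitlines()


# Two of the commands from the example above, one per line
print(run_cell("pwd\nls | wc"))
```

As with the cell magic, each element of the returned list is one line of output, so the result can be fed straight into a python loop.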

At this point it is up to you to explore all available commands. Go for it!

Advanced features

cmssh is a very powerful tool. It can do the following tasks:

  • find CMS data
  • copy any LFN from/to local disk/remote storage element
  • you can program any python code and run it right away
  • you can install any python package in your local install area
  • you can use matplotlib/numpy/ROOT packages
  • you can use R statistical language if it is installed on your system
  • you can run cmssh in notebook mode
  • your imagination should never stop under cmssh, since it allows you to program and utilize python in its full power

Here I'll discuss only two of the topics listed above: how to install python packages and the notebook feature.

Install third-party python packages

You have probably heard about PyPI, right? In short, it is the repository of python packages. Under cmssh you can do the following (please note that this works only if you are the owner of your cmssh installation; the usual UNIX ownership rules still apply):

cms-sh|5> pip search simpleyaml
simpleyaml                - YAML parser and emitter for Python


cms-sh|6> pip install simpleyaml
Downloading/unpacking simpleyaml
  Running setup.py egg_info for package simpleyaml
Installing collected packages: simpleyaml
  Running setup.py install for simpleyaml
Successfully installed simpleyaml
Cleaning up...


cms-sh|7> import simpleyaml

cms-sh|8> s="""
     ...: name: foo
     ...: type:  
     ...:     - int
     ...:     - float
     ...:     """
     ...:     

cms-sh|9> simpleyaml.load(s)
  Out[9]: {'name': 'foo', 'type': ['int', 'float']}

Here I did a few steps: I searched for a package called simpleyaml, installed it under my cmssh and imported it right away. Then I created a string and loaded it via simpleyaml to get a python dict. It is a very simple and very powerful approach: you can search for and install any python package and start using it right away.
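Before installing a package it can be handy to check whether it is already importable. The sketch below does this with nothing but a guarded import; the function name is_installed is made up for illustration.

```python
def is_installed(name):
    """True if importing the named module succeeds in the current interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


# os ships with python; simpleyaml is present only if you installed it
for pkg in ("os", "simpleyaml"):
    print("%s installed: %s" % (pkg, is_installed(pkg)))
```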

cmssh notebook

You can run cmssh in your browser. You may wonder why you would need that. Imagine that you're working on some project. You will probably run quite a lot of commands, create code, make plots, etc. What if you want to keep a record of all your steps, with annotations, comments and plots? And what if you want to replay all your results or, better, send them to someone else without explaining how you did each step? This is the use case for the cmssh notebook. You can invoke it as follows:

cmssh notebook

For details I refer you to this video.

More information

Review status

Reviewer/Editor and Date (copy from screen) Comments
JohnStupak - 11-September-2013 Review

-- ValentinKuznetsov - 13-Jul-2012

Topic attachments
Attachment: cmssh.pdf (14196.4 K), 2012-10-30 16:47, ValentinKuznetsov, Cmssh presentation
Topic revision: r6 - 2013-09-18 - DanielHuizenga
 