Generalities
LcgCaf is a purely LCG/gLite-based dCAF that lets CDF users access production GRID resources by submitting jobs through a gLite Workload Management System. It was created to provide CDF users with a dedicated Monte Carlo production farm outside Fermilab. LcgCaf does not yet support data analysis.
LcgCaf is now in production, and its head node is in Padova. This page briefly describes how to use the farm and, in particular, how to work with the new GRID environment.
Read the LCG Use Policy
We have set up all the credentials required for CDF users to run on the LCG/gLite resources. Before submitting any job to LcgCaf you must read and agree to the
LCG User Acceptable Use Policy
Available resources
In principle LcgCaf can access all production INFN GRID resources, such as the LHC Tier1 and Tier2 farms around Italy, as well as the main sites in Europe. As of today, LcgCaf is configured to allow submission to the following set of well-controlled and tested sites:
| Site | Computing Element |
| CNAF-T1 | ce02-lcg.cr.cnaf.infn.it |
| FZK-LCG2 | ce-fzk.gridka.de |
| IN2P3-CC | cclcgceli02.in2p3.fr |
| INFN-PISA | gridce.pi.infn.it |
| INFN-PISA | gridce2.pi.infn.it |
| INFN-PADOVA | prod-ce-01.pd.infn.it |
| INFN-CATANIA | grid012.ct.infn.it |
| INFN-LNL-2 | t2-ce-02.lnl.infn.it |
| INFN-BARI | gridba2.ba.infn.it |
| INFN-Roma1 | t2-ce-01.roma1.infn.it |
| INFN-Roma2 | grid003.roma2.infn.it |
| IEPSAS-Kosice | ce-iep-grid.saske.sk |
Job queues and groups
LcgCaf has only the common group, since priorities are not yet implemented.
Please remember to set the correct estimated duration (queue) of the job, selecting from these options:
| Queue | Duration (hours) |
| short | 6 |
| medium | 30 |
| long | 72 |
Please note that not all grid sites support all types of jobs; selecting the appropriate queue will give you access to more resources.
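As a rough aid for choosing the queue, the table above can be encoded in a small shell function. This is only a sketch: pick_queue is a hypothetical helper, not part of the CAF tools, and it assumes that the durations in the table are upper limits expressed in whole hours.

```shell
# Hypothetical helper (not part of the CAF software): map an estimated
# run time in whole hours to the shortest queue from the table above.
# Assumption: the listed durations are upper limits.
pick_queue() {
    hours=$1
    if [ "$hours" -le 6 ]; then
        echo short
    elif [ "$hours" -le 30 ]; then
        echo medium
    elif [ "$hours" -le 72 ]; then
        echo long
    else
        # No queue in the table covers jobs longer than 72 hours.
        echo "no queue available for a ${hours}h job" >&2
        return 1
    fi
}
```

The result could then be passed to the submission command, for example as --procType=$(pick_queue 20).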
Submission step by step
- A new service principal is needed for LcgCaf: <user>/cdf/lcgcaf@FNAL.GOV. You must add it to your .k5login so that the output files can be copied to your kerberized storage locations. Your .k5login file should look something like the following:
pagan@FNAL.GOV
caf/cdf/pagan@FNAL.GOV
pagan/cdf/lcgcaf@FNAL.GOV
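If you prefer to script this step, one possible approach is sketched below. The function add_principal is a hypothetical helper, not part of the CDF software; it appends a principal to a .k5login-style file only when that exact line is not already there, so running it twice does no harm.

```shell
# Hypothetical helper: append a Kerberos principal to a .k5login-style
# file unless an identical line is already present.
add_principal() {
    file=$1
    principal=$2
    touch "$file"    # create the file if it does not exist yet
    # -x matches whole lines only, -F treats the principal as a fixed string
    if ! grep -qxF -- "$principal" "$file"; then
        echo "$principal" >> "$file"
    fi
}
```

For the example user above, the call would be add_principal ~/.k5login "pagan/cdf/lcgcaf@FNAL.GOV".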
- The .cafrc that comes with the current development release of the offline software has everything in place for LcgCaf. It should look like:
;
; Global definitions
;
[global]
; List of available analysis farms and default choice
af_list=caf,cafcondor,cnaf,cancaf,jpcaf,knu-mc-only,mitcaf,sdsccaf,rutgers,torcaf3,lcgcaf
;
[lcgcaf]
; A LCG Grid Middleware based dCAF
;
; Headnode name, submitter port, and service/host principal
host=pcdf11.pd.infn.it
submitter_port=8023
monitor_port=8120
groupupdate_port=8140
icaf_port=9135
svc=host
;
; List of available process types and default choice
pt_list=short,medium,long,test
default_pt=short
use_new_format=true
Now you are ready to submit.
If you use tcsh
source /afs/infn.it/project/cdf/cdfsoft/cdf2.cshrc
(or source ~cdfsoft/cdf2.cshrc if available)
setenv CAF_CURRENT lcgcaf
setenv OUT "pagan@pcdf6.pd.infn.it:/mydisk/mydir/output"
setenv sections '1-5'
CafSubmit --tarFile=tar.tgz --outLocation=${OUT}/simple_\$.tgz \
--dhaccess=None --group=common --procType=medium \
--email=pagan@pd.infn.it --sections=${sections} ./run.sh $
If you use bash
source /afs/infn.it/project/cdf/cdfsoft/cdf2.shrc
(or source ~cdfsoft/cdf2.shrc if available)
export CAF_CURRENT=lcgcaf
export OUT="pagan@pcdf6.pd.infn.it:/mydisk/mydir/output"
export sections='1-5'
CafSubmit --tarFile=tar.tgz --outLocation=${OUT}/simple_\$.tgz \
--dhaccess=None --group=common --procType=medium \
--email=pagan@pd.infn.it --sections=${sections} ./run.sh \$
Group accounting has not been implemented in LcgCaf; for now only the common group exists. Your job can run for up to a week (i.e. the GRID proxy lifetime).
Submission Script
source $CDFSOFT/cdf2.shrc
setup cdfsoft2 6.1.4
myprogram.exe mytcl.tcl >& mylog.log
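Because the submission commands above end with ./run.sh $, the script receives the section number as its first argument. A minimal sketch of how it could be used to keep one log file per section follows; section_log_name and the mylog_ prefix are illustrative assumptions, not part of the CAF software.

```shell
# Hypothetical helper for run.sh: build a per-section log file name from
# the section number received as the script's first argument.
section_log_name() {
    # Abort with a message if no section number was given.
    section=${1:?section number required}
    echo "mylog_${section}.log"
}
```

Inside run.sh the last line could then read myprogram.exe mytcl.tcl >& "$(section_log_name "$1")".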
Where to store the output
You can copy your output to any location you are allowed to access as a CDF user, using the following commands:
setup fcp
fcp -c $KRB5BIN_DIR/rcp -x -N ./testfile pagan@fcdflnx2.fnal.gov:/cdf/spool/pagan/test_file
You can also copy your output to a GRID Storage Element. At CNAF there is one CDF SE, grid007g.cnaf.infn.it, that can be used. If you want to use this space, please ask Simone or Gabriele.
The command to use is:
$GLOBUS_PATH/bin/globus-url-copy file://`pwd`/test_file \
gsiftp://grid007g.cnaf.infn.it/flatfiles/SE00/cdf/test_file
Monitoring
You can monitor your jobs in the usual way. The LcgCaf Web Monitor is at
http://webserver.infn.it/cdf/lcgcaf/user.html
Interactive monitoring with CafMon has also been implemented. It does not perform as well as the old one, but it is available with these commands:
CafMon list
CafMon kill
CafMon dir
CafMon tail
CafMon ps
CafMon top
By default, the contents of the log files job_X.out and job_X.err (where X is the section number), which store the stdout and stderr of the running job, can be displayed with the Interactive Monitor. If you want to access other relevant log files (e.g. your MC log files), create a file called wncollDaemon.conf in the starting directory of your job and list in it the files that you want to be able to access from the Interactive Monitor (one file per line; as usual, '$' is automatically replaced with the number of the running section), as in the following example:
myMClogfile_$.log
my_second_MC_logfile_$.log
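When this file is generated from a shell script, the '$' must reach the file unexpanded. One way to do that is sketched below; write_wncoll_conf is a hypothetical helper and the two file names are simply the ones from the example above.

```shell
# Hypothetical helper: write wncollDaemon.conf into the given directory
# (default: the current one). The quoted 'EOF' delimiter stops the shell
# from expanding '$', so it is written to the file literally.
write_wncoll_conf() {
    dir=${1:-.}
    cat > "$dir/wncollDaemon.conf" <<'EOF'
myMClogfile_$.log
my_second_MC_logfile_$.log
EOF
}
```

Calling write_wncoll_conf with no argument from the starting directory of the job creates the file there.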
Remarks
Some grid sites have SL4 nodes, and we have found problems when users' own ROOT macros are compiled there. Please submit your jobs with ROOT macros already compiled on an SL3 machine to avoid these problems.