SRM Stager

by admin last modified 2007-11-28 02:10

This document describes how to set up the dCache SRM stager.

The dCache SRM stager is an HSM backend for dCache which "stages" files from other sites using srmcp.  This allows for seamless recovery from small amounts of file loss.  It does not scale to recovering many terabytes of lost data.

To set this up, we assume:
  1. You have access to the SSH-based admin interface
  2. You will be installing the stager on pool "dcache04_2".  This pool will be dedicated to staging, and nothing else will access it directly.
  3. The staging pool has outside connectivity (for best results, there should be no LAN) and a host certificate.

Configure the PoolManager

This section only needs to be done once per dCache install.  It requires a brief downtime (about 30 seconds to restart the dCache central services).

The staging is controlled by the PoolManager, so we need to turn on staging on the PoolManager and tell it which pools to use.

First, set up a new staging group and remove the staging pool from the default pool group:

psu create pgroup stagePools
psu addto pgroup stagePools dcache04_2
psu removefrom pgroup default dcache04_2

If dcache04_2 belongs to any other pool groups that might cause a transfer to access it directly, remove it from those as well.  Now, create a link for staging:

psu create link stage-link any-protocol any-store world-net
psu set link stage-link -readpref=0 -writepref=0 -cachepref=10 -p2ppref=-1
psu add link stage-link stagePools

If you have any other links which can stage files, disable staging on them by setting their cachepref to 0.  For example, for default-link:

psu set link default-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=-1

Finally, we turn on staging in the pool manager:

rc set stage oncost off
rc set stage on
rc set max restore 500

Now, type save so the configuration will persist upon restart.  You will need to restart the PoolManager to enable staging.
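
If you need to repeat the steps above for a pool with a different name, the command sequence can be generated rather than retyped.  The following is a minimal Python sketch: the function name poolmanager_commands and its parameters are hypothetical, and it only prints the commands already listed in this section so you can paste them into the admin interface.  It does not talk to dCache itself.

```python
def poolmanager_commands(pool, group="stagePools", link="stage-link"):
    """Return the PoolManager commands from this section for one staging pool."""
    return [
        # Dedicated pool group for the staging pool.
        f"psu create pgroup {group}",
        f"psu addto pgroup {group} {pool}",
        f"psu removefrom pgroup default {pool}",
        # Link that is only preferred for staging (cachepref=10).
        f"psu create link {link} any-protocol any-store world-net",
        f"psu set link {link} -readpref=0 -writepref=0 -cachepref=10 -p2ppref=-1",
        f"psu add link {link} {group}",
        # Stop the default link from staging (cachepref=0).
        "psu set link default-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=-1",
        # Turn staging on in the PoolManager and persist the configuration.
        "rc set stage oncost off",
        "rc set stage on",
        "rc set max restore 500",
        "save",
    ]


if __name__ == "__main__":
    print("\n".join(poolmanager_commands("dcache04_2")))
```

If your site has additional links that can stage files, they still need to be handled by hand as described above; the sketch only covers default-link.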


Configure proxy management

In order for the stager to be able to stage files from remote sites, it will need to be able to authenticate using a proxy.  This section needs to be done only once per site.  We will accomplish this by setting up a dedicated MyProxy server.

The easiest way to do this is to use the myproxy server from the globus install.  Write the following file to $GLOBUS_LOCATION/etc/myproxy-server.conf:

accepted_credentials       "*"
authorized_retrievers      "/DC=org/DC=doegrids/OU=Services/CN=dcache*.unl.edu"
trusted_retrievers         "/DC=org/DC=doegrids/OU=Services/CN=dcache*.unl.edu"

There are many more options for the myproxy configuration, but we have stripped this down to the bare minimum.  Make sure you update the authorized_retrievers and trusted_retrievers to reflect the syntax of your staging server's host certificate.  Notice that many host certificates will have CN=host/<hostname>, so you might need to pay close attention to your wildcard usage!  Start the myproxy server:

source /opt/osg/osg-060/setup.sh
myproxy-server &

I recommend adding those two lines (adjusting the setup script for your middleware as necessary) to rc.local on the MyProxy server, so that the server starts at boot.
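
The wildcard warning above is easy to check before you commit to a pattern.  The sketch below approximates the match with Python's shell-style fnmatch; this is an assumption, since MyProxy applies its own matching rules, so treat the result as a sanity check only.  The host name dcache04.unl.edu is a made-up example, and dn_matches is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Pattern taken from the authorized_retrievers line in this section.
PATTERN = "/DC=org/DC=doegrids/OU=Services/CN=dcache*.unl.edu"


def dn_matches(subject, pattern=PATTERN):
    """Shell-style, case-sensitive match of a certificate subject against a pattern."""
    return fnmatchcase(subject, pattern)


if __name__ == "__main__":
    # Both subjects below are hypothetical examples.
    for dn in (
        "/DC=org/DC=doegrids/OU=Services/CN=dcache04.unl.edu",
        "/DC=org/DC=doegrids/OU=Services/CN=host/dcache04.unl.edu",
    ):
        print(dn, dn_matches(dn))
```

Under this approximation the first subject matches and the second, with the CN=host/ prefix, does not.  If your host certificates carry that prefix, the pattern would need to account for it.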

Next, create a very long lasting proxy on this server:

myproxy-init -s osg-test3.unl.edu -l brian -c 4320

That will create a 6-month proxy on server osg-test3.unl.edu for user brian.  On the staging pool, test and make sure the root user can retrieve this proxy without a password:

# myproxy-get-delegation -s osg-test3.unl.edu -l brian -n
A credential has been received for user brian in /tmp/x509up_u0.
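
The -c value passed to myproxy-init above is a lifetime in hours, which is how 4320 works out to roughly six months (180 days).  A small sketch of the arithmetic, in case you want a different lifetime; lifetime_hours is a hypothetical helper name:

```python
HOURS_PER_DAY = 24


def lifetime_hours(days):
    """Convert a desired proxy lifetime in days to the hour count myproxy-init expects."""
    return days * HOURS_PER_DAY


if __name__ == "__main__":
    print(lifetime_hours(180))  # prints 4320
```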

Configure your staging pool

If you have not done so already, add the new staging pool to the stagePools group in the PoolManager and remove it from the default pool group:

psu addto pgroup stagePools dcache04_2
psu removefrom pgroup default dcache04_2

Log into the admin interface for the staging pool.  Type the following commands:

rh set max active 2
rh set timeout 14400
hsm set osm -command=/root/projects/dCacheNebraska/scripts/fake_stager.py

Type save to make the configuration persist after pool restart. 

If you are using the dCache PFM, add the staging pool to its blacklist; otherwise, the PFM will try to write replicas to this pool.

To download the fake_stager.py script, do the following (as root):

mkdir ~/projects
cd ~/projects
svn co svn://t2.unl.edu/brian/dCacheNebraska

You will need to edit a few files in the scripts directory to reflect the fact you aren't running at Nebraska:

  • In fake_stager.py, edit the values of SRMCP, SOURCE_SITE, and DEST_SITE.  The SITE names should be the PhEDEx site name.
  • In srmcp_stager.py, edit the values of OSG_SETUP, SRMCP, COPY_CMD, MYPROXY_GET_DELEGATION, and GRID_PROXY_INFO:
    • OSG_SETUP: the command needed to configure the middleware.  Does not have to be OSG.
    • SRMCP: the copy command (defaults to srmcp).  Must be a binary - do not include any flags.
    • COPY_CMD: any additional command line parameters for SRMCP
    • MYPROXY_GET_DELEGATION: A command which will create a new proxy for this node.
    • GRID_PROXY_INFO: A command which, if it exits with code 0, means that there is a valid proxy on this system.
If you are running this script at Nebraska, none of the above values need to be changed.
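
To make the roles of GRID_PROXY_INFO and MYPROXY_GET_DELEGATION concrete, here is a minimal sketch of the check-then-renew logic those two settings imply.  It is not the code in srmcp_stager.py; the function name ensure_proxy and its structure are assumptions made for illustration.

```python
import subprocess


def ensure_proxy(check_cmd, renew_cmd):
    """Return True if a valid proxy exists, attempting at most one renewal.

    check_cmd plays the role of GRID_PROXY_INFO: exit code 0 means a valid
    proxy is present.  renew_cmd plays the role of MYPROXY_GET_DELEGATION.
    """
    # A valid proxy already exists, so nothing needs to be done.
    if subprocess.call(check_cmd, shell=True) == 0:
        return True
    # No valid proxy: try to fetch a fresh one.
    if subprocess.call(renew_cmd, shell=True) != 0:
        return False
    # Re-check, since a successful fetch does not guarantee a usable proxy.
    return subprocess.call(check_cmd, shell=True) == 0
```

For example, check_cmd could be "grid-proxy-info -exists" and renew_cmd could be the myproxy-get-delegation command tested earlier in this document.  Only if this returns True would it make sense to go on and run the SRMCP copy.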

Finally, you will need to set up the DBParam file.  This file contains the login parameters the stager needs to talk directly with dCache.  The file's format is documented here.  It must be placed in the home directory of the user dCache runs as (usually root).

Testing your setup

Testing is easy.  First, remove any proxies on the stager pool (we want to test the proxy delegation method).  Find a CMS file that is available at your SOURCE_SITE and exists on only one pool on your system.  Shut that pool down and try to access the file via dccp.  When you do this:
  • The PoolManager ought to show the request in "rc ls" and show it as staging.
  • If you have the printout set to 3 on the stager pool, you ought to see the call to fake_stager.py appear in the logs.
  • You should see the srmcp process running on the stager system.  Eventually, it will start a globus-url-copy process.
  • Once the srmcp process has successfully completed, you should see this reflected in the pool's logs.
  • The PoolManager will now start a Pool2Pool copy of the file to one of your read pools.
  • Finally, your dccp process should start downloading data.
Of course, any of the above steps may go wrong - good luck!  We hope to include some troubleshooting information as we get help requests from users :)

