Draining pools and the Replica Manager

by admin last modified 2006-05-09 18:12

This document explains how to set up the Replica Manager and includes a recipe for draining a pool that contains precious files.

The point of Resilient dCache and the Replica Manager (also called the Resilience Manager) is to keep a specified number of copies of files on designated pools.  Why would you want to do this?  Three reasons come to mind:
  • Keeping the number of pools a single file is on to a minimum.
  • If a site has no mass storage system (like USCMS T2s), then one may want to force duplicates of files so no files are lost if a disk / pool fails for some reason.
  • This can be used as a clever way to drain a pool of all its files; see the section at the end.
If one of these three reasons intrigues you, better documentation can be found at the official website.  This webpage is meant to document how to get the Replica Manager running, rather than how it works.

Installation and configuration:

  1. Make sure you have a recent version of the dCache JARs.  Contact the people you get dCache from for a link.
  2. You need to create a batch file so dCache knows how to start the Replica Manager.  It should be placed in /opt/d-cache/config/replica.batch. Below is what I have at my site:
    set printout default 3
    set printout CellGlue none
    onerror shutdown

    check -strong setupFile
    copy file:${setupFile} context:setupContext
    import context -c setupContext
    check -strong serviceLocatorHost serviceLocatorPort

    create dmg.cells.services.RoutingManager RoutingMgr
    create dmg.cells.services.LocationManager lm \
    "${serviceLocatorHost} ${serviceLocatorPort} "

    create diskCacheV111.replicaManager.ReplicaManagerV2 replicaManager \
    "default \
    -export \
    -dbClass=diskCacheV111.replicaManager.ReplicaDbV1 \
    -configDirectory=${config} \
    -hotStart \
    -maxWorkers=10 \
    -min=2 -max=3 \
    -resilientGroupName=ResilientPools \
    -waitDBUpdateTO=60 \
    -jdbcUrl=jdbc:postgresql://<DB hostname>/<DB name> \
    -jdbcDriver=org.postgresql.Driver \
    -dbUser=<Username> \
    -dbPass=<Password> \
    -debug=true \
    "
    Some notes about this configuration:
    • Both "set printout default 3" and "-debug=true" make the logs fairly verbose.  You might want to set printout to 2 and debug to false once things are working.
    • maxWorkers is the maximum number of copy jobs that the Replica Manager will start at once.  10 might be too high or too low depending on how fast the node is and how many files need to be replicated.
    • The options -min and -max define, respectively, the minimum number of copies of a file that must exist and the maximum number of copies the Replica Manager may make.  Choose these parameters according to your needs.
    • jdbcUrl, dbUser, and dbPass should be set according to your local site parameters.  See the next step for how to create the database.
    • I use the option "hotStart", which skips a couple of lengthy sanity checks.  If your Replica Manager is acting funny, try using "coldStart" instead, and give the Replica Manager 10 - 15 minutes to start up.
    • resilientGroupName should be set to the name of a Pool Group at your site; creating this group is covered in step 4.
  3. The Replica Manager keeps track of all the replicas in a Postgres database; it's your job to make sure the database exists beforehand and has the correct schema.  Recent versions of dCache need a working Postgres database anyway, so you're halfway there; make sure you've followed the official documents (link to dCache Instructions current as of 10-31-2005) to set up Postgres.  The part pertaining specifically to the Replica Manager is below (make sure you've installed the dCache core RPM, v1.5.2-80 or higher):
    # Initialize db tables for the Resilience Manager
    # This step requires the dcache-core RPM (v 1.5.2-80 or higher) to be installed
    psql -d replicas -U srmdcache -f /opt/d-cache/etc/pd_dump-s-U_enstore.sql
  4. As of version 2.0, the Replica Manager only monitors pools in one pool group, the one named by the resilientGroupName option.  To create a group, you must be able to connect to the dCache Admin cell via ssh; see the dCache book to learn how to do this.  Below is a short session that creates the pool group "ResilientPools" and adds the pools "node1", "node2", and "node3" to it.
    $ ssh -l admin -p 22223 -c blowfish thpc-1.unl.edu
    admin@thpc-1.unl.edu's password:

    dCache Admin (VII) (user=admin)


    (local) admin > cd PoolManager
    (PoolManager) admin > psu create pgroup ResilientPools
    (PoolManager) admin > psu addto pgroup ResilientPools node1
    (PoolManager) admin > psu addto pgroup ResilientPools node2
    (PoolManager) admin > psu addto pgroup ResilientPools node3
  5. You should now be ready to start the Replica Manager:
    /opt/d-cache/jobs/replica -logfile=/opt/d-cache/log/replica.log start
    Check out the messages in /opt/d-cache/log/replica.log, and start debugging!  If you ever need to stop the manager, try:
    /opt/d-cache/jobs/replica -logfile=/opt/d-cache/log/replica.log stop
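
The -min and -max options from the batch file above can be pictured with a small Python sketch.  This is only a simplified model of the policy as described in the notes, not the Replica Manager's actual logic; the function name and the assumption that files above -max are candidates for the "reduce" command are mine.

```python
def adjustment_for(replica_count, minimum=2, maximum=3):
    """Suggest an action for one file under a simple min/max replica policy.

    Simplified illustration only: the defaults mirror "-min=2 -max=3"
    from the sample replica.batch, and the real Replica Manager may
    take other factors (pool states, free space) into account.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if replica_count < minimum:
        return "replicate"  # too few copies: another replica is needed
    if replica_count > maximum:
        return "reduce"     # assumption: extra copies may be removed
    return "ok"             # within the configured range


# A file with one copy needs replication; one with four copies is over the limit.
print(adjustment_for(1), adjustment_for(2), adjustment_for(4))
```

With -min=2, a single pool failure should still leave at least one copy of every file that had already reached the minimum.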

Using the Admin interface to the Replica Manager

You can control the Replica Manager via the admin interface.  Below is how to get to the cell:
$ ssh -l admin -p 22223 -c blowfish thpc-1.unl.edu
admin@thpc-1.unl.edu's password:

dCache Admin (VII) (user=admin)


(local) admin > cd replicaManager
(replicaManager) admin > help # Prints out a list of the possible commands.

Command Explanations

| Command | Explanation |
| --- | --- |
| ls pnfsid [<pnfsId>] | Lists the pools that currently contain the file with <pnfsId>. |
| exclude <pnfsId> | Excludes <pnfsId> from replication. |
| ls unique <pool> | Lists all of the unique pnfsIds in <pool>.  If there are no unique pnfsIds in the pool, you can take it offline without losing any files.  In my experience, this report is not always correct. |
| replicate <pnfsId> | Creates a replica of <pnfsId>. |
| update <pnfsId> [-c] | Updates the list of pools containing <pnfsId>; '-c' confirms this list with the pools. |
| reduce <pnfsId> | Reduces the number of copies of <pnfsId> if possible. |
| set pool <pool> <state> | Sets <pool> into state <state>.  See the Pool States table below for the list of states. |
| copy <pnfsId> <sourcePool> <destinationPool> | Copies the file <pnfsId> from <sourcePool> to <destinationPool> forcefully; this does not check for sufficient free space in the destination. |
| task ls | Lists the pending tasks. |
| show pool <pool> | Shows the status of pool <pool>. |
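
To make the idea behind 'ls unique <pool>' concrete, here is a minimal Python sketch of what "unique" means: a file whose only known replica sits on the given pool.  The function name, the dictionary layout, and the pnfsId strings are hypothetical; the real command works from the Replica Manager's database, not from a Python dictionary.

```python
def unique_files(pool, locations):
    """Return the pnfsIds whose only known replica lives on `pool`.

    `locations` maps each pnfsId to the set of pools holding a copy.
    """
    return sorted(pnfsid for pnfsid, pools in locations.items() if pools == {pool})


# Hypothetical replica locations for three files.
locations = {
    "pnfs-A": {"node0"},           # only copy is on node0
    "pnfs-B": {"node0", "node1"},  # replicated
    "pnfs-C": {"node2"},
}
print(unique_files("node0", locations))
```

An empty result for a pool corresponds to the "0 unique files" condition that the pool states below rely on.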

Pool States

| State | Explanation |
| --- | --- |
| Online | Normal operation; files are readable and counted by the Replica Manager. |
| Offline | Temporary stoppage.  Files are not readable, but the replicas in the pool are still counted in the total.  This is meant for when the pool is going down for a short time, but will reappear with no files missing. |
| Down | Permanent stoppage.  Files are not readable, and not included in the replica count.  This is meant for pools that are not going to come back (e.g., nodes are retired). |
| Drainoff | Transient state between online and down.  Any file whose replica count would drop below the minimum once the pool goes into the 'down' state is copied onto a different pool while the pool is in this state.  This may trigger massive replication.  To check whether the pool is ready to be put in the down state, log in to the admin interface, cd replicaManager, and use the command 'ls unique <pool>' to make sure there are 0 unique files on <pool>. |
| Offline-Prepare | Transient state between online and offline.  Any file whose replica count would drop below the minimum once the pool goes into the 'offline' state is copied onto a different pool while the pool is in this state.  This may trigger massive replication.  To check whether the pool is ready to be put in the offline state, log in to the admin interface, cd replicaManager, and use the command 'ls unique <pool>' to make sure there are 0 unique files on <pool>. |
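
The difference between the stable states, and what a drainoff has to copy, can be sketched in a few lines of Python.  This is a simplified model built only from the table above; the names are hypothetical, the transient states are deliberately left out, and the real Replica Manager may decide differently.

```python
# (readable, counted) for each stable state, transcribed from the table above.
POOL_STATES = {
    "online": (True, True),
    "offline": (False, True),   # unreadable, but replicas still count
    "down": (False, False),     # unreadable and no longer counted
}


def counted_replicas(pools_holding_file, pool_states):
    """Number of replicas of one file that still count toward the total."""
    return sum(1 for pool in pools_holding_file if POOL_STATES[pool_states[pool]][1])


def needs_copy_before_down(pool, locations, pool_states, minimum):
    """pnfsIds on `pool` that would fall below `minimum` if `pool` went down."""
    after = {**pool_states, pool: "down"}  # pretend the pool is already down
    return sorted(
        pnfsid
        for pnfsid, pools in locations.items()
        if pool in pools and counted_replicas(pools, after) < minimum
    )


# Hypothetical example: node2 is offline, so its replicas are still counted.
states = {"node0": "online", "node1": "online", "node2": "offline"}
files = {
    "pnfs-A": {"node0"},
    "pnfs-B": {"node0", "node1"},
    "pnfs-C": {"node0", "node1", "node2"},
}
print(needs_copy_before_down("node0", files, states, minimum=2))
```

With the minimum set to 1, only files whose sole counted replica is on the pool show up, which is the situation the draining recipe below relies on.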

Draining Pools

If you've read the above sections on using the Replica Manager, an obvious recipe for draining a pool should come to mind.  Suppose you want to drain a large pool, node0 in this example, of all its dCache files.  One nice solution would be to use the CopyManager to vacate the pool, as described in the dCache book.  The drawback is that the pool you copy node0's files to must have at least as much free space as node0 has space occupied; this was not a possibility at our site.  Instead, one may follow these steps:
  1. Shut down the normal Replica Manager.
  2. In the Pool Manager, create a pool group named "node0Drainoff", and add node0.  Then, add additional nodes so the sum of the free space of the additional nodes is at least the amount of space used on node0.  You might want to check out the usage info graphs on your dCache's webpage (see the documentation on the web interface).
  3. Edit /opt/d-cache/config/replica.batch.  Set the new resilientGroupName to "node0Drainoff" and the minimum number of replicas to 1.  Make sure the debug output is on and the printout is set to 3.  Remember your old settings, and restore them once this process is done.
  4. Start the Replica Manager.  Wait a couple of minutes (it takes my site about 5 min. for inventory) before the next step.  Watch the logfiles to see when things settle down.  I usually try the next step once I see a message like this:
    05/09 23:04:24 Cell(replicaManager@replicaDomain) : DEBUG: runAdjustment - scan DrainOff
  5. Log in to the Replica Manager through the admin interface.  Type 'ls unique node0' and make sure there are unique files on this pool.  If it comes up as zero, and you think it should be nonzero, you might not have waited long enough.  Restart the Replica Manager and try again.  This does not seem to work every time for me.
  6. After you've confirmed you have unique files on the pool, type 'set pool node0 drainoff'.  This will cause the Replica Manager to trigger the replication of all the unique files on your pool.
  7. Watch the log /opt/d-cache/log/replica.log; you should see debug output stating when jobs are started and when they are finished.  Wait until all the jobs have finished.
  8. Shut down the pool.  Wait for a few hours (or a few days, depending on how much you trust this process) to make sure you don't run into any missing files.  I have a script that will confirm that dCache is in a healthy state; the beta version is here.
  9. Once you're convinced all the files have been replicated, you can clean out the "data" directory on the pool's file system, physically deleting the files.
  10. Restore your old settings in /opt/d-cache/config/replica.batch.  Restart the original Replica Manager.
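
Two of the checks in this recipe are easy to script.  Step 2 asks that the additional pools have enough combined free space for everything stored on node0, and step 4 suggests waiting for the "scan DrainOff" message in the log.  The Python sketch below covers both; the function names and the byte figures are hypothetical, and the space numbers are assumed to be read by hand from the usage graphs rather than fetched from dCache.

```python
# Message quoted in step 4; seeing it suggests the inventory has settled.
DRAINOFF_MARKER = "runAdjustment - scan DrainOff"


def enough_free_space(used_on_drained_pool, free_on_other_pools):
    """True if the other pools can jointly hold the drained pool's data.

    Both arguments use the same unit (bytes here); `free_on_other_pools`
    maps pool name to free space.
    """
    return sum(free_on_other_pools.values()) >= used_on_drained_pool


def inventory_finished(log_lines):
    """True once any log line contains the marker message from step 4."""
    return any(DRAINOFF_MARKER in line for line in log_lines)


GB = 1024 ** 3
# Hypothetical figures: node0 holds 800 GB; node1 and node2 have 500 GB and 400 GB free.
print(enough_free_space(800 * GB, {"node1": 500 * GB, "node2": 400 * GB}))

sample_log = [
    "05/09 23:04:24 Cell(replicaManager@replicaDomain) : DEBUG: runAdjustment - scan DrainOff",
]
print(inventory_finished(sample_log))
```

To use the second check on a live system, one could pass it the lines of /opt/d-cache/log/replica.log; it is still worth confirming with 'ls unique node0' as described in step 5.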
