Adding / Removing a dCache pool
In the admin interface, do the following to add a pool for both reading and writing:
cd PoolManager
psu addto pgroup outside <poolname>
psu addto pgroup outside-write <poolname>
To remove a pool, do the reverse:
cd PoolManager
psu removefrom pgroup outside <poolname>
psu removefrom pgroup outside-write <poolname>
- To prevent a pool from being read from or written to, remove it from group outside or outside-write, respectively.
- Add the pool to group default in order to get it replicated. As a rule, replicate all pools which are under 6 months old!
- The file /home/brian/projects/MonalisaAlerts/cell-whitelist.txt maintains a list of cells which must be up. Add the cell name there.
Finally, after all changes are made, save:
save
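When several pools are added or removed at once, the commands above can be generated per pool name. The following is only a sketch: the function name `pool_cmds` is hypothetical, and it simply prints the admin-interface commands listed above so they can be pasted into the admin shell. It does not connect to dCache.

```shell
# Hypothetical helper: print the PoolManager commands for one pool.
# usage: pool_cmds add|remove <poolname>
pool_cmds() {
    action=$1
    pool=$2
    case "$action" in
        add)    verb="addto" ;;
        remove) verb="removefrom" ;;
        *)      echo "usage: pool_cmds add|remove <poolname>" >&2; return 1 ;;
    esac
    if [ -z "$pool" ]; then
        echo "usage: pool_cmds add|remove <poolname>" >&2
        return 1
    fi
    echo "cd PoolManager"
    echo "psu $verb pgroup outside $pool"
    echo "psu $verb pgroup outside-write $pool"
    echo "save"
}
```

For example, `pool_cmds add <poolname>` prints the four lines to paste, ending with `save`.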
- Category(s)
- default
Hostname for thpc-1
The hostname for thpc-1 should be set as thpc-1.unl.edu, for reverse hostname lookup. The entry should look like this:
172.16.100.2 thpc-1.unl.edu thpc-1
If "thpc-1" comes before "thpc-1.unl.edu", dCache resolves the hostname incorrectly, which prevents SRM transfers from succeeding.
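The ordering can be checked mechanically. The sketch below assumes the entry lives in a hosts-format file such as /etc/hosts (the note above does not name the file), and the function name `fqdn_first` is hypothetical. It succeeds only if the fully qualified name is the first name after the IP address on the line that mentions the host.

```shell
# Hypothetical check: succeed only if the first name after the IP address
# on the line for this host is the fully qualified name.
# usage: fqdn_first <hosts-file> <short-name> <fqdn>
fqdn_first() {
    awk -v short="$2" -v fqdn="$3" '
        /^[[:space:]]*#/ { next }                 # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i == short || $i == fqdn) {  # line mentions this host
                    found = 1
                    if ($2 != fqdn) bad = 1       # FQDN is not listed first
                    break
                }
            }
        }
        END { if (found && !bad) exit 0; exit 1 }
    ' "$1"
}
```

A possible use, under the same assumption about the file: `fqdn_first /etc/hosts thpc-1 thpc-1.unl.edu || echo "fix the hosts entry"`.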
Reboot recovery information
While I (Brian) am gone on my honeymoon, here is some information on starting up various needed services on a couple of non-standard systems:
On osg-test2, if the hardware is unexpectedly rebooted, make sure the following four services are back up:
1) Apache - it should restart automatically, and is in /etc/init.d where you'd expect it
2) Zope server. To restart:
sudo su - zope
zopectl start
3) MonaLisa repository. To restart:
sudo su - monalisa
~/MLrepository/start.sh
4) Start condor
Carl and Mako have sudo access; the passwd and shadow files from red are sync'd onto osg-test2, so your login names and passwords ought to be the same.
On osg-test1, make sure condor is started. Martin Feller, CC'd on this email, has sudo access to this server, and will be able to restore other services he needs.
Condor on osg-test1 requires the Condor daemons to be running on the worker nodes and Condor to be started on the headnode; this is done through /etc/init.d/condor.boot, as you would expect.
Also, on red.unl.edu, start the MonaLisa alerts service if alerts do not show up on the t2.unl.edu webpage:
/etc/init.d/ML-alerts start
Carl and I sat down and covered basic dCache and Phedex troubleshooting this morning. I believe that Mako already knows basic dCache functionality also.
Foundry Router and Setting up Routes
We have two 48-port patch panels (labelled module 2 and 3) on the private
network (172.16.0.0). We have one 48-port patch panel (labelled module 4)
on the public network (129.93.243.0).
The connection to the wall is on port 1 of the public network patch panel.
These nodes can connect to the private network via a gateway on the switch
with the IP address 129.93.243.228. They connect to the outside world
through the gateway on Dale's router, 129.93.239.254.
The private network can access the public network patch panel through the
router on the switch, 172.16.0.8. This way, all of the dcache nodes use
the switch to connect to the compute nodes instead of going through one
NAT. To reach the outside world, they will have to go through the NAT
(still 172.16.0.62).
To get this to work correctly, we need to set up routes on each node.
For a node on the private network:
/sbin/route add default gw 172.16.0.62 eth0
/sbin/route add -net 129.93.239.0 netmask 255.255.255.0 gw 172.16.0.8 eth0
For a node on the public network:
/sbin/route add default gw 129.93.239.190 bond0
/sbin/route add -net 172.16.0.0/16 gw 129.93.243.188 bond0
Of course, eth0 and bond0 may need to be changed. Please note that the
headnode is NOT on the switch, so its routes should not be changed.
All of the dcache nodes are channel bonded:
thpc-1 -> 4
dcache01, 3, 4
dcache-s01, s02
LibTiff 32 bit
In order to make root (the program) work for interactive users, I had to install the 32-bit version of libtiff:
rpm -i /var/www/html/SLC305-32/SL/RPMS/libtiff-3.5.7-22.el3.i386.rpm
This is probably not needed on the worker nodes unless we want to use them for interactive use someday.
--C
Setting up default routes
The trick of setting up the ganglia route in rc.local doesn't seem to work well for any of the computers attached to red.unl.edu.
Instead, create a file called /etc/sysconfig/network-scripts/route-eth1 (or route-eth0 if necessary). This will bring up a default route whenever ethX is started. Add one routing entry per line. The contents of each line are what you would put after "/sbin/route add ...". So,
/sbin/route add 239.2.11.72 dev eth1
becomes the line
239.2.11.72 dev eth1
in the file route-eth1.
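The conversion described above is mechanical, so existing "/sbin/route add ..." commands can be turned into route-ethX lines with a short filter. This is a sketch; the function name `route_lines` is hypothetical, and lines that do not start with "/sbin/route add" are dropped rather than guessed at.

```shell
# Turn "/sbin/route add ..." commands (one per line on stdin) into
# route-ethX lines by stripping the leading command.
# Lines that are not "/sbin/route add" commands are skipped.
route_lines() {
    sed -n 's|^[[:space:]]*/sbin/route[[:space:]]\{1,\}add[[:space:]]\{1,\}||p'
}
```

For example, `echo '/sbin/route add 239.2.11.72 dev eth1' | route_lines` prints `239.2.11.72 dev eth1`, which is the line to place in route-eth1.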
Getting the Prima Callouts to work right
In order to get the prima callouts to work correctly, the prima/lib directory has to be in the user's LD_LIBRARY_PATH.
To fix this, I added $VDT_LOCATION/prima/lib to the LD_LIBRARY_PATH in
the setup.sh file in the OSG directory. That way it gets picked
up on login for all users.
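One way the addition to setup.sh could look is sketched below. The function name `add_prima_lib` is hypothetical and the exact line used in the real setup.sh is not recorded here; the sketch assumes VDT_LOCATION is already set when it runs, and it avoids adding the directory twice if setup.sh is sourced more than once.

```shell
# Hypothetical sketch: append $VDT_LOCATION/prima/lib to LD_LIBRARY_PATH
# unless it is already present.
add_prima_lib() {
    if [ -z "${VDT_LOCATION:-}" ]; then
        echo "VDT_LOCATION is not set" >&2
        return 1
    fi
    prima_lib="$VDT_LOCATION/prima/lib"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$prima_lib:"*) ;;   # already in the path, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$prima_lib" ;;
    esac
    export LD_LIBRARY_PATH
}
```

Calling `add_prima_lib` once near the end of setup.sh would then be enough.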
Getting the reporting to work at DiSUN
In order for red to report to DiSUN correctly, a flag has to be manually set in the ml.properties file:
$VDT_LOCATION/MonaLisa/Service/VDTFarm/ml.properties
Change:
lia.Monitor.group=OSG
to
lia.Monitor.group=OSG,uscmst2
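The change can also be scripted, which helps if the file is regenerated by a reinstall. The sketch below is an assumption about how one might do it, not the procedure that was used: the function name `set_ml_group` is hypothetical, it keeps a copy of the previous file as `<file>.bak`, and it only rewrites a line that is exactly `lia.Monitor.group=OSG`.

```shell
# Hypothetical helper: switch lia.Monitor.group from OSG to OSG,uscmst2.
# usage: set_ml_group <path-to-ml.properties>
set_ml_group() {
    file=$1
    # Keep the previous version, then rewrite the file from that copy.
    cp "$file" "$file.bak" &&
    sed 's/^lia\.Monitor\.group=OSG$/lia.Monitor.group=OSG,uscmst2/' "$file.bak" > "$file"
}
```

It would be run as `set_ml_group $VDT_LOCATION/MonaLisa/Service/VDTFarm/ml.properties`. Running it a second time leaves the setting unchanged, although the `.bak` copy is then overwritten with the already-edited file.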