Utilizing the Billing Graphs Database
A summary of the data available in the billing graph schema
Schema
If you install the billing graphs, you must also upgrade the billing graphs DB schema. This schema upgrade allows much more painless queries of the dCache billing database through the SQL view simply called "info". Info has the following columns:
| Column Name: | Type: | Notes: |
|---|---|---|
| datestamp | timestamp | The timestamp of when this transaction (read: data transfer) ended |
| protocol | string | The protocol string; possibilities are GFtp-1.0.0 or DCap-3.0.0. |
| isnew | boolean | True if the file is new (the transaction was a write); false if the file was not new (transaction was a read) |
| local | string | The local dCache pool involved in the transfer. |
| remote | string | The remote client involved not in dCache. Note: for GridFTP third party transfers, this is the client, not the remote GridFTP server |
| door | string | The door (server) used for the transfer |
| owner | string | The owner of the transfer; for GridFTP, this is the client's DN; for DCap, this is "unknown" |
| file | string | Filename involved in the transfer; if no filename is available, this is the PNFS ID; if neither is available, this is simply "unknown" |
| pool_error | int | The pool error code (0 for no errors) |
| door_error | int | The door error code (0 for no errors) |
| pool_message | string | Any error messages coming from the pool |
| door_message | string | Any error messages coming from the door |
| connectiontime | float | The amount of time that data was transferred for; in milliseconds |
| transfersize | float | The number of bytes transferred in total |
We now document a few common questions which can be answered using the billing database.
Queries
Average Transfer Rate by Remote Site and GridFTP server
select date(datestamp) as date, door, remote, (average(transfersize/connectiontime))::int as kbps from info whereThis shows the average transfer rate for between your GridFTP servers and T1 FTS servers. This is great for debugging TCP rates.
remote ~ 'gridpp|pic|fnal|fzk|in2p3|tw|cern|cnaf' and
datestamp > '2008-06-06'
group by date, door, remote
order by date desc
limit 20;
Heavily accessed CMS directories
SELECTThis shows the number of reads coming out of each of the directories (roughly corresponding to a dataset) owned by CMS.
substring(file from 'cms/store(.*/.*)/.*/') as dataset,
count(*) as "Number of Reads"
FROM info
WHERE
datestamp >= :starttime AND
datestamp < :endtime AND
protocol='DCap-3.0' AND
file ~ 'cms'
GROUP BY dataset
ORDER BY "Number of Reads" desc
LIMIT 20
Number of errors, broken down by pool
SELECT
local,
trunc( date_part('epoch', datestamp) / (:span) ) * :span,
SUM(CASE WHEN pool_error!=0 THEN 1 ELSE 0 END) as failures
FROM
info
WHERE
datestamp >= timestamp :starttime - interval '6 hours' and
datestamp < :endtime and
local ~ :poolname AND remote ~ :remote AND protocol ~ :protocol
GROUP BY
trunc( date_part('epoch', datestamp) / (:span) ) * :span,
local
ORDER BY 1, 2 asc
Total number of gigabytes transferred by protocol
select
split_part(protocol, '-', 1) ,
sum(transfersize)/1024^3
from info
where datestamp >= :starttime
AND datestamp < :endtime
AND True
group by split_part(protocol, '-', 1)
Overall, these queries are a small selection of what's available - it's a "sky's the limit" kind of situation. Future work we'd like to see
- List of least recently used datasets.
- Means to discover "abnormally slow" TCP connections.
- Categorization of failure modes.