XRootD Data Popularity Migration to Hadoop
Main Page
Main page describing all CMS data popularity tools: https://twiki.cern.ch/twiki/bin/view/CMS/CMSDataPopularity
PAGE UNDER CONSTRUCTION
Goal
Here we describe why and how we want to address this migration.
The Dashboard XRootD monitoring system is going to migrate from Oracle to Hadoop.
CMS XRootD data popularity can profit from this migration by extending the data popularity monitoring to all WLCG sites, without having to maintain the data collection workflow.
Details about this work are reported in
Description
Documentation
Input datasets
- XRootD popularity requires the following information:
  - the catalog of files/blocks/datasets in PhEDEx
  - the XRootD file access logs
    - for EOSCMS, these are collected in two ways: through EOSCMS logfiles, or through XRootD monitoring
    - for any other XRootD server in AAA, these are collected through XRootD monitoring
- These datasets are now regularly imported into the Hadoop cluster. For more details see CMSComputingAnalyticsDatasets.
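As an illustration of how the two inputs fit together, the following is a minimal sketch in plain Python, not the production workflow. It joins access records against a file catalog and aggregates per dataset. The record layouts (file name, block, dataset) and (file name, user, bytes read), the sample values, and the function name dataset_popularity are all assumptions made for this example; the real PhEDEx catalog and XRootD log schemas may differ.

```python
from collections import defaultdict

# Hypothetical catalog rows: (file name, block, dataset).
CATALOG = [
    ("/store/data/A/file1.root", "/A/Run1/RECO#b1", "/A/Run1/RECO"),
    ("/store/data/A/file2.root", "/A/Run1/RECO#b2", "/A/Run1/RECO"),
    ("/store/data/B/file3.root", "/B/Run1/AOD#b1", "/B/Run1/AOD"),
]

# Hypothetical access records: (file name, user, bytes read).
ACCESSES = [
    ("/store/data/A/file1.root", "alice", 100),
    ("/store/data/A/file2.root", "bob", 250),
    ("/store/data/A/file1.root", "alice", 50),
    ("/store/data/B/file3.root", "carol", 10),
    ("/store/user/unknown.root", "dave", 5),  # not in the catalog
]


def dataset_popularity(catalog, accesses):
    """Join accesses to the catalog by file name and aggregate per dataset."""
    file_to_dataset = {lfn: dataset for lfn, _block, dataset in catalog}

    accesses_per_ds = defaultdict(int)
    bytes_per_ds = defaultdict(int)
    users_per_ds = defaultdict(set)

    for lfn, user, nbytes in accesses:
        dataset = file_to_dataset.get(lfn)
        if dataset is None:
            # Files unknown to the catalog cannot be attributed to a dataset.
            continue
        accesses_per_ds[dataset] += 1
        bytes_per_ds[dataset] += nbytes
        users_per_ds[dataset].add(user)

    return {
        dataset: {
            "accesses": accesses_per_ds[dataset],
            "users": len(users_per_ds[dataset]),
            "bytes_read": bytes_per_ds[dataset],
        }
        for dataset in accesses_per_ds
    }


if __name__ == "__main__":
    for dataset, stats in sorted(dataset_popularity(CATALOG, ACCESSES).items()):
        print(dataset, stats)
```

In this sketch, accesses to files missing from the catalog are dropped; whether such accesses should instead be counted separately is a design choice left open here.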
XRootD Oracle Materialized View SQL
Current agent implementation
New design
Proof of concept
Here we describe the initial steps to get familiar with the aggregations in Hadoop.
First MR aggregation
Pig experience
Spark???
-- DomenicoGiordano - 2015-01-21