XRootD Data Popularity Migration to Hadoop

Main Page

Main page describing all CMS data popularity tools: https://twiki.cern.ch/twiki/bin/view/CMS/CMSDataPopularity

PAGE UNDER CONSTRUCTION

Goal

This section describes why and how we want to carry out this migration.

The Dashboard XRootD monitoring system is migrating from Oracle to Hadoop. CMS XRootD data popularity can benefit from this migration by extending data popularity monitoring to all WLCG sites without having to maintain the data collection workflow. Details about this work are reported in

Description

Documentation

Input datasets

  • XRootD popularity requires the following information:
    • the catalog of files/blocks/datasets in PhEDEx
    • the XRootD file access logs
      • for EOSCMS, these are collected in two ways: through EOSCMS log files or through XRootD monitoring
      • for any other XRootD server in AAA, these are collected through XRootD monitoring
  • These datasets are now regularly imported into the Hadoop cluster. For more details, see CMSComputingAnalyticsDatasets.
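As a rough illustration of how the two inputs listed above fit together, the sketch below joins file access records against a file-to-dataset catalog and counts accesses and distinct users per dataset. It is only a minimal sketch in plain Python: the record layouts, field order, file names, and the function name dataset_popularity are all assumptions made for illustration, not the actual PhEDEx or XRootD monitoring schemas.

```python
from collections import defaultdict

# Hypothetical catalog rows: (logical file name, block, dataset).
# The real PhEDEx catalog layout is not described on this page.
CATALOG = [
    ("/store/data/A/file1.root", "/A/Run1/RECO#b1", "/A/Run1/RECO"),
    ("/store/data/A/file2.root", "/A/Run1/RECO#b1", "/A/Run1/RECO"),
    ("/store/mc/B/file3.root", "/B/Sim/AODSIM#b7", "/B/Sim/AODSIM"),
]

# Hypothetical access records: (logical file name, user, server).
ACCESSES = [
    ("/store/data/A/file1.root", "alice", "eoscms"),
    ("/store/data/A/file2.root", "bob", "eoscms"),
    ("/store/data/A/file1.root", "alice", "aaa-site-1"),
    ("/store/mc/B/file3.root", "carol", "aaa-site-2"),
    ("/store/unknown/file9.root", "dave", "aaa-site-2"),
]


def dataset_popularity(catalog, accesses):
    """Count accesses and distinct users per dataset.

    Returns (summary, unmatched), where unmatched is the number of
    access records whose file is not present in the catalog.
    """
    # Build the file -> dataset lookup once, then stream over the accesses.
    file_to_dataset = {lfn: dataset for lfn, _block, dataset in catalog}
    counts = defaultdict(int)
    users = defaultdict(set)
    unmatched = 0
    for lfn, user, _server in accesses:
        dataset = file_to_dataset.get(lfn)
        if dataset is None:
            unmatched += 1
            continue
        counts[dataset] += 1
        users[dataset].add(user)
    summary = {
        d: {"accesses": counts[d], "users": len(users[d])} for d in counts
    }
    return summary, unmatched


summary, unmatched = dataset_popularity(CATALOG, ACCESSES)
```

Counting unmatched records separately is a design choice of this sketch: it makes it visible when the access logs refer to files that the catalog does not know about, rather than silently dropping them.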

XRootD Oracle Materialized View SQL

FileToDatasetAssociation

Current agent implementation

New design

Proof of concept

Here we describe the initial steps taken to become familiar with aggregations in Hadoop.

First MR aggregation
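The aggregation used in this proof of concept is not documented here. As a generic illustration of the MapReduce pattern only, the following sketch counts accesses per dataset and day in plain Python, with an in-memory sort standing in for the Hadoop shuffle phase. The record layout (dataset, ISO timestamp), the sample values, and the function names mapper, reducer, and run_mapreduce are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input records: (dataset, ISO 8601 timestamp of the access).
RECORDS = [
    ("/A/Run1/RECO", "2015-01-20T10:00:00"),
    ("/A/Run1/RECO", "2015-01-20T12:30:00"),
    ("/B/Sim/AODSIM", "2015-01-20T09:00:00"),
    ("/A/Run1/RECO", "2015-01-21T08:00:00"),
]


def mapper(record):
    """Emit ((dataset, day), 1) for one access record."""
    dataset, timestamp = record
    day = timestamp[:10]  # keep the YYYY-MM-DD part of the timestamp
    yield (dataset, day), 1


def reducer(key, values):
    """Sum all counts emitted for one (dataset, day) key."""
    return key, sum(values)


def run_mapreduce(records):
    """Run map, sort-by-key (stand-in for shuffle), and reduce in memory."""
    mapped = [pair for record in records for pair in mapper(record)]
    mapped.sort(key=itemgetter(0))
    return [
        reducer(key, (value for _key, value in group))
        for key, group in groupby(mapped, key=itemgetter(0))
    ]


daily_counts = run_mapreduce(RECORDS)
```

In a real Hadoop job the mapper and reducer would run as separate distributed tasks and the framework would perform the sort and grouping; the in-memory version above only shows the shape of the computation.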

Pig experience

Spark???

-- DomenicoGiordano - 2015-01-21

Topic revision: r2 - 2017-08-27 - CarlVuosalo
 