DataKnowledgeCatalogTest

Introduction

This page describes the data imported into the ElasticSearch cluster at cl-analytics to test the performance of Data Knowledge Catalog (DKC) queries implemented in ElasticSearch.

Intended usage

This data is intended for performance testing of queries that are useful for DKC analysis when run on ElasticSearch.

Test query examples (in SQL):

SELECT DISTINCT DATASET_NAME FROM MARIATABLE WHERE PHYS_GROUP='PHYS' AND T_UPDATE_TIME BETWEEN TO_DATE ('2015/08/01', 'yyyy/mm/dd') AND TO_DATE ('2015/09/01', 'yyyy/mm/dd');

SELECT DISTINCT A.DATASET_NAME FROM MARIATABLE A JOIN MARIATABLE B ON A.DATASET_NAME = B.DATASET_NAME AND A.PHYS_GROUP='PHYS' AND B.DATASET_NAME='HIGGS';

SELECT DISTINCT PHYS_GROUP FROM MARIATABLE WHERE DATASET_NAME='mc12_13TeV.206473.aMcAtNloHerwigpp_UEEE4_CT10ME_bbH_yb2_tautau_lephad_200.evgen.log.e3613_tid05020254_00' AND T_UPDATE_TIME BETWEEN TO_DATE ('2015/08/01', 'yyyy/mm/dd') AND TO_DATE ('2015/09/01', 'yyyy/mm/dd');
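
The first and third queries above (a filter plus SELECT DISTINCT) could be expressed in the ElasticSearch query DSL roughly as sketched below. This is a minimal sketch and not the query actually used in the tests. It assumes that the documents carry the lower-case field names produced by the import query (phys_group, dataset_name, t_update_time), that the filtered fields are indexed as exact, non-analyzed values, and that t_update_time is mapped as a date. The helper name distinct_values_query and the bucket limit are hypothetical. Timestamps are passed in full so that the inclusive range matches the SQL BETWEEN bounds. The self-join query is not covered, since ElasticSearch has no general join.

```python
import json


def distinct_values_query(field, term_filters, time_field=None,
                          start=None, end=None, size=1000):
    """Build a search body that lists distinct values of `field`.

    Hypothetical helper: SELECT DISTINCT is approximated by a terms
    aggregation, which returns at most `size` buckets.
    """
    # Exact-match conditions (WHERE column = 'value').
    filters = [{"term": {name: value}} for name, value in term_filters.items()]
    if time_field is not None:
        # Inclusive range, mirroring SQL BETWEEN.
        filters.append({
            "range": {
                time_field: {
                    "gte": start,
                    "lte": end,
                    "format": "yyyy/MM/dd HH:mm:ss",
                }
            }
        })
    return {
        "size": 0,  # no hits needed, only the aggregation buckets
        "query": {"bool": {"filter": filters}},
        "aggs": {"distinct": {"terms": {"field": field, "size": size}}},
    }


# Sketch of the first test query.
first_query = distinct_values_query(
    "dataset_name",
    {"phys_group": "PHYS"},
    time_field="t_update_time",
    start="2015/08/01 00:00:00",
    end="2015/09/01 00:00:00",
)
print(json.dumps(first_query, indent=2))
```

The printed JSON is the body of a search request. The distinct values would be read from the keys of the buckets under aggregations.distinct in the response.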

Data volume and source

The volume of imported data is 42 MB.

The data was created using the following SQL query:

select r.phys_group as phys_group,
       r.pr_id as request_id,
       t.taskid as task_id,
       d.name as dataset_name,
       r.campaign as r_campaign,
       r.description as r_description,
       r.energy_gev as r_energy_gev,
       r.exception as r_exception,
       r.is_fast as r_is_fast,
       r.locked as r_locked,
       r.manager as r_manager,
       r.project as r_project,
       r.provenance as r_provenance,
       r.reference as r_reference,
       r.reference_link as r_reference_link,
       r.request_type as r_request_type,
       r.status as r_status,
       r.sub_campaign as r_sub_campaign,
       t.bug_report as t_bug_report,
       t.bunchspacing as t_bunchspacing,
       t.campaign as t_campaign,
       t.chain_tid as t_chain_tid,
       t.comments as t_comments,
       t.current_priority as t_current_priority,
       t.dsn as t_dsn,
       t.dynamic_job_definition as t_dynamic_job_definition,
       t.inputdataset as t_inputdataset,
       t.is_extension as t_is_extension,
       t.nfilesfailed as t_nfilesfailed,
       t.nfilesfinished as t_nfilesfinished,
       t.nfilesonhold as t_nfilesonhold,
       t.nfilestobeused as t_nfilestobeused,
       t.nfilesused as t_nfilesused,
       t.parent_tid as t_parent_tid,
       t.phys_group as t_phys_group,
       t.phys_short as t_phys_short,
       t.physics_tag as t_physics_tag,
       t.pileup as t_pileup,
       t.postproduction as t_postproduction,
       t.pptimestamp as t_pptimestamp,
       t.priority as t_priority,
       t.prodsourcelabel as t_prodsourcelabel,
       t.project as t_project,
       t.provenance as t_provenance,
       t.reference as t_reference,
       t.simulation_type as t_simulation_type,
       t.start_time as t_start_time,
       t.status as t_status,
       t.step_id as t_step_id,
       t.subcampaign as t_subcampaign,
       t.submit_time as t_submit_time,
       t.taskname as t_taskname,
       t.timestamp as t_timestamp,
       t.total_done_jobs as t_total_done_jobs,
       t.total_events as t_total_events,
       t.total_req_events as t_total_req_events,
       t.total_req_jobs as t_total_req_jobs,
       t.update_owner as t_update_owner,
       t.update_time as t_update_time,
       t.username as t_username,
       t.vo as t_vo,
       d.phys_group as d_phys_group,
       d.events as d_events,
       d.files as d_files,
       d.status as d_status,
       d.timestamp as d_timestamp,
       d.campaign as d_campaign,
       d.container_flag as d_container_flag,
       d.container_time as d_container_time
from t_prodmanager_request r, t_production_task t, t_production_dataset d
where r.pr_id = t.pr_id
  and t.pr_id = d.pr_id
  and t.taskid >= 5000000
  and r.project not in ('user','valid1','valid2','valid3','mc_evind')
  and r.phys_group in ('SUSY', 'PHYS', 'HIGG')
  and d.taskid = t.taskid
order by phys_group, request_id, task_id, dataset_name
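
This page does not describe how the rows returned by the query above were loaded into ElasticSearch. As an illustration only, the sketch below shows one way such rows could be turned into a request body for the ElasticSearch _bulk API, which expects newline-delimited JSON with an action line followed by the document itself. The index name dkc_test, the document id built from task_id and dataset_name, and the helper names are all hypothetical. Mapping types, which older ElasticSearch releases require in the action line, are omitted.

```python
import json

INDEX_NAME = "dkc_test"  # hypothetical index name, not taken from this page


def bulk_lines(rows, index_name=INDEX_NAME):
    """Yield NDJSON lines: one action line, then one source line, per row."""
    for row in rows:
        # Hypothetical id scheme: task id plus dataset name.
        doc_id = "{}:{}".format(row["task_id"], row["dataset_name"])
        yield json.dumps({"index": {"_index": index_name, "_id": doc_id}})
        yield json.dumps(row)


def bulk_body(rows, index_name=INDEX_NAME):
    """Join the lines into a _bulk request body (it must end with a newline)."""
    return "\n".join(bulk_lines(rows, index_name)) + "\n"


# Example with a single made-up row that uses the column aliases of the query.
example_rows = [
    {"phys_group": "PHYS", "request_id": 1,
     "task_id": 5000001, "dataset_name": "example.dataset"},
]
print(bulk_body(example_rows), end="")
```

Each row is a dictionary keyed by the column aliases of the query, so the field names in ElasticSearch match the names used in the test queries.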


Major updates:
-- MaksimGubin - 2015-12-15

Responsible: MaksimGubin
Last reviewed by: Never reviewed

Topic revision: r2 - 2016-05-16 - IlijaVukotic