Evaluation of ElasticSearch

Current evaluation

Hassen has performed an evaluation of Elasticsearch with the Aggregation module, comparing the results with last year's evaluation conducted by David. Results show that EAG is faster than EIG, EQG and ENG.

  • David's presentation from last year: https://indico.cern.ch/event/268067/contribution/0/material/slides/1.pdf
  • Hassen's presentation: https://dl.dropboxusercontent.com/u/28784243/ES_v6.pdf

Development Cluster

  • Application Server: dashb-ai-613.cern.ch (4GB/2VCPU)
  • Search: dashb-ai-655.cern.ch (8GB/4VCPU)
  • Master: dashb-ai-654.cern.ch (4GB/2VCPU)
  • Data nodes: p05496706h02393/p05496706g87265 (physical node: 62GB/32 cores)

Repository

ES based UI

Next Steps 22-01-2015

  • Add to the twiki the curl commands used for both the transfer and matrix plots, with the resulting JSON as an attachment.
  • Repeat the read tests for the 6- and 12-month indexes, selecting intervals distributed across the whole index time range, and update the tables.
  • Investigate how to aggregate results on different time bins (e.g. 1 hour, 1 day) using ES functionality.
  • Perform tests with concurrent reads and bulk writes (a minimal sketch of the update job follows this list):
    • on the 6-month index,
    • create a cron job that every 10 minutes updates the last 48 hours of data (~ 200K rows),
    • repeat the read-test measurements on all the different time intervals,
      • try to test both in sync and out of sync with the actual update operation.
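
A minimal sketch, in Python, of the 10-minute update job, assuming the 6-month index used in the tests, the dev-cluster endpoint that appears in the curl examples on this page, and a hypothetical fetch_last_48h() helper that yields (id, document) pairs from the aggregation layer:

import json
import requests

ES = "http://dashb-es-dev.cern.ch:9250"   # dev-cluster endpoint from the curl examples
INDEX = "dashboard_6months"               # assumed name of the 6-month index
CHUNK = 5000                              # documents per _bulk request

def bulk_update(docs):
    # Build the newline-delimited _bulk body; the "index" action overwrites
    # an existing document with the same _id, bumping its _version.
    lines = []
    for doc_id, doc in docs:
        # "stats" is an assumed type name
        lines.append(json.dumps({"index": {"_index": INDEX, "_type": "stats", "_id": doc_id}}))
        lines.append(json.dumps(doc))
    resp = requests.post(ES + "/_bulk", data="\n".join(lines) + "\n")
    resp.raise_for_status()

# Run from cron every 10 minutes:
docs = list(fetch_last_48h())             # hypothetical helper, ~200K (id, doc) pairs
for i in range(0, len(docs), CHUNK):
    bulk_update(docs[i:i + CHUNK])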

Next Steps

Concentrate on ES measurements only for now; do not measure the time for the application server.

  1. Perform a number of queries with curl for different intervals (4, 12, 24, 48, 72, 96 hours) and different filtering and grouping criteria on the current cluster, and record the results in Table 1.
  2. Change the number of shards (to 5, for example), redo point 1 and record the results in Table 2. If we see a big performance improvement we stick to 5 shards and no longer run the benchmark with 2 shards.
  3. Add 3 months of data to the cluster, redo points 1 and 2 and record the results in Table 3.
  4. Add more data nodes (one or two VMs) to the cluster, redo points 1 and 2 and record the results in Table 4.
As soon as the tables are ready, we should also measure the time it takes for the Dashboard server to encode the data and investigate an efficient way to do this; changing the way Apache runs (prefork or worker mode) might help. A minimal sketch of the timing loop for point 1 follows.
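
A minimal sketch, in Python, of the timing loop for point 1. The endpoint and query shape follow the 'Querying ES' section below (the index name would change with the dataset); the reference end time and the use of the requests library are assumptions:

import json
import time
from datetime import datetime, timedelta

import requests

URL = "http://dashb-es-dev.cern.ch:9250/dashboard_3months/_search"
END = datetime(2014, 11, 10)   # assumed reference end time, matching the examples below

for hours in (4, 12, 24, 48, 72, 96):
    start = END - timedelta(hours=hours)
    body = {"size": 0,
            "query": {"constant_score": {"filter": {"and": [
                {"range": {"period_end_time": {"from": start.isoformat(),
                                               "to": END.isoformat(),
                                               "include_lower": False}}},
                {"terms": {"vo": ["atlas"]}}]}}}}
    t0 = time.time()
    resp = requests.post(URL, data=json.dumps(body))
    # wall-clock time and response size: the two quantities recorded in the tables
    print("%3dh: %.2fs, %d bytes" % (hours, time.time() - t0, len(resp.content)))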

ES crash

  • Running 10 parallel processes reading 100MB each (a sketch of the test is given after the error output below).
  • The master node raises this error:
{"error":"ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/2/no master];]","status":503}
  • Then it becomes entirely unavailable:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/dashboard_3months/_status">GET&nbsp;/dashboard_3months/_status</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
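
A minimal sketch, in Python, of the parallel-read test that triggered the crash, assuming the dev-cluster endpoint from the 'Querying ES' section; the query body is a stand-in (the real test used an aggregation returning ~100MB per process):

import json
import multiprocessing

import requests

URL = "http://dashb-es-dev.cern.ch:9250/dashboard_3months/_search"
# Stand-in body; the actual test used a heavy aggregation query
QUERY = json.dumps({"size": 10000, "query": {"match_all": {}}})

def heavy_read(_):
    resp = requests.post(URL, data=QUERY)
    return resp.status_code, len(resp.content)

if __name__ == "__main__":
    pool = multiprocessing.Pool(10)          # 10 parallel readers
    print(pool.map(heavy_read, range(10)))   # each worker issues one heavy query
    pool.close()
    pool.join()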

Querying ES

curl http://dashb-es-dev.cern.ch:9250/dashboard_3months/_search/ -d '
{
  "size": 0,
  "from": 0,
  "query": {
    "constant_score": {
      "filter": {
        "and": [
          {"range": {"period_end_time": {"from": "2014-11-10T00:00:00", "to": "2014-11-10T01:00:00", "include_lower": false}}},
          {"terms": {"vo": ["atlas"]}}
        ]
      }
    }
  },
  "aggs": {
    "daterange": {
      "filter": {
        "and": [
          {"range": {"period_end_time": {"from": "2014-11-10T00:00:00", "to": "2014-11-10T01:00:00", "include_lower": false}}},
          {"terms": {"vo": ["atlas"]}}
        ]
      },
      "aggs": {
        "src_site": {
          "terms": {"field": "src_site", "size": 0},
          "aggs": {
            "src_host": {
              "terms": {"field": "src_host", "size": 0},
              "aggs": {
                "period_end_time": {
                  "terms": {"field": "period_end_time", "size": 0},
                  "aggs": {
                    "errors_xs": {"sum": {"field": "errors_xs"}},
                    "files_xs": {"sum": {"field": "files_xs"}},
                    "bytes_xs": {"sum": {"field": "bytes_xs"}}
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}'
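
For reference, a sketch of how the nested aggregation response of the query above can be flattened into rows (what the '# returned rows' / '# aggregated rows' columns below count); the bucket names follow the query, while the walk itself is an assumption about the client code:

def flatten(resp):
    # resp is the parsed JSON response of the aggregation query above
    rows = []
    for site in resp["aggregations"]["daterange"]["src_site"]["buckets"]:
        for host in site["src_host"]["buckets"]:
            for tbin in host["period_end_time"]["buckets"]:
                rows.append((site["key"], host["key"], tbin["key"],
                             tbin["errors_xs"]["value"],
                             tbin["files_xs"]["value"],
                             tbin["bytes_xs"]["value"]))
    return rows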

Validation queries

ES:

  • query: q = {"aggs": {"daterange": {"filter": {"and": [{"range": {"period_end_time": {"to": "2014-11-10T05:00:00", "from": "2014-11-10T01:00:00", "include_lower": False}}}, {"terms": {"vo": ["atlas"]}}]}, "aggs": {"src_site": {"terms": {"field": "src_site", "size": 0}, "aggs": {"src_host": {"terms": {"field": "src_host", "size": 0}, "aggs": {"dst_site": {"terms": {"field": "dst_site", "size": 0}, "aggs": {"dst_host": {"terms": {"field": "dst_host", "size": 0}, "aggs": {"errors_xs": {"sum": {"field": "errors_xs"}}, "files_xs": {"sum": {"field": "files_xs"}}, "bytes_xs": {"sum": {"field": "bytes_xs"}}}}}}}}}}}}}, "size": 0}

  • Returns: 1281 rows

Oracle:

  • select count (*) from t_tfrs_stats where vo = 'atlas' and period_end_time >= '10-NOV-14 01.00.00.000000000 AM' AND period_end_time <= '10-NOV-14 05.00.00.000000000 AM' group by src_site, src_host, dst_site, dst_host

  • Returns: 1290 rows

Note: the ES filter excludes the lower bound (include_lower: False) while the Oracle query uses >=, so rows falling exactly on the lower boundary may account for the small difference in counts (1281 vs 1290).

Results

1 month of data

  • Index size:
    • 2 shards: 680Mi (1.33Gi)
    • 5 shards: 725Mi (1.42Gi)
    • 8 shards: 744Mi (1.45Gi)
  • Number of docs: 4,060,513

Table1: One month of data, grouping src_site, src_host, dest_site, dest_host: Matrix plot

| *Notes* | *# total rows* | *# filtered rows* | *# returned rows* | *Time (s), 2 shards* | *Time (s), 4 shards* | *Time (s), 5 shards* | *Time (s), 8 shards* | *Data size* |
| 4 hours, VO: atlas | 18549 | 14354 | 1283 | 0.15 | 0.14 | 0.12 | 0.11 | o(k) |
| 24 hours, VO: atlas | 103238 | 73236 | 1971 | 0.25 | 0.22 | 0.19 | 0.24 | o(k) |
| 48 hours, VO: atlas | 203964 | 140677 | 2359 | 0.23 | 0.22 | 0.20 | 0.19 | o(k) |
| 1 week, VO: atlas | 874334 | 640548 | 8180 | 0.57 | 0.49 | 0.49 | 0.46 | o(k) |
| 2 weeks, VO: atlas | 1988568 | 1521254 | 8863 | 0.84 | 0.65 | 0.61 | 0.59 | o(k) |
| 4 weeks, VO: atlas | 3639643 | 2841951 | 9129 | 1.19 | 0.83 | 0.73 | 0.70 | o(k) |

Table2: One month of data, grouping src_site, src_host, dest_site, dest_host, period_end_time (10 min bins): Transfer plot

| *Notes* | *# total rows* | *# filtered rows* | *# aggregated rows* | *Time (s), 2 shards* | *Time (s), 4 shards* | *Time (s), 5 shards* | *Time (s), 8 shards* | *Data size* | *Particular resource usage* |
| 4 hours, VO: atlas | 18549 | 14354 | 11223 | 0.59 | 0.36 | 0.34 | 0.29 | 1944k | 30% of CPU usage as USER |
| 24 hours, VO: atlas | 103238 | 73236 | 60498 | 1.22 | 0.94 | 0.90 | 0.87 | 9760k | |
| 48 hours, VO: atlas | 203964 | 140677 | 117030 | 2.24 | 1.67 | 1.55 | 1.44 | 18.2M | |
| 1 week, VO: atlas | 874334 | 640548 | 528502 | 7.5 | 7.25 | 7.2 | 6.62 | 82.5M | |
| 2 weeks, VO: atlas | 1988568 | 1521254 | ? | 17.9 | 16.19 | 17.00 | 15.3 | 189M | |
| 4 weeks, VO: atlas | 3639643 | 2841951 | ? | 35.02 | 31.40 | 30.02 | 28.09 | 327M | |

3 months of data

  • Index size:
    • 5 shards: 1.93Gi (3.85Gi)
  • Number of docs: 10,720,437

Table3: 3 months of data (5 shards), grouping src_site, src_host, dest_site, dest_host: Matrix plot

| *Notes* | *# total rows* | *# filtered rows* | *# aggregated rows* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 24 hours, VO: atlas | 103238 | 73236 | 1971 | 0.19 | o(k) | |
| 1 week (last), VO: atlas* | 874334 | 640548 | 8180 | 0.52 | o(k) | |
| 1 week (random), VO: atlas* | 1181027 | 963326 | 6083 | 0.52 | o(k) | |
| 4 weeks (last), VO: atlas* | 3639643 | 2841951 | 9129 | 1.01 | o(k) | |
| 4 weeks (random), VO: atlas* | 1181027 | 963326 | 8507 | 0.88 | o(k) | |
| 3 months, VO: atlas | 10474568 | 7865270 | 9809 | 1.57 | 1345k | |

Table4: 3 months of data (5 shards), grouping src_site, src_host, period_end_time (10 min bins): Transfer plot

| *Notes* | *# total rows* | *# filtered rows* | *# aggregated rows* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 24 hours, VO: atlas | 103238 | 73236 | 12442 | 0.47 | 1987k | |
| 1 week (last), VO: atlas | 874334 | 640548 | 82663 | 2 | 12.9M | |
| 1 week (random), VO: atlas | 1181027 | 963326 | 78861 | 2.12 | 12.3M | |
| 1 week (random) - 1h bin, VO: atlas | 1181027 | 963326 | 15684 | 0.8 | 2570k | |
| 4 weeks (last), VO: atlas | 3639643 | 2841951 | 276176 | 6.87 | 43.1.4M | |
| 4 weeks (random), VO: atlas | 3557598 | 2573061 | 320889 | 8.24 | 50.3M | |
| 4 weeks (random) - 1day bin, VO: atlas | 3557598 | 2573061 | 3198 | 0.62 | 550k | |
| 3 months, VO: atlas | 10474568 | 7865270 | 949985 | 23.45 | 188M | |

6 months of data

  • Index size:
    • 5 shards: 3.06Gi (6.12Gi)
  • Number of docs: 16,433,503

Table5: 6 months of data, grouping src_site, src_host, dest_site, dest_host: Matrix plot

| *Notes* | *# total rows* | *# filtered rows* | *# aggregated rows* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 24 hours, VO: atlas | 103238 | 73236 | 1971 | 0.19 | o(k) | |
| 1 week, VO: atlas | 874334 | 640548 | 8180 | 0.53 | o(k) | |
| 4 weeks, VO: atlas | 3639643 | 2841951 | 9129 | 0.9 | o(k) | |
| 3 months (last), VO: atlas | 10474568 | 7865270 | 9809 | 1.74 | 1345k | |
| 3 months (random), VO: atlas | 8870502 | 6218347 | 9151 | 1.30 | - | |
| 6 months, VO: atlas | 11309203 | 16208385 | 10130 | 2.42 | 1408k | |

Table6: 6 months of data, grouping src_site, src_host, period_end_time (10 min bins): Transfer plot

| *Notes* | *# total rows* | *# filtered rows* | *# aggregated rows* | *Time (s)* | *Data size* | *Security overhead* | *Particular resource usage for queries running for more than 10 s* |
| 24 hours, VO: atlas | 103238 | 73236 | 12442 | 0.45 | 1987k | 0.46 | |
| 1 week, VO: atlas | 874334 | 640548 | 82663 | 2.08 | 41.1M | 1.98 | |
| 4 weeks, VO: atlas | 3639643 | 2841951 | 276176 | 6.90 | 82.4M | 6.4 | |
| 3 months (last), VO: atlas | 10474568 | 7865270 | 949985 | 24.79 | 188M | 22.7 | |
| 3 months (random), VO: atlas | 6954701 | 5068013 | 938178 | 19.39 | 147M | | |
| 3 months (random), 1 week bin, VO: atlas* | 6954701 | 5068013 | 1574 | 1.05 | 282k | | |
| 6 months, VO: atlas | 11309203 | 16208385 | ? | ? | 318M | | |

1 year of data

  • Index size:
    • 5 shards: 5.78Gi (11.6Gi)
  • Number of docs: 29,302,801

Table7: 1 year of data, grouping src_site, src_host, dest_site, dest_host: Matrix plot

| *Notes* | *# total rows* | *# filtered rows* | *# aggregated rows* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 24 hours, VO: atlas | 103238 | 73236 | 1971 | 0.18 | o(k) | |
| 1 week, VO: atlas | 874334 | 640548 | 8180 | 0.53 | o(k) | |
| 4 weeks, VO: atlas | 3639643 | 2841951 | 9129 | 0.87 | o(k) | |
| 3 months, VO: atlas | 10474568 | 7865270 | 9809 | 1.49 | 1345k | |
| 6 months, VO: atlas | 11309203 | 16208385 | 10130 | 2.06 | 1408k | |
| 12 months, VO: atlas | 19652212 | 29196774 | 16138 | 3.28 | 2839k | |

Table8: 1 year of data, grouping src_site, src_host, period_end_time (10 min bins): Transfer plot

| *Notes* | *# total rows* | *# filtered rows* | *# aggregated rows* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 24 hours, VO: atlas | 103238 | 73236 | 12442 | 0.48 | 1987k | |
| 1 week, VO: atlas | 874334 | 640548 | 82663 | 2.04 | 41.1M | |
| 4 weeks, VO: atlas | 3639643 | 2841951 | 276176 | 5.66 | 82.4M | |
| 3 months, VO: atlas | 10474568 | 7865270 | 949985 | 20.69 | 188M | |
| 6 months, VO: atlas | 11309203 | 16208385 | ? | ? | 318M | |
| 12 months, VO: atlas | 19652212 | 29196774 | ? | ? | ? | |

Table9 (26/02): 1 year of data, grouping src_site, src_host, period_end_time (10 min bins), 1 week of data (Np = N parallel read processes): Transfer plot

| *Notes* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 1 week, VO: atlas, 1p | 2.04 | 41.1M | |
| 1 week, VO: atlas, 2p | 3.04 | | |
| 1 week, VO: atlas, 5p | 5.73 | | |
| 1 week, VO: atlas, 10p | 9.72 | | |
| 1 week, VO: atlas, 50p | 36.72 | | Client machine is swapping |

Table10 (05/03): 1 year of data, grouping src_site, src_host, period_end_time (10 min bins)/ 4 weeks data: Transfer plot

| *Notes* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 4 weeks, VO: atlas, 1p | 6.4 | 82.4M | |
| 4 weeks, VO: atlas, 2p | 10.73 | | |
| 4 weeks, VO: atlas, 5p | 17.62 | | |
| 4 weeks, VO: atlas, 10p | 32.33 | | it was crashing before the config. change; client machine is swapping |
| 4 weeks, VO: atlas, 50p | | | |

Table11 (19/03): 1 year of data, grouping src_site, src_host, period_end_time (10 min bins)/ 1 day data: Transfer plot

| *Notes* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 1 day, VO: atlas, 1p | 0.48 | 1987k | |
| 1 day, VO: atlas, 2p | 0.50 | | |
| 1 day, VO: atlas, 5p | 0.82 | | |
| 1 day, VO: atlas, 10p | 1.08 | | |
| 1 day, VO: atlas, 50p | 3.46 | | no swapping |

Table12 (19/03): 1 year of data, grouping src_site, src_host, period_end_time (10 min bins)/ 1 week data/ 1 hour agg. : Transfer plot

| *Notes* | *Time (s)* | *Data size* | *Particular resource usage for queries running for more than 10 s* |
| 1 week, VO: atlas, 1p | 0.71 | 2570k | |
| 1 week, VO: atlas, 2p | 0.74 | | |
| 1 week, VO: atlas, 5p | 1.39 | | |
| 1 week, VO: atlas, 5p per host (2 hosts) | 1.83 | | |
| 1 week, VO: atlas, 5p per lxplus host (2 hosts) | 1.73 | | |
| 1 week, VO: atlas, 10p | 2.35 | | |
| 1 week, VO: atlas, 10p per host (2 hosts) | 3.05 | | was 1.7 --> 4. search: https://meter.cern.ch/public/_plugin/kibana/#dashboard/temp/AUzjIq8y8f4yLkYN6KjE master: https://meter.cern.ch/public/_plugin/kibana/#dashboard/temp/AUzjJUcX8f4yLkYN6KoR datanode: https://meter.cern.ch/public/_plugin/kibana/#dashboard/temp/AUzjJsEpcEp7gspd2Fsl |
| 1 week, VO: atlas, 50p | 7.16 | | no swapping. CPU/IO: https://cernbox.cern.ch/public.php?service=files&t=236d81ac7bbe13fcc710e31c3b50153f |

XRootD Dashboard details

Statistics format

Xrootd monitoring produces 3 different types of statistics, which currently end up in 3 different Oracle tables: transfer statistics, user activity and access patterns. The structure is similar (key = time bin + attributes), but not identical. The key (aka ID) is composed of:

SRC_DOMAIN DST_DOMAIN USER_PROTOCOL IS_REMOTE_ACCESS IS_TRANSFER ACTIVITY PERIOD_END_TIME

Processing

Currently, MR batch jobs run every 10 minutes, mimicking what the Oracle jobs do. On each run they "update" around 3 days of statistics, computed on 10-minute bins. As a rough estimate, this corresponds to a bulk update of around 200K rows in Oracle, or documents in ES, every 10 minutes. In its final configuration, however, the processing will look a bit different, with a continuous "real-time" flow plus less frequent MR batch jobs. From the ES perspective this corresponds to 2 different data flows: one small and frequent (e.g. O(10-100) document updates every 10 seconds) and another big but rare (e.g. 200K document updates every hour). A sketch of the two flows follows.
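
A sketch of the two flows, reusing a bulk_update() helper like the one sketched under 'Next Steps 22-01-2015'; the collector-side API is hypothetical:

import time

def realtime_flow(source):
    # small and frequent: O(10-100) document updates every 10 seconds
    while True:
        bulk_update(source.fresh_docs())   # hypothetical collector API
        time.sleep(10)

def batch_flow(source):
    # big but rare: ~200K document updates (the last ~3 days of bins) every hour
    while True:
        bulk_update(source.last_days(3))   # hypothetical collector API
        time.sleep(3600)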

'Auto' UI time bin resolution

These are the default time aggregation bins used by the UI (Transfer plots):

| *Time interval* | *Bin size* |
| last hour | 10 min |
| last 4 hours | 10 min |
| last 12 hours | 1 hour |
| last 24 hours | 1 hour |
| last 48 hours | 1 hour |
| last 7 days | 1 day |
| last 14 days | 1 day |
| last 28 days | 1 day |

Nevertheless, since the user is able to choose the preferred bin resolution, it would be good to have read tests with time aggregation not only for the default resolution but for all the options. The default mapping is sketched as a function below.
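
The default mapping from the table above, written as a small function (a sketch; returning ES date_histogram interval strings is an assumption about how the UI would consume it):

def auto_bin(interval_hours):
    # default UI bin resolution as a function of the selected time window
    if interval_hours <= 4:
        return "10m"   # last hour, last 4 hours
    if interval_hours <= 48:
        return "1h"    # last 12/24/48 hours
    return "1d"        # last 7/14/28 days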

Dataset details

  • Kibana dashboard
  • Time range: 2014-11-01 00:00:00 to 2015-04-28 12:40:00
  • Stats:
    • Count: 15 569 185 (replicas not included)
    • Size: 2 448 335 519 bytes (replicas not included)
  • Settings:
    • 5 shards, 1 replica (+1 master)
  • key fields, ordered, to concatenate into the ID (a sketch of building the ID follows this list):
    1. SRC_DOMAIN (string)
    2. DST_DOMAIN (string)
    3. USER_PROTOCOL ( always "xrootd")
    4. IS_REMOTE_ACCESS (0 or 1)
    5. IS_TRANSFER (0 or 1)
    6. ACTIVITY ('u' or 'r'. Any other possibilities?)
    7. PERIOD_END_TIME (datetime, 10-minute bins)
  • value fields, sum when aggregating, order not important:
    • ACTIVE (should it be averaged instead of added?)
    • ACTIVE_TIME (double)
    • BYTES (double)
    • FINISHED (int)
  • metadata field, ignored:
    • UPDATE_TIME (datetime)
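
A sketch of building the document ID from the ordered key fields; the '#' separator is an assumption, and the hashed variant corresponds to the id experiments mentioned in the 30-04-2015 minutes below:

import hashlib

KEY_FIELDS = ("SRC_DOMAIN", "DST_DOMAIN", "USER_PROTOCOL",
              "IS_REMOTE_ACCESS", "IS_TRANSFER", "ACTIVITY",
              "PERIOD_END_TIME")

def doc_id(stat, hashed=False):
    # stat maps field names to values; PERIOD_END_TIME is the 10-minute bin
    raw = "#".join(str(stat[f]) for f in KEY_FIELDS)
    return hashlib.sha1(raw.encode()).hexdigest() if hashed else raw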

Aggregation query

Aggregations on all key fields except USER_PROTOCOL and DST_DOMAIN; this effectively sums over all DST_DOMAIN. A parametrized version is sketched after the query block.

agg_query:
  size: 0
  query:
    filtered:
      filter:
        range:
          PERIOD_END_TIME:
            to: "2013-12-11T00:00:00"
            from: "2013-11-10T00:00:00"
            include_lower: False
  aggs:
    src_domain:
      terms:
        field: SRC_DOMAIN
        size: 0
      aggs:
        is_remote_access:
          terms:
            field: IS_REMOTE_ACCESS
            size: 0
          aggs:
            is_transfer:
              terms:
                field: IS_TRANSFER
                size: 0
              aggs:
                activity:
                  terms:
                    field: ACTIVITY
                    size: 0
                  aggs:
                    period_end_time:
                      date_histogram:
                        field: PERIOD_END_TIME
                        interval: hour
                      aggs:
                        active:
                          sum:
                            field: ACTIVE
                        finished:
                          sum:
                            field: FINISHED
                        bytes:
                          sum:
                            field: BYTES
                        active_time:
                          sum:
                            field: ACTIVE_TIME
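
A sketch of building the query above programmatically, so that the date_histogram interval and the time range can be varied per test; the builder function is an assumption, but the body it produces matches the structure above:

def agg_query(start, end, interval="hour"):
    # innermost level: per-time-bin sums of the value fields
    sums = {name.lower(): {"sum": {"field": name}}
            for name in ("ACTIVE", "FINISHED", "BYTES", "ACTIVE_TIME")}
    aggs = {"period_end_time": {"date_histogram": {"field": "PERIOD_END_TIME",
                                                   "interval": interval},
                                "aggs": sums}}
    # wrap the terms aggregations from the innermost level outwards
    for field in ("ACTIVITY", "IS_TRANSFER", "IS_REMOTE_ACCESS", "SRC_DOMAIN"):
        aggs = {field.lower(): {"terms": {"field": field, "size": 0}, "aggs": aggs}}
    return {"size": 0,
            "query": {"filtered": {"filter": {"range": {"PERIOD_END_TIME": {
                "from": start, "to": end, "include_lower": False}}}}},
            "aggs": aggs}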

Time ranges

end_date: 2015-04-28T11:00:00

Indexing intervals:

  • 4 hours
  • 12 hours
  • 1 day
  • 3 days
  • 7 days

Aggregation query intervals:

  • 7 days
  • 30 days
  • 90 days
  • 365/2 days

Result plots

MoreAggregationResults

Minutes from meetings

11-12-2014:

  • Add to the twiki the curl query used and the JSON output of Elasticsearch over a short period of time (of the order of minutes) in order to validate the rows returned after the aggregation
  • In the existing tables, add the number of records returned after the aggregation
  • Import 3 months of data with 5 shards only, without checking 2 and 8 shards, and redo the same queries
  • Pablo suggested that at a later stage we could split the index into daily or weekly indices, depending on the size of the index

22-01-2015:

  • Hassen presented query time results on different index sizes (1-3-6-12 months); the results are very promising:
    • for the same time window (4 hr, 24 hr, etc.) the elapsed time is ~ constant across different indexes
    • the elapsed time seems ~ linear in the number of returned rows
  • Note: the tests always queried ~ the most recent data in the index; they will be repeated with intervals spread across the whole index.
  • Hassen mentioned he experienced for the first time a cache behaviour in ES: the second query hit was much faster, the reason is to be understood.
  • In case of long time selections, the result is of the order of tens or hundreds of MB. Since it is not feasible to handle such an amount of data for client-side visualization, solutions such as aggregation on different time bins (hourly/daily) are needed:
    • ES functionality may be used to do that; investigation needed
    • multiple time binnings can be computed by the aggregation layer and inserted into different ES indexes
  • Time-based index partitioning will be reconsidered after the performance evaluation with concurrent read/write.

12-02-2015:

  • Ivan reported that he installed a security plug-in on his local test cluster, but did not have time to check how it works. He will have some results by the next meeting in one week.
  • Hassen presented query time results comparing query performance for time ranges selected from the latest data and from a random part of the index. No difference so far, which is good. Very good results using the built-in Elasticsearch time aggregation.

    • Hassen was asked to repeat similar tests, but selecting several intervals (of the same length) inside the index and to create plots showing the values presented in the table columns, in order to make correlations more evident.
    • In order to simplify the tests, the same index (1 year) should be used for all tests.
    • The first hit should be taken into account, though it would also be interesting to compare it with the second one.
    • Tests of 3 months (and more) with 10-minute aggregation might not make much sense: first, it is not a real use case; second, the result might be misleading as it is network (bandwidth) driven.
  • Hassen experienced a problem simulating multi-user access. With 10 parallel threads reading 100MB each, the server crashed. The reason needs to be understood (out of memory?). Ivan will investigate the log files. After that Eddie will upgrade the test cluster, increasing the number of cores of the master node from 2 to 4. Then Hassen will repeat the test trying different scenarios (gradually increasing the number of threads, gradually increasing the amount of data per thread).
  • For the test of concurrent reading/writing, Hassen will create a script which will simulate updates from the collector.

19-02-2015:

  • Ivan reported on the open-source security solution; apparently it works, but we need to wait for the release, which is expected by the end of this month.
  • Long discussion about the number of shards per index. Why do we have so many (8) while the size of the index is small? Because we have it like this in our default template; it might have been driven by the number of nodes. Ivan will check whether a big number of small shards has a bad impact on the recovery time.
  • Follow-up on the failure caused by heavy parallel reading threads. Hassen thinks that we have a hard limit on the overall volume of accessed data of ~1GB: while 10 threads of 70MB each is fine, 20 threads of 70MB each again caused the cluster to crash. Ivan clarified that the search node ran out of heap. Eddie will double the heap of the search node and then the test can be repeated. However, this does not solve the problem of the crash of the cluster; there should be a protection against too-heavy queries. Eddie will investigate with the AI colleagues and with the ES development community.
  • Pablo suggested that we try Elasticsearch Curator for optimization of the old indexes. Ivan agreed to do it.
  • Luca suggested to test whether the access time (accessing the same data sample) stays the same independently of the number of threads. Hassen agreed to test it.

06-03-2015:

  • Lionel decreased the size of the smallest index (yearly instead of daily) and asked Eddie to put back the configuration of 8 shards for all indexes, which Eddie did.
  • Pablo modified partitions on the cluster and will make the number of replicas per index consistent: always 2 copies, 1 master and 1 replica.
  • Follow-up on the failure caused by heavy parallel reading threads. It is not yet clear whether the configuration change provided by Eddie (after googling) solved the problem of the cluster crashing when 1GB of data is retrieved through many parallel threads. Hassen's tests did not reproduce exactly the situation which caused the crash before, and the performed tests did not cause a crash. These results are apparently not representative of Elasticsearch performance, since the seemingly much worse numbers in the case of many parallel threads are rather determined by the time required to transfer the output over the network. The following tests have to be performed:
    • Repeat exactly the same conditions which caused the cluster to fail, to make sure that the cluster no longer crashes
    • Repeat the multi-threaded tests for a realistic use case, for example with hourly or daily statistics aggregated by Elasticsearch, so that the output is no more than 1-2MB
    • The same as the previous, but from multiple hosts
  • Need to test what security overhead we have and whether it is the same independently of the query (size of the output set).
  • Pablo suggested that we try Elasticsearch Curator for optimization of the old indexes. Ivan agreed to do it.

12-03-2015:

  • Eddie: Updated number of shards to 2 (for newly created indices) and number of replicas to 1 (for all indices) for the mig* indices after discussing it with Lionel and Pablo.
  • Ivan:
    • Curator looks useful. Simple utility, no new logic, just helps to select date-series indexes and apply normal ES commands to them.
      • Where to run it from? It's pip install, and supports basic http auth.
    • https://github.com/abronner/elasticsearch-monitoring
      • Interesting idea, very simple: cron + python script (query _nodes/stats, save json in ES) + kibana dashboard
      • Outdated, but has newer forks.
      • Lemon uses Kibana now too. Can this be done with Lemon?
        • Pro: lemon daemon, not a cron job. Monitoring data not stored on the cluster itself.
        • Con: More work: lemon needs every metric to be defined. Some of the metrics from _nodes/stats are already in Lemon.
  • Pablo:
    • Installed the elasticsearch-reindex tool for reindexing documents on both clusters. It can also be used to move data from one cluster to another

19-03-2015:

  • Pablo:
    • Rearranged several indexes using elasticsearch-reindex (up to September 2014). One should be careful: some data was lost.
    • After a discussion on whether it is reasonable to have 2 shards even for very small indexes, it was decided that 2 might be fine.
  • Ivan:
    • Curator looks useful.
    • Elastic Defender is not yet released.
  • Hassen:
    • Performed exactly the same test which caused the crash of the cluster before Eddie's configuration changes (see previous meetings); no crash so far, though it is not guaranteed that we simply did not reach another crash limit. However, if we consider our usage pattern, we should be fine.
    • From the test results it looks like there is no performance overhead caused by our current security implementation.
    • Overall, the multi-threaded test results look good.
    • Next steps:
      • Perform multi-threaded tests from at least 2 different hosts
      • Prepare an overview of all performed tests in the form of an informal document uploaded to this twiki page
      • It would be good to present numeric results in the form of plots, so they are easier to follow

02-04-2015:

  • Pablo:
    • Created a backup of the development cluster. For 20GB of data it took ~ 1 hour; the incremental backup after 1 day took just 3 minutes. The estimate is that on the production cluster the first backup could take a whole day. Maybe Ivan can do it during the CERN Christmas break. Ivan did not join the meeting today.
    • One thing Pablo noticed is that the user id was not the same on all machines of the cluster. Since we mount a shared file system, we want the user to be able to write there, and therefore the user id should be the same on all machines of the cluster. For this, the puppet manifest should be changed.
  • Hassen:
    • Looked through the plots created by Hassen.
    • For the next meeting in two weeks Hassen will create another plot combining measurements without aggregation and with hourly, daily and probably weekly aggregation, out of the yearly index, for various time ranges, indicated on the horizontal axis. Next to the dots on the plot it would be nice to see the amount of data. The header should contain the number of the table the data comes from, as well as a more detailed explanation of the test content.
    • For the parallel reading test, Hassen will repeat the tests. In addition to what he did, he was asked to try 5 threads from each of two nodes (5 threads per node). He was also asked to complement the test results with Lemon snapshots of the cluster nodes during the exact time of the test.

23-04-2015:

  • Ivan:
    • Automatic daily backups are enabled on both the test and production clusters
    • The user id is fixed on both clusters
    • Curator is installed; it allows selecting a subset of indexes based on date and performing common operations on all of them
    • Monitoring is enabled. It should be run manually to collect certain monitoring statistics about the cluster (memory, etc.); Kibana can be used for visualization of this info. A cron job which invokes the monitoring at regular time intervals is set up on the test cluster
  • Eddie:
    • Elasticsearch Defender is not yet released
  • Hassen:
    • Next steps:
      • Create a plot with data size on the X axis and the time the query takes on the Y axis, placing there the data for all available aggregations. The scale on the X axis should be respected, so we can see whether the distribution is linear. Add measurements with 6, 9 and 12 months, excluding 10-minute aggregation.
      • For writing/update purposes, the data in the cluster needs to be reorganized with new indexes corresponding to the unique constraints in Oracle. Luca will provide the necessary info for the new index definition. Ivan agreed to perform the task.

30-04-2015:

  • Present: Luca, Ivan, Pablo
  • Luca:
    • The schemas for xrootd and FTS are not identical. He sent the key to Ivan.
    • We have 6 months of data for FAX.
  • Ivan:
    • NFS stale mountpoints prevent puppet from running
    • Inserted FTS data, checking the insertion speed depending on the id. The id can be either random or computed on the client side:
      • with a random id:
      • with an id concatenating the key values
      • with a hash: seems better for FTS, worse for FAX!!
    • 30 minutes to insert a bulk of 3 months of FTS data
    • Created a dashboard to monitor the cluster during the insertion of documents. Available here
    • Every 10 minutes we are re-inserting the last 3 days of data (150k docs)
      • The _version of the document increases! Good!!
  • Hassen:
    • Created the plot of query time vs. data size as a line plot; it would be better to make it a scatter plot
  • Actions:
    • Ivan will keep on writing, and create a table with the writing performance
    • Hassen: repeat the read tests on FAX, with the writes also running
      • read only the last 3 days, changing the period to be read
    • Hassen: create the scatter plot

07-05-2015:

  • Ivan worked through the set of plots showing performance of writing and reading tests.
    • According to the results, writing does not have an important impact on the reading performance. Good.
    • Questions:
      • Why do we have such big fluctuations of the values, even those measured by ES? Though it is not clear how the ES measurement is performed (server-side, or when the result is completely returned to the client).
    • Next steps:
      • Repeat the same set of tests but using over-time aggregation for reading, to confirm that this won't have an impact on the performance
      • Ensure that we have regularly running updates for days, not just for a few hours
      • Concentrate on the realistic use cases (regarding queries and, correspondingly, the amount of returned data). Luca added info on the twiki describing a realistic scenario
      • Add plots with the number of records (not just the data volume)

21-05-2015:

  • Ivan worked through the set of plots showing performance of writing and reading tests.
    • It was again confirmed that writing does not have an important impact on the reading performance. Good.
  • Next steps:
    • Ivan was asked to check whether the way he currently runs the query takes into account the fact that the 10-minute time binning is part of the raw data structure, and takes advantage of it.
    • Ivan was also asked to put the code he used for his tests into the repository before he leaves
    • We need to enable constant updates of the data in ES from Hadoop. Luca will assign this task to the people working with him on Hadoop.
    • We also need to integrate ES with the UI. We will ask Hassen whether he can take care of this task.

29-05-2015:

  • Next steps:
    • It was agreed that Ivan and Hassen will concentrate on the integration of the Elasticsearch storage with the UI. There is a prototype implemented by David for DDM data. We need to understand whether the way the Elasticsearch queries are implemented by David is the only possible one; if not, we will evaluate various alternatives, comparing their performance. The next meeting is a sort of brainstorming to have a closer look into David's query implementation.
    • Luca and the Hadoop team will start working on enabling a constant input flow to Elasticsearch from Hadoop

05-06-2015:

  • Hassen walked through the code examples. Several questions/comments/suggestions :
    • need to evaluate the best way to interface Elasticsearch in the code (pycurl, python clients, etc.)
    • in the query, use a date histogram, which implies time aggregation
  • Next steps:
    • Ivan kindly agreed to set up the UI (with the help of Hassen, looking into his code examples) on top of the Elasticsearch xrootd data
    • Luca and Ivan will clarify the questions regarding the best format for data insertion
    • Ivan agreed to investigate the Search Guard security plugin, since Elastic Defender won't be supported. This task is lower priority compared to setting up the UI

11-06-2015:

  • Eddie: Performed a rolling upgrade of the production and development cluster to version 1.5.2. The process took around 2 hours and 20 minutes.

25-06-2015:

  • Ivan changed the matrix UI view of xrootd to query ES. The result is retrieved from ES very fast.
  • Pablo worked on logstash to import SSB data; the overall experience with logstash is quite good
  • Hassen has modified the FTS UI to query ES; there were some problems with no data being returned
  • Next steps:
    • Ivan will evaluate the security plugin and then work on a prototype for the transfer plots
    • Hassen will work on fixing the issue with no data being returned

02-07-2015:

  • Eddie welcomed Javier as a new member of the team. Javier will work on porting ATLAS job monitoring into ES
  • Ivan ported the transfer plots to ES, with pretty fast results
  • Ivan had a look at the security plugin; there are some limitations, such as per-index auth. It seems that we need to combine the security plugin with our own security mechanism.
  • Hassen has fixed some issues in the FTS UI but unfortunately there are still some bugs here and there
  • Next steps:
    • Ivan will provide an architecture plan for the security in our cluster
    • Hassen will iron out the few remaining bugs
    • Javier will investigate the ATLAS Job Monitoring DB schema to come up with an ES index with all the required information

09-07-2015:

  • Pablo: Modified the SSB UI to query ES. You can check the 'ES based UI' section for the links to SSB with ES.
  • Eddie: Created a new ES user for ATLAS job monitoring
  • Ivan: Presented an architectural plan for the security mechanism that combines the Search Guard plugin and our proxy solution. At the moment Search Guard is quite immature and we should wait a bit before we switch to it.
  • Javier: Created a template for ATLAS job monitoring; currently working on the PanDA collector to transform the raw data from the PanDA schema to our common job monitoring schema.
  • Luca: Uthay has started working on feeding live data into ES. One index with 3 different types of data.
  • Next steps:
    • Hassen will iron out the few remaining bugs, as he was busy last week
    • Javier will continue working on the transformation of data in the collector
    • Luca will provide an update on Uthay's progress

16-07-2015:

  • Eddie: Created a new user account for wdtmon giving access to the wdtmon* indices. There was a problem with the search node of the dev cluster due to a full filesystem; increased the size of the partition.
  • Hassen: Having trouble with the aggregation within Elasticsearch. 10-minute bins and hourly granularity are exactly the same between ES and Oracle; weekly and daily aggregation appears to be wrong in ES.
  • Luca: Uthay created an index for xrootd data. Hopefully by next week we will have live data.
  • Javier: Progressed with the collection; the transformation of PanDA fields to dashboard fields is finished. There are a couple of small inconsistencies but Javier is working on them. Data are properly inserted in the cluster, and hopefully by next week we will have live data in the cluster.

23-07-2015:

  • Eddie & Javier: Most, if not all, of the collectors have been ported to writing into Elasticsearch. We now have an index with different types: one to resolve sites and patterns, one for exit codes, one for the site topology, one for jobs, etc. We now have live data and the collectors are running fine. Assuming that there won't be any problems in the collectors by Monday, Javier will start working on the UI to query data from Elasticsearch.
  • Luca: Live data for xrootd; everything seems to be working fine except that one of the insertion jobs is taking 7 minutes, which needs to be investigated. Work on the UI will start in August.
  • Eddie asked Ivan to check whether the artificial load for the test xrootd index is still running, and if so, to stop it as we have real load now.

30-07-2015:

  • Javier & Eddie: The collectors are running constantly; there were some errors over the last week and we worked on fixing them. Defining the strategy for missed data that are not inserted into Elasticsearch.
  • Eddie: Created a VM for Javier in order to run the collectors from there instead of his localhost.

06-08-2015:

  • Javier & Eddie: Increasing the timeout period in case the cluster is unreachable. Work on the interactive-view UI has started; a sample query was created to recreate the landing-page results of the interactive view.
13-08-2015

  • Ivan: Created a new account for sdcdam
  • Eddie: Enabled the execution of scripts on the dev cluster, as Javier needs to compute the difference between two dates using expressions. Disabled the vulnerable Groovy scripting, but apparently it is still enabled. We will completely disable scripts again and do these sorts of calculations at job-insertion time.
  • Javier: The UI is working and it is really fast. There is a small inconsistency in the total numbers that we will need to investigate. We now have a month and a half of data (60 GB). The index will be deleted in order to update the template mapping to include the new fields for the calculated values mentioned above, and to increase the number of shards from 2 to 5. Also, two collectors will run in parallel: one to feed in real-time data and one to feed in archive data from the beginning of 2015 or so.
Topic attachments

| *Attachment* | *Size* | *Date* | *Who* | *Comment* |
| 0.png | 62.4 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 1.png | 71.1 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 2.png | 67.9 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 3.png | 78.8 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 4.png | 83.8 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 5.png | 84.5 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 6.png | 94.1 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 7.png | 100.9 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| 8.png | 132.7 K | 2015-05-07 | IvanKadochnikov | ES bulk-update and aggregate-query times with xrootd monitoring data |
| ESAggLine.tiff | 33.0 K | 2015-05-07 | HassenRiahi | |
| ESAggScat.tiff | 70.8 K | 2015-05-07 | HassenRiahi | |
| ES_2.pptx | 1296.4 K | 2015-04-02 | HassenRiahi | |
| Matrix613.tiff | 364.2 K | 2015-06-24 | HassenRiahi | |
| MatrixProd.tiff | 360.0 K | 2015-06-24 | HassenRiahi | |
| TransferPlots613.tiff | 333.1 K | 2015-06-24 | HassenRiahi | |
| TransferPlotsProd.tiff | 350.5 K | 2015-06-24 | HassenRiahi | |
| result | 100.3 K | 2015-02-11 | HassenRiahi | |