WARNING: This web is no longer used. Please use PDBService.ATLASCondRep instead!
 

ATLAS Conditions Replication

Test Plan

Tests on replication of COOL data using ATONR as the source and INTR as the destination RAC
======================================================================

As COOL client we use Stefan Stonjek's client, since it puts a higher load on the server. It stores open IOVs (Intervals of Validity), which leads to many UPDATEs in order to 'close' these intervals when the next IOV arrives, because overlapping IOVs are not allowed in COOL single version tables.

The goals of the tests are:

- To understand how COOL does data insertion: is it performant, what are the limits, what is the resource usage, and so on.
- To find out whether Oracle Streams is good at replicating COOL data. In production mode we cannot accept more than tens of seconds of latency from the ATLAS online RAC 'ATONR' to the ATLAS offline RAC.
- To get a feeling for the IOV rate at which the destination sites will start to fall behind the online RAC. We are starting from the minimum data insert rate of 3 IOVs per 15 minutes into 250 COOL single version folders.
- To gain more experience in replicating data and to prepare a plan of actions for when something goes wrong at one of the destination sites.

We consider the following setup to be a realistic one, as we do not have a better estimate for the number of folders, the number of channels, the IOV rate and the payload size.

- 250 COOL single version folders
- 200 channels per folder (in reality this will vary a lot)
- 100 bytes payload
- insertion rate: should be 3 IOVs with iov_until set to infinity every 15 min, but for the tests we did not specify any time constraints, because COOL is rather slow in storing new IOVs.
- 10 clients doing the job
- We filter out all DML and DDL activity related to the xxx_IOV_SEQ tables on the source site, as we consider that we will not need them at the destination sites. That reduces the number of statements to be applied by the APPLY process and hopefully contributes to the performance (see the sketch after this list).
- With this setup, one transaction inserting one IOV into a folder consists of as many single-row UPDATEs on the IOV_UNTIL column as there are channels; in our case 200 UPDATEs plus 1 bulk INSERT of 200 records. The record length is 150 bytes.
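As an illustration of the xxx_IOV_SEQ filtering, here is a minimal sketch using a negative table rule. The table, capture process and queue names (STREAMS_CAPTURE, STRMADMIN.STREAMS_QUEUE) are illustrative, not necessarily our real setup:

BEGIN
  -- inclusion_rule => FALSE puts the rule in the negative rule set,
  -- so all DML and DDL LCRs for this table are discarded at capture time
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name     => 'ATLAS_COOL_3D.TEST3D_F0001_IOV_SEQ',
    streams_type   => 'capture',
    streams_name   => 'STREAMS_CAPTURE',
    queue_name     => 'STRMADMIN.STREAMS_QUEUE',
    include_dml    => TRUE,
    include_ddl    => TRUE,
    inclusion_rule => FALSE);
END;
/

One such rule per folder would be needed; alternatively, a single custom rule created with DBMS_RULE_ADM with a NOT LIKE '%_IOV_SEQ' condition on the object name could cover all folders at once.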

FINDINGS:

===============
17th November 2006
===============

Because of the direct import from ATONR into INTR with tablespace remapping,

impdp USERID=strmadmin@intr SCHEMAS="ATLAS_COOL_3D" NETWORK_LINK=ATONR.CERN.CH REMAP_TABLESPACE=ATLAS_COOL_GANCHO_DATA:ATLAS_COOL_GANCHO FLASHBACK_SCN=5992460472145

at all Tier1 sites I got the error "ORA-00959: tablespace 'ATLAS_COOL_GANCHO' does not exist", because the TABLESPACE clause appeared in the CREATE TABLE DDL:

ddl: CREATE TABLE "ATLAS_COOL_3D"."TRY_F0002_IOVS"
("OBJECT_ID" NUMBER(10,0),
"CHANNEL_ID" NUMBER(10,0), "IOV_SINCE" NUMBER(20,0), "IOV_UNTIL" NUMBER(20,0),
"USER_TAG_ID" NUMBER(10,0), "SYS_INSTIME" VARCHAR2(255), "ORIGINAL_ID"
NUMBER(10,0), "NEW_HEAD_ID" NUMBER(10,0), "A" NUMBER(10,0), "B" NUMBER(10,0),
"C" NUMBER(10,0), "D" NUMBER(10,0), "E" NUMBER(10,0), "F" NUMBER(10,0), "G"
NUMBER(10,0), "H" NUMBER(10,0), "I" NUMBER(10,0), "J" NUMBER(10,0), "K"
BINARY_FLOAT, "L" BINARY_FLOAT, "M" BINARY_FLOAT, "N" BINARY_FLOAT, "O"
BINARY_FLOAT, "P" BINARY_FLOAT, "Q" BINARY_FLOAT, "R" BINARY_FLOAT, "S"
BINARY_FLOAT, "T" BINARY_FLOAT) PCTFREE 5 PCTUSED 40 INITRANS 1 MAXTRANS 255
NOCOMPRESS LOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS
2147483645 PCTINCREASE 0 BUFFER_POOL DEFAULT) TABLESPACE "ATLAS_COOL_GANCHO" PARTITION BY RANGE ("IOV_SINCE") (PARTITION "MAX_VAL" VALUES LESS THAN
(MAXVALUE) PCTFREE 5 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE(INITIAL 65536
NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1
FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "ATLAS_COOL_GANCHO" NOCOMPRESS
) ENABLE ROW MOVEMENT

To avoid this, the tablespace names of the accounts created at all Tier1s must be the same as those at CERN!
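In practice this means creating a tablespace with the matching name at each Tier1 before the DDL arrives; a minimal sketch, where the datafile path and sizes are purely illustrative:

CREATE TABLESPACE atlas_cool_gancho
  -- path and sizes below are placeholders, to be adapted per site
  DATAFILE '/path/to/datafiles/atlas_cool_gancho01.dbf' SIZE 512M AUTOEXTEND ON NEXT 256M
  EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO;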

===================
3rd November 2006 - 66 IOVs per hour
===================

I repeated the same test as yesterday, 2nd of November. I got the same picture, and the total latency was again 2 hours. The CPU usage of INTR2 was about 75%.
The LCR rate was quite stable at 1400 LCRs/sec, except for two and a half hours when the rate was 600-700 LCRs/sec. I found out that in that time window, 22:00 to 00:15, the job for collecting object statistics had been running. The relation between that job and the low LCR rate during that time window is still not clear to me.
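For reference, the LCR rate is not exposed directly as a rate; a hedged sketch of how it can be derived, by sampling the cumulative counter in the standard 10gR2 apply reader view twice and dividing the difference by the sampling interval:

SELECT apply_name, total_messages_dequeued
FROM   v$streams_apply_reader;
-- run again N seconds later: LCRs/sec = delta(total_messages_dequeued) / N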

Again_15hours_APPLY_LCRs_sec_jpg.jpg

Again_15hours_INTR2_CPU_jpg.jpg

===================
2nd November 2006 - 66 IOVs per hour
===================

The same test as above, but running for a longer time: it took 50600 seconds (~15 hours) to insert 1000 IOVs.
The LCR rate was quite stable at 1400 LCRs/sec, except for two and a half hours when the rate was 600-700 LCRs/sec. Unfortunately, over the duration of the test we accumulated 2 hours of latency. See the snapshots below.

15hours_20clients_APPLY_rate.tiff 15hours_20_clients_CAPTURE_latency.tiff

I have to repeat the test to see whether I get the same results.

===================
1st November 2006
===================

Tried a very massive load: 250 COOL folders.
20 processes running from 2 "Point 1" machines; 100 IOVs were inserted in 5700 seconds.

The highest LCR rate I have ever got: 1400 LCRs/sec! Over the duration of the test we got 2 minutes of latency.

=================
31st October 2006 - Idea for solving the APPLY latency problem
=================

After finding out that having many indexes on the COOL tables slows down the Streams APPLY process at the destinations, Florbela and I started brainstorming to find a solution.

The idea is to use 2 DB accounts for each sub-detector: one "xxx_3D" account dedicated only to the Streams flow, and another one holding materialized views on top of the tables in the "xxx_3D" schema.

COOL_replication_with_MVs.jpg

The applications will read the data from the materialized views, which will have the full set of indexes. The Streams flow will be directly between the xxx_3D accounts, whose tables will have only PKs and no other indexes. A job running at regular intervals will REFRESH the materialized views on demand.
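A minimal sketch of the idea for one table, assuming the application account is called ATLAS_COOL and using a fast-refreshable MV (all object names are illustrative):

-- on the Streams-only account: an MV log so the MV can be fast refreshed
CREATE MATERIALIZED VIEW LOG ON atlas_cool_3d.test3d_f0001_iovs WITH PRIMARY KEY;

-- on the application account: the MV plus the full set of indexes
CREATE MATERIALIZED VIEW atlas_cool.test3d_f0001_iovs
  REFRESH FAST ON DEMAND
  AS SELECT * FROM atlas_cool_3d.test3d_f0001_iovs;

CREATE INDEX test3d_f0001_iovs_csu_3indx
  ON atlas_cool.test3d_f0001_iovs (channel_id, iov_since, iov_until);

-- the regular job then calls:
BEGIN
  DBMS_MVIEW.REFRESH('ATLAS_COOL.TEST3D_F0001_IOVS', method => 'F');
END;
/

Note that the MV log itself adds some DML overhead on the xxx_3D side, so whether this wins overall would have to be measured.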

=================
30th  October 2006
=================

One very important finding was that the latency is caused by having indexes on the COOL IOV tables. All 8 APPLY servers on INTR had a lot of I/O because of the need to update the relevant indexes. The "KJC - wait for msg sends to complete" wait events we had been observing for quite a long time were masking the real problem.

After setting all indexes on the IOV tables except the PKs to UNUSABLE and running a massive insert for about 20 hours (250 folders, 10 processes inserting data, Streams rate about 550 LCRs per sec), NO LATENCY was observed - only 3-5 seconds between ATONR and INTR. The CAPTURE on ATONR was never paused for flow control.

The disabled indexes were of the following types:

xxx_IOVS_CO_2INDX based on (channel_id, object_id)
xxx_CSU_3INDX based on (channel_id, iov_since, iov_until)
xxx_SU_2INDX based on (iov_since, iov_until)
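A hedged sketch of how the indexes can be disabled at the destination (the index names follow the pattern above but are illustrative; SKIP_UNUSABLE_INDEXES defaults to TRUE in 10gR2, so DML proceeds without maintaining them):

ALTER INDEX atlas_cool_3d.test3d_f0001_iovs_co_2indx UNUSABLE;
ALTER INDEX atlas_cool_3d.test3d_f0001_iovs_csu_3indx UNUSABLE;
ALTER INDEX atlas_cool_3d.test3d_f0001_iovs_su_2indx UNUSABLE;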

========================
26th - 27th  October 2006
========================

Because of the many wait events of type 'KJC: wait for msg sends to complete' from the Streams APPLY servers, we stopped INTR1.
With only INTR2 working we do not see such wait events.
The outcome of the tests presented below is that the APPLY process keeps up with higher insert speeds when there are only primary keys on the COOL tables. If all 4 indexes are enabled, we get a lot of I/O and the latency grows substantially.
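For reference, the latency figures quoted in the tries below can be read off the apply coordinator view; a minimal sketch using the standard 10gR2 columns:

SELECT apply_name,
       (hwm_time - hwm_message_create_time) * 86400 AS latency_seconds
FROM   v$streams_apply_coordinator;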

Try 1.

500 COOL single version folders
20 processes inserting data
==========================
No wait events, but a lot of I/O; for 50 minutes of massive insertion we got 23 minutes of latency.
Avg. rate 550 LCRs/sec

Try 2.

350 COOL folders
14 processes inserting data
==================
The same results as above - 50 minutes running, 23 minutes latency
Avg. rate 500 LCRs/sec

Try 3.

250 COOL folders
10 processes inserting data
==================
90 minutes insertion - 15 minutes latency
Avg. rate 400 LCRs/sec
Inserting 40 IOVs into each of 20 folders took about 4000 sec, i.e. 5 sec per IOV per folder.
Across all clients that means 2 transactions per second.

Try 4.

Set all COOL indexes except the PKs at the destination to UNUSABLE
200 COOL folders
10 processes inserting data
=======================
No latency - only 3-5 seconds. Including 100 COOL tables with usable indexes in the test caused an immediate increase of the latency.
Avg. rate 600 LCRs/sec

Try 5 .

350 COOL tables
15 processes inserting data
=====================
30 minutes running - 7 minutes latency
Never paused for flow control, but many 'EVALUATING RULE' states
Avg. rate 600 LCRs/sec

Try 6.

SIMPLIFIED CAPTURE RULES (filter)
350 COOL tables
15 processes inserting data
=====================
80 minutes running - 3 minutes latency
Avg. rate 800 LCRs/sec

Try 7.

ATONR1 dedicated only to CAPTURE and PROPAGATION.
All COOL client sessions are on instance 2 (ATONR2). I decreased the Streams pool on ATONR1 from 640 MB to 250 MB in order to have more memory for the buffer pool (1440 MB).
350 COOL tables
15 processes inserting data
=========================
90 minutes running - No latency - only 3-5 sec
Avg. rate 800 - 900 LCRs/sec
The CPU utilization on ATONR1, where the CAPTURE is running, is about 70 %
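A hedged sketch of the Streams pool change on that instance (the parameter and the RAC per-instance syntax are standard 10gR2 with an spfile; the size is the one quoted above):

ALTER SYSTEM SET streams_pool_size = 250M SCOPE = BOTH SID = 'ATONR1';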

Try 8. Activate all COOL indexes: rebuild them and collect statistics.
8 hours of data insertion - 240 IOVs
250 COOL tables
10 processes inserting data
========================
Very big latency: after the 4th hour about 2500 seconds, and at the end it reached 9000 seconds of latency.

12_hours_test_APPLY_latency.tiff 12_hours_test_APPLY_LCRs_per_sec.tiff
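The reactivation itself is the usual rebuild-plus-statistics sequence; a minimal sketch for one index (the index name is illustrative, following the naming pattern of 30th October):

ALTER INDEX atlas_cool_3d.test3d_f0001_iovs_csu_3indx REBUILD;

BEGIN
  DBMS_STATS.GATHER_INDEX_STATS(ownname => 'ATLAS_COOL_3D',
                                indname => 'TEST3D_F0001_IOVS_CSU_3INDX');
END;
/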

=================
25th October 2006
=================

2. We have been observing very bad performance of the APPLY process on INTR. It starts to fall behind immediately after a new insertion starts. We can see the "KJC: wait for msg sends to complete" wait event for all 8 APPLY servers.

APPLY_servers_activity.tiff
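A hedged sketch of how such waits can be spotted per apply server, using the standard session wait views (the LIKE filter on the event name is the only assumption):

SELECT s.sid, s.program, w.event, w.seconds_in_wait
FROM   v$session s, v$session_wait w
WHERE  s.sid = w.sid
AND    w.event LIKE 'KJC%';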

- With a single APPLY process (no parallelism) the KJC wait event is missing. Another interesting finding is that the KJC wait event is present when both instances are up, and not with a single running instance. As Florbela pointed out, this might be Oracle bug 4894027.
Its description from Metalink is the following:
"In a RAC environment Logical Standby or Streams apply can show excessive waits on 'rdbms ipc message' when there is workload to apply"
That will be fixed in the next Oracle patch set, 10.2.0.3. Hopefully it will be released soon and will contribute to fixing the problem.

=================
20th October 2006
=================

1. COOL insertion performance
The first impression is that COOL is not performant at data insertion. On average, storing 1 IOV per folder takes about 5 sec. With 10 clients inserting into 250 folders (each client dealing with 25 folders), inserting one IOV into every folder takes about 2 minutes.
What I found is that COOL issues 3 SQL statements per channel in order to close the IOV, which is quite inefficient.
Here is an extract from the COOL client log
------------------------------------------------------

Prepared statement : "SELECT MAX("OBJECT_ID")
FROM ATLAS_COOL_3D."TEST3D_F0001_IOVS" "TEST3D_F0001_IOVS"
WHERE "CHANNEL_ID"= :"channel""
Fetches object row from table TEST3D_F0001_IOVS for object_id = xxx
Prepared statement : "SELECT "OBJECT_ID",
"CHANNEL_ID",
"IOV_SINCE",
"IOV_UNTIL",
"USER_TAG_ID",
"SYS_INSTIME",
"ORIGINAL_ID",
"NEW_HEAD_ID",
"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T"
FROM ATLAS_COOL_3D."TEST3D_F0001_IOVS" "TEST3D_F0001_IOVS"
WHERE "OBJECT_ID"= :"objectId""
Fetches object row from table TEST3D_F0001_IOVS for object_id = xxx

Update IOV in table row for object 136601
Prepared statement : "UPDATE ATLAS_COOL_3D."TEST3D_F0001_IOVS"
SET "IOV_UNTIL"= :"until"
WHERE "OBJECT_ID"= :"objId""

These 3 statements are issued 200 times, once for each of the 200 channels.

[SUGGESTION]

A more elegant solution is to replace all these 400 SELECTs with only one:

SELECT object_id, channel_id, iov_since, iov_until
FROM atlas_cool_3d.test3d_f0014_iovs
WHERE object_id IN
( SELECT MAX(object_id) FROM atlas_cool_3d.test3d_f0014_iovs GROUP BY channel_id)

If the goal is to set the same IOV_UNTIL for all channels (an oversimplified case which may not correspond to the COOL logic),
then all 600 statements can be replaced by a single one, as follows.

UPDATE atlas_cool_3d.test3d_f0014_iovs SET iov_until = xxx
WHERE (object_id, channel_id) IN
( SELECT object_id, channel_id
FROM atlas_cool_3d.test3d_f0014_iovs
WHERE object_id IN ( SELECT MAX(object_id) FROM atlas_cool_3d.test3d_f0014_iovs GROUP BY channel_id))
[SUGGESTION END]

We were surprised to hear that Persistency does not support the GROUP BY operator, but they promised to add it to their API within the next 2 weeks.

Sites Connected

Sites to join

Topic attachments

12_hours_test_APPLY_LCRs_per_sec.tiff (193.6 K, 2006-10-30, GanchoDimitrov)
12_hours_test_APPLY_latency.tiff (76.7 K, 2006-10-30, GanchoDimitrov)
15hours_20_clients_CAPTURE_latency.tiff (28.9 K, 2006-11-02, GanchoDimitrov)
15hours_20clients_APPLY_rate.tiff (58.4 K, 2006-11-02, GanchoDimitrov)
APPLY_servers_activity.tiff (224.8 K, 2006-10-29, GanchoDimitrov) - The KJC wait events, represented in pink
Again_15hours_APPLY_LCRs_sec_jpg.jpg (159.7 K, 2006-11-05, GanchoDimitrov)
Again_15hours_INTR2_CPU_jpg.jpg (151.1 K, 2006-11-05, GanchoDimitrov)
COOL_replication_with_MVs.jpg (112.6 K, 2006-11-05, GanchoDimitrov)
COOL_replication_with_MVs.pdf (38.8 K, 2006-11-02, GanchoDimitrov)
gancho.jpg (12.1 K, 2006-11-05, GanchoDimitrov)