
COOL Performance Tests

A performance test suite has been developed over time to test and validate the performance of COOL queries against Oracle database servers.

This test suite can be useful, for instance, to rule out performance regressions when moving to a more recent version of the COOL client software or of the Oracle server software (e.g. for the port to Oracle 11g servers described in task #23366 and, more recently, for the move to Oracle 12c servers described in task #44885).

The most recent version of the test suite (built around the script createReport.sh) systematically tests the most important use cases for retrieving conditions data using COOL. The use cases that were considered in the 10g to 11g migration are the following:

  • SV_R. Single version retrieval (with inline payload).
  • SP_R. Single version retrieval (with a separate payload table).
  • MVUR. Multi version retrieval from a user tag (with inline payload).
  • MPUR. Multi version retrieval from a user tag (with a separate payload table).
  • MVHR. Multi version retrieval from the HEAD (with inline payload).
  • MPHR. Multi version retrieval from the HEAD (with a separate payload table).
  • MVTR. Multi version retrieval from a normal tag (with inline payload).
  • MPTR. Multi version retrieval from a normal tag (with a separate payload table).
  • SC_R. Single version retrieval (with inline CLOB payload).

As described in the proceedings of the NSS 2008 conference in Dresden, the test suite aims at validating two specific aspects of COOL query performance:

  • Scalability. Rather than testing query times in absolute terms, the test suite focuses on estimating the scalability of COOL query performance as the data volumes increase. The basic idea is that the retrieval of IOVs valid in a given time range should be optimized so that query response time is not only as low as possible, but it also remains essentially constant when retrieving IOVs from very old or very recent time ranges. In each test, query times are measured for different values of the validity time T around which IOVs are retrieved; these query times are then plotted as a function of T.
  • Stability of query execution plans. Having identified statistics reliability and bind variable peeking as the two main factors for SQL query stability, their effect is now systematically studied in all COOL performance tests. For each test, query times are measured and plotted as a function of T in six different configurations: for two scenarios of bind variable peeking (low or high values of T) and for three scenarios for statistics (reliable statistics, unreliable statistics computed on empty tables, no statistics). Generally speaking, good performance can be expected 'out-of-the-box' (without adding optimizer hints) only for the scenario with reliable statistics and high values of peeked bind variables, while the other 5 cases can all lead to non-scalable execution plans. The test suite produces two such sets of six curves, both without hints (out of the box from Oracle) and with hints (the COOL mechanism for stabilizing the execution plans). If all is ok, the six curves with hints should all be flat.
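The measurement loop described above (two bind-peeking scenarios times three statistics scenarios, query time plotted as a function of the validity time T) can be sketched as follows. This is a minimal Python sketch with synthetic data: `measure_query_time` is a hypothetical stand-in for a timed COOL retrieval, not a function of the real test suite.

```python
import itertools

# The six configurations: 2 bind-peeking scenarios x 3 statistics scenarios
PEEKING = ["peeklo", "peekhi"]              # plan compiled with low or high T
STATISTICS = ["reliable", "empty", "none"]  # table statistics scenarios

def measure_query_time(config, t):
    """Hypothetical stand-in for a timed COOL IOV retrieval at validity time t.

    In the real test suite this would time a query against an Oracle server;
    here it just simulates scalable (flat) vs non-scalable (growing) behaviour.
    """
    peeking, stats = config
    scalable = (peeking == "peekhi" and stats == "reliable")
    return 1.0 if scalable else 1.0 + 0.001 * t  # seconds (synthetic)

# Measure query time as a function of the validity time T in each configuration
curves = {}
for config in itertools.product(PEEKING, STATISTICS):
    curves[config] = [(t, measure_query_time(config, t))
                      for t in range(0, 100000, 10000)]

# Only the favorable scenario is expected to give a flat curve out of the box
times = [q for _, q in curves[("peekhi", "reliable")]]
assert max(times) - min(times) < 0.01
```

Each of the six curves is then plotted against T; in the real suite the same twelve measurements are repeated with and without COOL hints.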

The test suite is also designed to ensure the reproducibility of test results. When creating the full performance report, test tables are created and populated from scratch before query times are measured. When debugging a specific problem observed in the test suite, it is however more efficient to create the test data only once and then concentrate only on testing query times. A few more issues have been observed in the past to lead to confusing results and should be considered when interpreting surprising results (some of these are now taken care of inside the test suite and/or inside the COOL code).

  • Oracle cache and I/O. The script by default launches a dry run of the test (in the 6 scenarios without hints, which need to scan more data) to fill all relevant data caches before executing the 12 reference tests (6 with hints and 6 without hints). This should remove all I/O effects and ensure a better reproducibility of the measured CPU performance in the tests.
  • Load on the test server and spurious processes. If possible, execute the tests on a server with negligible load from other database users. If the plots you obtain still show some amount of randomness, check the CPU load on the server; in the past, one such observation was traced to an Oracle Enterprise Manager process running wild, which had to be killed and restarted.
  • Overhead from creating the 10053 trace file. In each of the 12 scenarios, the test forces Oracle to execute a hard parse (to recompute the appropriate execution plan) and produce a 10053 trace file. This is achieved by executing a dummy DDL statement on the relevant tables. This hard parse is time consuming and causes the test client to observe a very long retrieval time. To be independent of this overhead, this first test is discarded from the plots.
  • Automatic computation of statistics. When executing the script for the first time to create the test databases, the statistics of all relevant tables are now locked by the test script. This should ensure that the reference tables with no statistics or bad statistics permanently remain in that state, even if statistics are automatically collected during the night every 24h.
  • Oracle 11g SQL plan baselines. The Oracle 11g "SQL plan baseline" feature is disabled in CORAL, and a fortiori in COOL, by default (by issuing the relevant "alter session" statement at logon in CORAL), because it was found to lead to unstable and unexpected execution plans during the tests.
  • Oracle 11g adaptive cursor sharing. The Oracle 11g "adaptive cursor sharing" feature is disabled in COOL for the main IOV queries (by adding a NO_BIND_AWARE hint), because it was found to lead to unstable and unexpected execution plans during the tests.
  • Oracle 11g cardinality feedback. The Oracle 11g "cardinality feedback" feature was briefly tested in May 2012 (well after the validation of COOL performance on 11g servers) because it was suspected to be the cause of 11g server memory fragmentation leading to ORA-04031 errors in the COOL and CORAL nightlies (see bug #94270). However, it was eventually decided to keep this feature untouched (i.e. enabled by default in 11g) in CORAL and COOL.
  • Oracle 11g and 12c dynamic sampling. The Oracle 11g "dynamic sampling" feature has been disabled by default in the CORAL and COOL functional test suite using qmtest in May 2012 (well after the validation of COOL performance on 11g servers) because it was found to be the cause of 11g server memory fragmentation, through extra queries with OPT_DYN_SAMP hints, leading to ORA-04031 errors in the COOL and CORAL nightlies (see bug #94270). However this feature is only disabled for qmtest and is kept enabled by default for all other use cases, including the performance tests. In 12c, in particular, it was found in August 2013 that it is better to keep it untouched (i.e. enabled by default) for the performance tests (task #44885).
  • Oracle 12c adaptive optimization. The Oracle 12c "adaptive optimization" feature has been disabled by default in CORAL since August 2013, during the COOL query performance validation on Oracle 12c (bug #102272). This feature was causing extra queries with OPT_ESTIMATE hints to appear in the trace files of the COOL performance test suite, and was also observed to lead to worse performance in the absence of COOL hints.
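Several of the caveats above boil down to session-level Oracle settings applied at logon or during schema preparation. The sketch below illustrates the kind of statements involved; the statement texts follow documented Oracle syntax, but the exact set issued by CORAL/COOL should be checked against the tasks referenced above. The executor is a stub (no database needed), and the schema and table names are hypothetical.

```python
# Statements of the kind discussed above (illustrative; the exact set used
# by CORAL/COOL is defined in the tasks and bug reports referenced in the text).
LOGON_STATEMENTS = [
    # Disable 11g SQL plan baselines (issued by CORAL at logon)
    "ALTER SESSION SET optimizer_use_sql_plan_baselines = FALSE",
    # Produce a 10053 optimizer trace for the next hard parse; in the test
    # suite the hard parse itself is triggered by a dummy DDL statement on
    # the relevant tables (not shown here)
    "ALTER SESSION SET EVENTS '10053 trace name context forever, level 1'",
]

def lock_stats_statement(owner, table):
    """Lock the statistics of a reference table, so that the nightly
    automatic statistics job leaves the 'no statistics' and 'bad statistics'
    scenarios permanently in that state."""
    return ("BEGIN DBMS_STATS.LOCK_TABLE_STATS"
            f"('{owner}', '{table}'); END;")

class StubSession:
    """Stub standing in for a real CORAL/Oracle session."""
    def __init__(self):
        self.executed = []
    def execute(self, sql):
        self.executed.append(sql)

session = StubSession()
for stmt in LOGON_STATEMENTS:
    session.execute(stmt)
# Hypothetical schema and table names, for illustration only
session.execute(lock_stats_statement("COOL_TEST", "SV_R_IOVS"))
```

Note that adaptive cursor sharing is disabled differently, via a NO_BIND_AWARE hint inside the main IOV queries themselves rather than via an ALTER SESSION statement.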

Performance validation for the Oracle 10g to 11g migration

The test script has been significantly improved to produce an automated performance report. This shows the numerical results of the tests on a plot, as well as selected information extracted from the 10053 trace file. The following are the plots produced in December 2011 during the comparison of COOL performance on Oracle 10g and Oracle 11g (see task #23366).

For each use case, there are essentially only three points that need to be checked:

  • The 6 plots with hints should display a constant retrieval time. This ensures scalability of COOL performance as the tables get bigger. This must be visually inspected (there is no automatic fit for the data in the plots).
  • The 6 trace files with hints should all use the same execution plan. This is displayed in the summary table for each use case. The relevant information is surrounded by a green box if all is ok. Note that this is correlated with the observation of the plots: if the execution plans are different, you may expect that one of the plots will display a different pattern, most likely with an increasing query time (non-scalable behaviour).
  • There should be no unused hints in the 6 trace files with hints. If this should happen (quite unlikely), it would be an indication that the COOL hints need to be changed because they are no longer appropriate.
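The flatness check in the first point above is done by eye (there is no automatic fit in the report), but a simple automatic check along the same lines could look like the sketch below. The 5% tolerance is an arbitrary choice for illustration, not a value from the test suite.

```python
def is_flat(curve, tolerance=0.05):
    """Return True if the query-time curve is constant within a relative
    tolerance, i.e. if retrieval time does not grow with the validity time T.

    curve: list of (T, query_time) pairs, as plotted in the report.
    """
    times = [q for _, q in curve]
    lo, hi = min(times), max(times)
    return (hi - lo) <= tolerance * lo

# A scalable curve (as expected with hints) vs a non-scalable one
# (a bad execution plan, as may be observed without hints)
flat_curve = [(t, 1.0) for t in range(0, 100, 10)]
rising_curve = [(t, 1.0 + 0.05 * t) for t in range(0, 100, 10)]

print(is_flat(flat_curve))    # expected: True
print(is_flat(rising_curve))  # expected: False
```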

The following are the reports for the SV_R use case only, comparing Oracle 10g (10.2.0.5) with two different versions of Oracle 11g (11.2.0.2 and 11.2.0.3). The report for 11.2.0.2 clearly shows some problems, which are due to Oracle bug 10405897, as described in task #23366.

  • SV_R-10.2.0.5-full.pdf: Performance report for the SV_R use case, against Oracle 10.2.0.5. Full version (all execution plans).
  • SV_R-11.2.0.2-full.pdf: Performance report for the SV_R use case, against Oracle 11.2.0.2. Full version (all execution plans).
  • SV_R-11.2.0.3-full.pdf: Performance report for the SV_R use case, against Oracle 11.2.0.3. Full version (all execution plans).

The following are the reports for all 9 use cases, comparing the reference versions of Oracle 10g (10.2.0.5) and 11g (11.2.0.3). Two versions of each report are attached: a short one showing only the best execution plan (the one used by the 6 scenarios with hints) for each use case, and the full one showing all execution plans for each use case (i.e. all the different bad execution plans which would be observed if there were no hints in COOL).

  • ALL-10.2.0.5.pdf: Performance report for all 9 use cases, against Oracle 10.2.0.5. Short version (only the best execution plan for each use case).
  • ALL-10.2.0.5-full.pdf: Performance report for all 9 use cases, against Oracle 10.2.0.5. Full version (all execution plans for each use case).
  • ALL-11.2.0.3.pdf: Performance report for all 9 use cases, against Oracle 11.2.0.3. Short version (only the best execution plan for each use case).
  • ALL-11.2.0.3-full.pdf: Performance report for all 9 use cases, against Oracle 11.2.0.3. Full version (all execution plans for each use case).

Note that for each use case, the best execution plan is different in 10g and 11g, even if both Oracle versions lead to the same good performance in the plots. I actually believe that the core of the algorithm is the same in 10g and 11g (based on the FIRST ROW and INDEX RANGE SCAN (MIN/MAX) steps within the correlated subquery), whereas what differs in 11g from 10g are only minor details of the execution plans, or even only the way the same plan is described in the plan table. I find the execution plan description in 11g more understandable than the one in 10g: in my opinion only one MIN/MAX scan is needed in the query, as described in the 11g plan, because there is only one MAX clause (in the correlated subquery), whereas the plan table description in 10g mentions two separate MIN/MAX scans.

Performance validation for the Oracle 11g to 12c migration

After a few modifications to the test script and CORAL/COOL code (see task #44885), the performance report has also been produced for Oracle 12c servers.

The following are three reports for the SV_R use case only, comparing Oracle 12c (12.1.0.1) with and without adaptive optimization enabled. Two reports are listed with adaptive optimization enabled: a new report was uploaded because the reporting tool was slightly modified in the meantime (to mark clearly in the report that adaptive optimization is enabled, and to highlight in red any unexpected bind variable values); the older report was also kept because its results were slightly different, which is an indication that adaptive optimization may lead to results that are not clearly reproducible.

  • SV_R-12.1.0.1-full.pdf: Performance report for the SV_R use case, against Oracle 12.1.0.1. Full version (all execution plans).
  • SV_R-12.1.0.1-withAdaptiveOptimization-full.pdf: Performance report for the SV_R use case, against Oracle 12.1.0.1 with adaptive optimization enabled. Full version (all execution plans). Most recent test (22 Aug 2013).
  • SV_R-12.1.0.1-withAdaptiveOptimization-full-old.pdf: Performance report for the SV_R use case, against Oracle 12.1.0.1 with adaptive optimization enabled. Full version (all execution plans). Earlier test (16 Aug 2013).

The conclusions from these three reports are the following:
  • Performance for SV_R on 12c is always good when hints (those developed on 10g and 11g) are provided, independently of statistics and bind variables.
  • When hints are not provided, performance is also good (Oracle finds good execution plans on its own) if statistics are up to date and bind variable peeking is favorable, as long as adaptive optimization is disabled.
  • Without hints, performance was always bad in the first test with adaptive optimization enabled, even with good statistics and favorable bind variables. This is a first solid argument for disabling adaptive optimization by default in CORAL (it can be reenabled using an environment variable, as done to produce these plots).
  • In particular, the reason why performance is bad with adaptive optimization may be that, as seen in the table, adaptive optimization seems to choose its execution plan using "low" (unfavorable) values of the bind variables even if the query is executed with "high" (favorable) bind values. In the second report this is explicitly highlighted with a red box when the "peekhi" and "peeklo" queries use the same bind variable values. Remember also that with adaptive optimization the trace file contains two SQL queries: the first one is the standard parse with the correct bind variables, while the second one is the OPT_ESTIMATE adaptive optimization query, which may use incorrect values of the bind variables.
  • The second test with adaptive optimization gave different results from the first test. Without hints, some of the peekhi queries did use the correct "high" bind values, finding the correct execution plan, but one did not. In other words, adaptive optimization seems to decrease the reproducibility and predictability of query behaviour. This is a second solid argument for disabling this feature by default in CORAL. Note that the report also highlights with red boxes values of the bind variables which are different within each set of files that should use the same "peekhi" or "peeklo" values.
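The bind-variable consistency check mentioned above (the red boxes in the report) amounts to comparing the bind values recorded in each 10053 trace file against the values expected for its scenario. A sketch of that comparison (trace-file parsing omitted; the scenario keys and bind values are synthetic, for illustration only):

```python
# Each trace file is expected to use the bind values of its own scenario:
# 'peekhi' traces should peek high T values, 'peeklo' traces low ones.
EXPECTED = {"peekhi": "high", "peeklo": "low"}

def find_bind_anomalies(traces):
    """Return the scenarios whose traces peeked unexpected bind values.

    traces: dict mapping (peeking scenario, statistics scenario) to the
    peeked bind value class, as extracted from the 10053 trace files.
    """
    return sorted(key for key, peeked in traces.items()
                  if peeked != EXPECTED[key[0]])

# Synthetic example reproducing the behaviour described above: with adaptive
# optimization, a 'peekhi' query may be re-optimized with 'low' bind values.
traces = {
    ("peekhi", "reliable"): "low",   # anomaly: would be boxed in red
    ("peekhi", "none"): "high",
    ("peeklo", "reliable"): "low",
}
print(find_bind_anomalies(traces))  # expected: [('peekhi', 'reliable')]
```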

These tests were then repeated for all 9 use cases on Oracle 12c (12.1.0.1), leading to the following reports. Only the full version of each report is attached, showing all execution plans for each use case (i.e. all the different bad execution plans which would be observed if there were no hints in COOL).

  • ALL-12.1.0.1-full.pdf: Performance report for all 9 use cases, against Oracle 12.1.0.1. Full version (all execution plans for each use case).
  • ALL-12.1.0.1-withAdaptiveOptimization-full.pdf: Performance report for all 9 use cases, against Oracle 12.1.0.1 with adaptive optimization enabled. Full version (all execution plans for each use case).

The conclusions from these two reports are the following:

The reports indicate that the COOL hints developed for Oracle 10g seem to work well and ensure adequate performance also on Oracle 12c. If Oracle 12c adaptive optimization is disabled, good execution plans are also found without hints in the favorable scenario (reliable statistics and high peeked bind variables). It is observed however that these plans are in some cases slightly different from those enforced by the COOL hints: this indicates that it may be appropriate to review and update the COOL query hints for Oracle 12c.

For the MVUR use case, without hints Oracle was initially suggesting a plan almost identical to that triggered by the COOL hints, except that a simpler 3-D index was used instead of the hinted 5-D index. It was understood that this is due to the presence of too few user tags in the test schema, making the two plans essentially equivalent. This was fixed by improving the test schema, adding several more user tags (see task #44885 and lcgcool:18643).

-- AndreaValassi - 22-Aug-2013

Topic attachments
  • ALL-10.2.0.5-full.pdf (1970.0 K, 2011-12-15, AndreaValassi): Performance report for all 9 use cases, against Oracle 10.2.0.5. Full version (all execution plans for each use case).
  • ALL-10.2.0.5.pdf (1855.8 K, 2011-12-15, AndreaValassi): Performance report for all 9 use cases, against Oracle 10.2.0.5. Short version (only the best execution plan for each use case).
  • ALL-11.2.0.3-full.pdf (2006.6 K, 2011-12-15, AndreaValassi): Performance report for all 9 use cases, against Oracle 11.2.0.3. Full version (all execution plans for each use case).
  • ALL-11.2.0.3.pdf (1889.9 K, 2011-12-15, AndreaValassi): Performance report for all 9 use cases, against Oracle 11.2.0.3. Short version (only the best execution plan for each use case).
  • ALL-12.1.0.1-full.pdf (1397.4 K, 2013-08-22, AndreaValassi): Performance report for all 9 use cases, against Oracle 12.1.0.1. Full version (all execution plans for each use case).
  • ALL-12.1.0.1-withAdaptiveOptimization-full.pdf (1352.2 K, 2013-08-22, AndreaValassi): Performance report for all 9 use cases, against Oracle 12.1.0.1 with adaptive optimization enabled. Full version (all execution plans for each use case).
  • SV_R-10.2.0.5-full.pdf (264.2 K, 2011-12-15, AndreaValassi): Performance report for the SV_R use case, against Oracle 10.2.0.5. Full version (all execution plans).
  • SV_R-11.2.0.2-full.pdf (297.3 K, 2011-12-15, AndreaValassi): Performance report for the SV_R use case, against Oracle 11.2.0.2. Full version (all execution plans).
  • SV_R-11.2.0.3-full.pdf (243.5 K, 2011-12-15, AndreaValassi): Performance report for the SV_R use case, against Oracle 11.2.0.3. Full version (all execution plans).
  • SV_R-12.1.0.1-full.pdf (200.1 K, 2013-08-16, AndreaValassi): Performance report for the SV_R use case, against Oracle 12.1.0.1. Full version (all execution plans).
  • SV_R-12.1.0.1-withAdaptiveOptimization-full-old.pdf (190.4 K, 2013-08-23, AndreaValassi): Performance report for the SV_R use case, against Oracle 12.1.0.1 with adaptive optimization enabled. Full version (all execution plans). Earlier test (16 Aug 2013).
  • SV_R-12.1.0.1-withAdaptiveOptimization-full.pdf (210.1 K, 2013-08-22, AndreaValassi): Performance report for the SV_R use case, against Oracle 12.1.0.1 with adaptive optimization enabled. Full version (all execution plans). Most recent test (22 Aug 2013).
Topic revision: r18 - 2013-08-24 - AndreaValassi