Investigations on a Relational Schema based Information System

This page is intented to describe the investigations which have been done for switching the current LDAP based data model of the EGEE Information System to a relational data model.

The target is to compare the both systems in terms of scalability and request handling speed.

The results will influence the future EGEE Grid Information System developments.

Transformation of the existing information

The information model for the BDII is currently (Sept. 2007) only defined for the LDAP protocol. Hence, it needs to be transformed from its hierachical structure to a 'flat' relational one. For this, each entry had to be parsed, its content mapped to tables and subsequently stored in a relational database. This mapping of the data is the most difficult part since it determines how the data is stored later on and therefore also determines the readout speed when the data is requested.

To start with this, the R-GMA configuration file (gin.conf) has been modified and adapted to the GLUE 1.3 schema (however, it is NOT compatible anymore with R-GMA). For the time being it is not necessary to create for each objectclass in the LDAP protocol a table. A LDIF entry is mapped to a table and its attributes mapped to its columns.

The multi valued attributes in some entries need to be stored in a seperate tables to normalise the database structure as much as possible. A ValueTable and TypeTable have been therefore introduced. The type (name) of the attribute is stored in the TypeTable, the value in the ValueTable. The ValueTable keeps a foreign key relation to the TypeTable to hold the information what type this value is. Again, it references by an ID to a parent in the other tables (GlueCE, GlueSE, etc.) From the query investigations from SlapdIndexedDB it didn't seem to be fisible to store the parent type table with the value.

Difficulties:

  • hierachial structure to 'flat' relational structure
  • UniqueIDs are sometimes not given
  • Key referrals are not always consistent with related entries ( A entry refers with a ForeignKey to another entry, but with wrong key)
Below you can find a simplified schema, which shows the ValueTable, TypeTable and parent entry relationship:

schema_simplified.png

(please see a picture of the full schema attached)

The Tool

To do the transformation a perl script has been developed. Its source and documentation can be found in the glite cvs directory.

It reads a LDIF file in LDAP V3 format, tries to map the entries and attributes to tables and columns, fills the values into a hash arrays and finally inserts the data into the database. With this a structure validation tool is available too (however, it does NOT check the content, this is left to the infromation providers). For each problem a message is printed with the cause. If written to a file these messages can be summarised using another script. An output would look like this :

Stats:
=====================
Parsed LDIF entries  :  32376   (13.304646 seconds)
Inserted DB records  :  109892  (793.783362 seconds)
Refused LDIF entries :  2
Ignored values       :  4024

Error Classification:
=====================
      1 ORA-12899: value too large for column "LCG_FELIXEHM"."GLUELOCATION"."UNIQUEID" (actual: 265, maximum: 255) (DBD ERROR: error possibly near <*> indicator at char 99 in 'INSERT INTO GlueLocation (ID, LocalID
, Name, Path, UniqueID, Version) VALUES (:p1, :p2, :p3, :p4, :<*>p5, :p6)')
      1 ORA-12899: value too large for column "LCG_FELIXEHM"."GLUELOCATION"."NAME" (actual: 115, maximum: 100) (DBD ERROR: error possibly near <*> indicator at char 89 in 'INSERT INTO GlueLocation (ID, LocalID, Na
me, Path, UniqueID, Version) VALUES (:p1, :p2, :<*>p3, :p4, :p5, :p6)')

Duplicated definition of the same attribute within one entry:
=============================================================
   1053 GlueCEUniqueID
   1053 Accesspoint
    301 Status
    288 GlueSEUniqueID
    160 Type
    115 Port
    114 Version
    111 Endpoint
     48 Path
     47 Root
     47 Quota
     47 MinFileSize
     47 MaxPinDuration
     47 MaxNumFiles
     47 MaxFileSize
     47 MaxData
     47 FileLifeTime
     44 FreeJobSlots
     36 TotalCPUs
     35 LRMSVersion
     34 WaitingJobs
     34 TotalJobs
     34 RunningJobs
     22 Priority
     22 AssignedJobSlots
     21 LRMSType
     15 UsedSpace
     15 AvailableSpace
     14 MaxWallClockTime
     14 MaxTotalJobs
     14 MaxRunningJobs
     14 MaxCPUTime
     14 FreeCpus
      4 DefaultSE
      4 ApplicationDir
      3 Name
      3 InformationServiceURL
      2 SizeTotal
      2 SizeFree
      2 LocalID
      2 Architecture
      1 PhysicalCPUs

http://glite.cvs.cern.ch/cgi-bin/glite.cgi/glite-info-tools/ValidationAndDBInsert/

Some handmade checks

Find all SEs where the GlueSEUniqueID is not unique:
ldapsearch -x -LLL -h lcg-bdii:2170 -b o=grid objectClass=GlueSE GlueSEUniqueID | grep '^Glue*' | sort | uniq -c | awk '$1 > 1 {print}'
Show, which attributes are published per objectClass in the Information System
# Example for a GlueSA entry:
ldapsearch -x -h lxn1185:2170 -b o=grid objectClass=GlueSA | grep '^Glue.*' |sed 's,\(.*\): .*,\1,' | sort | uniq

Insertion

Below is an output of the tool, when processing a full dump from lcg-bdii and inserting into a local MySQL database.
===>  GlueCESEBindGroup : 6050 values in 3025 rows took 0.987007sec 
===>  GlueSEControlProtocol : 2052 values in 342 rows took 0.158509sec 
===>  GlueVOInfo : 60 values in 12 rows took 0.005876sec 
===>  TypeTable : 42 values in 21 rows took 0.006758sec 
===>  GlueSubCluster : 7881 values in 375 rows took 0.32402sec 
===>  GlueLocation : 48606 values in 8102 rows took 4.032659sec 
===>  GlueSA : 52093 values in 3480 rows took 2.353877sec 
===>  GlueSEAccessProtocol : 6758 values in 978 rows took 0.485846sec 
===>  GlueCluster : 1477 values in 371 rows took 0.151308sec 
===>  GlueCESEBind : 20908 values in 5243 rows took 2.191324sec 
===>  GlueSL : 20 values in 5 rows took 0.002251sec 
===>  GlueCEVOView : 53208 values in 4771 rows took 3.031827sec 
===>  GlueCE : 81774 values in 3012 rows took 2.980748sec 
===>  GlueSE : 2680 values in 340 rows took 0.169503sec 
===>  GlueSite : 2892 values in 263 rows took 0.155399sec 
===>  GlueService : 16377 values in 1585 rows took 1.067189sec 
===>  ValueTable : 246279 values in 82093 rows took 34.992386sec 
Parser : Parsed 33968 entries in 46.335935 sec.
SQL    : Inserted 114018 rows in 53.096487 sec.

The Tests

The next step is to check the speed of the relational DB with the obove mentioned schema and to compare it with the OpenLDAP tests which have been done before. In respect of possible future use it was also interesting which implementation of the client would be the fastest. PERL, Python and a C++ (based on the Persistency framework http://pool.cern.ch/coral/) were compared. All three implementation models give us the possiblity of changing the underlying DB architecture without major code changes. MySQL and Oracle as the most common relational databases were used as candidates for this 'Client Implementation Speed Test'.

In the second part (Full Test) we use the cognitions from the Client test run the testsuite () against the database instances.

Client Implementation Speed Test

All three client issued the same SQL query to the database:
SELECT * FROM GlueSA, ValueTable, TypeTable 
WHERE VALUETABLE.PARENTID=GLUESA.ID 
  AND VALUETABLE.TYPEID = TYPETABLE.TYPEID 
  AND TYPETABLE.NAME ='GlueSAAccessControlBaseRule' 
  AND VALUETABLE.VALUE='lhcb'
The corresponding LDAP query:
ldapsearch -x -LLL -h lcg-bdii:2170 -b o=grid '(&(objectClass=GlueSA)(GlueSAAccessControlBaseRule=lhcb))'

("Find the SA which provides a certain feature")

Oracle Dev DB
ClientComparison.png

Requests going through Persistency take much more time than Perl or Python. This is due to the fact that Persistency loads the required libraries first. The graph shows how much time is spent if the SEAL, BOOST, Xerces and other libs are loaded from AFS in comparison if those are loaded locally from the machine. All client implementations use the Oracle libs available from AFS. Still, Perl and Python are around 50% faster than CORAL. The LDAP query result time in the graph cannot really be compared to the previous mentioned implementations since the Oracle DB instance is only a developer one and the LDAP query had been issued against the official load balanced BDII with high performance machines.

MySQL 5 DB
ClientComparison_mixed.png

The graph obove shows the client execution time results using a MySQL (version 5.0.45) database and a native LDAP client. The OpenLDAP (version 2.2.13) and MySQL server run on the same hardware (SLC 4.5, Xeon 2.4Ghz, 1GB Memory, http://lemonweb.cern.ch/lemon-status/info.php?host=lxn1185 ).

The CORAL, Python and PERL client execution times are very close and make nearly not difference within each other. Still, the OpenLDAP native client is the fastest. If the same LDAP query is made through a PERL script the result is around 25 times and the Python version about 4 times slower than the native client written in C.

Full Test

This section shows the result from a full test run. In detail this means:
  • 9 Clients issuing requests through PERL DBI
  • 10 parallel requests / machine => 90 in total
  • relational and LDAP DB contain the same information
  • burst length 60 seconds
  • Oracle Client libs loaded from AFS
  • MySQL libraries loaded locally

Please note : Previous tests used different LDAP queries. Therefore the graphs for the OpenLDAP test shown below look different than the ones show in SlapdDualCoreTest . Also the number of parallel requests have been changed from max 60 (before) to 90.

SQL Queries:
1. select * from GlueCluster, ValueTable where GlueCluster.uniqueid = 'obsauvergridce01.univ-bpclermont.fr' and ValueTable.parentid = GlueCluster.ID
2. select * from GlueSE,ValueTable where uniqueid = 'lcgrid.dnp.fmph.uniba.sk' and ValueTable.parentid = GlueSE.ID
3. select * from GlueService,ValueTable where uri = 'httpg://dpm.cyf-kr.edu.pl:8443/srm/managerv1' and type='srm_v1' and ValueTable.parentid = GlueService.id
4. select * from GlueSA, ValueTable, TypeTable where ValueTable.parentid=GlueSA.ID and ValueTable.typeid = TypeTable.typeid and TypeTable.name='GlueSAAccessControlBaseRule' and ValueTable.value='lhcb'
5. select * from GlueCE, ValueTable, TypeTable where GlueCE.id = ValueTable.parentid and ValueTable.typeid = TypeTable.typeid  and TypeTable.name = 'GlueCEAccessControlBaseRule' and ValueTable.value='VO:dteam'
6. select * from GlueCESEBind
7. select * from GlueSubCluster
8. select * from GlueService, ValueTable, TypeTable where GlueService.id = ValueTable.parentid and ValueTable.typeid = TypeTable.typeid  and TypeTable.name = 'GlueServiceAccessControlRule' and ValueTable.value='dteam'
9.select * from GlueCEVOView, ValueTable, TypeTable where GlueCEVOView.id = ValueTable.parentid and ValueTable.typeid = TypeTable.typeid  and TypeTable.name = 'GlueCEAccessControlBaseRule' and ValueTable.value = 'VO:unosat' and GlueCEUniqueID = 'ce123.cern.ch:2119/jobmanager-lcglsf-grid_unosat'
LDAP Queries
1. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(objectClass=GlueCluster)(GlueClusterUniqueID=obsauvergridce01.univ-bpclermont.fr))
2. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(objectClass=GlueSE)(GlueSEUniqueID=lcgrid.dnp.fmph.uniba.sk))
3. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(&(objectClass=GlueService)(GlueServiceURI=httpg://dpm.cyf-kr.edu.pl:8443/srm/managerv1))(GlueServiceType=srm_v1))
4. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(objectClass=GlueSA)(GlueSAAccessControlBaseRule=lhcb))
5. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(objectClass=GlueCE)(GlueCEAccessControlBaseRule=VO:dteam))
6. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(objectClass=GlueCESEBind))
7. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(objectClass=GlueSubCluster))
8. /usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(GlueServiceType=*)(GlueServiceAccessControlRule=dteam))
9./usr/bin/ldapsearch -x -l 15 -h lxn1185 -b o=grid (&(&(objectClass=GlueVOView)(GlueChunkKey=GlueCEUniqueID=ce123.cern.ch:2119/jobmanager-lcglsf-grid_unosat)(GlueCEAccessControlBaseRule=VO:unosat)))

MySQL 4.1.2

MULTI_MYSQL4_9C_10T_TimeThreads.png
MySQL with memory stored tables

MySQL offers the possiblity to store tables in memory which has the advantage of being very fast. However, since this is limited to the system's available memory huge tables cannot be stored. Here, all tables EXCEPT the ValueTable (100K rows) were stored in memory. The disadvantage is that whenever the server is stopped the data (not the schema) is lost.

MULTI_MYSQL4_9C_10T_IN_MEMORY_TimeThreads.png

Oracle 10.2

MULTI_ORACLE_9C_10T_TimeThreads.png

OpenLDAP 2.2

MULTI_LDAP_9C_10T_TimeThreads.png

Average Response Time Test

Below you'll find graphs which show the average response time behaviour of clients requesting data from MyLSQ, Oracle and OpenLDAP.

Technical details :

  • MySQL and OpenLDAP run on a Xeon 2.4 GHz, 1GB memory
  • The Oracle RAC (Real Application Cluster) instance consits of two nodes with a replicated database.

The resultset of the relational schema and the OpenLDAP schema differs in terms of returned data size.

As seen from the legends in the graphs OpenLDAP returns for each entry all descriptional information (attribute names), too. This increases the total size of the returned data by a factor of ~4 compared to the resultset of the relational database. The only way to avoid this, is by changing the requested data. But the target is to know how fast those systems return information a user requests (which is the idea of this test !) this fact should not be avoided.

Example (Test Query 1):

/usr/bin/ldapsearch -x -LLL -l 15 -h lxn1185:2170 -b o=grid '(&(objectClass=GlueSE)(GlueSEUniqueID=lcgrid.dnp.fmph.uniba.sk))'

# extended LDIF
#
# LDAPv3
# base  with scope sub
# filter: (&(objectClass=GlueSE)(GlueSEUniqueID=lcgrid.dnp.fmph.uniba.sk))
# requesting: ALL
#

# lcgrid.dnp.fmph.uniba.sk, FMPhI-UNIBA, local, grid
dn: GlueSEUniqueID=lcgrid.dnp.fmph.uniba.sk,Mds-Vo-name=FMPhI-UNIBA,Mds-Vo-nam
 e=local,o=grid
objectClass: GlueSETop
objectClass: GlueSE
objectClass: GlueInformationService
objectClass: GlueKey
objectClass: GlueSchemaVersion
GlueSEUniqueID: lcgrid.dnp.fmph.uniba.sk
GlueSEName: FMPhI-UNIBA:disk
GlueSEPort: 2811
GlueSESizeTotal: 0
GlueSESizeFree: 0
GlueSEArchitecture: disk
GlueInformationServiceURL: ldap://lcgrid.dnp.fmph.uniba.sk:2170/mds-vo-name=re
 source,o=grid
GlueForeignKey: GlueSiteUniqueID=FMPhI-UNIBA
GlueSchemaVersionMajor: 1
GlueSchemaVersionMinor: 2

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

SELECT * FROM GLUESE WHERE UNIQUEID = 'lcgrid.dnp.fmph.uniba.sk'
mysql> SELECT * FROM GLUESE WHERE UNIQUEID = 'lcgrid.dnp.fmph.uniba.sk';
+------+--------------------------+------------------+--------------+-----------+----------+------------------------------------------------------------------+------+---------------+--------------------+-----------------------+--------+-----------------+-------------------+----------------+------------------+------+
| ID   | UniqueID                 | Name             | Architecture | SizeTotal | SizeFree | InformationServiceURL                                            | Port | CurrentIOLoad | ImplementationName | ImplementationVersion | Status | TotalOnlineSize | TotalNearlineSize | UsedOnlineSize | UsedNearlineSize | Type |
+------+--------------------------+------------------+--------------+-----------+----------+------------------------------------------------------------------+------+---------------+--------------------+-----------------------+--------+-----------------+-------------------+----------------+------------------+------+
| 7808 | lcgrid.dnp.fmph.uniba.sk | FMPhI-UNIBA:disk | disk         |         0 |        0 | ldap://lcgrid.dnp.fmph.uniba.sk:2170/mds-vo-name=resource,o=grid | 2811 |          NULL | NULL               | NULL                  | NULL   | NULL            | NULL              | NULL           | NULL             | NULL |
+------+--------------------------+------------------+--------------+-----------+----------+------------------------------------------------------------------+------+---------------+--------------------+-----------------------+--------+-----------------+-------------------+----------------+------------------+------+
1 row in set (0.00 sec)

Comparison_Full_MediumDataSize.png

Comparison_Full_BigDataSize.png

Conclusion

-- FelixNikolausEhm - 20 Sep 2007

Topic attachments
I Attachment History Action Size Date Who Comment
PNGpng ClientComparison.png r6 r5 r4 r3 r2 manage 43.8 K 2007-11-08 - 15:39 FelixNikolausEhm The Clients in Comparison using Oracle Database
PNGpng ClientComparison_MySQL.png r4 r3 r2 r1 manage 40.4 K 2007-11-08 - 14:51 FelixNikolausEhm The Clients in Comparison usinhg MySQL 5 Database
PNGpng ClientComparison_mixed.png r1 manage 35.9 K 2007-11-08 - 15:23 FelixNikolausEhm Overview on the results for all implementations
PNGpng Comparison_Full_BigDataSize.png r2 r1 manage 34.1 K 2007-12-11 - 10:35 FelixNikolausEhm  
PNGpng Comparison_Full_BigDataSizeFreq.png r2 r1 manage 38.8 K 2007-12-11 - 10:35 FelixNikolausEhm  
PNGpng Comparison_Full_MediumDataSize.png r9 r8 r7 r6 r5 manage 38.4 K 2007-12-11 - 11:09 FelixNikolausEhm Comparison graph ~8720 Entries
PNGpng Comparison_Full_MediumDataSizeFreq.png r4 r3 r2 r1 manage 37.7 K 2007-12-11 - 11:09 FelixNikolausEhm Comparison graph Medium Data Size by Frequency
PNGpng DBSchema.png r1 manage 144.8 K 2007-11-07 - 10:09 FelixNikolausEhm The full DB schema as a picture
PNGpng MULTI_LDAP_9C_10T_TimeThreads.png r2 r1 manage 14.0 K 2007-11-22 - 13:24 FelixNikolausEhm Response Time behaviour of the OpenLDAP DB
PNGpng MULTI_MYSQL4_9C_10T_IN_MEMORY_TimeThreads.png r1 manage 11.9 K 2007-11-22 - 16:15 FelixNikolausEhm Response Time behaviour of MySQL 4 DB (tables in MEMORY)
PNGpng MULTI_MYSQL4_9C_10T_TimeThreads.png r2 r1 manage 17.7 K 2007-11-22 - 13:25 FelixNikolausEhm Response Time behaviour of the MySQL 4 DB
PNGpng MULTI_ORACLE_9C_10T_TimeThreads.png r2 r1 manage 14.6 K 2007-11-22 - 13:25 FelixNikolausEhm Response Time behaviour of the Oracle DB
Microsoft Excel Spreadsheetxls RelationalDBTest.xls r3 r2 r1 manage 124.0 K 2007-12-11 - 17:07 FelixNikolausEhm Excel Sheet containing Numbers Graphs
Compressed Zip archivezip Test_Result_Logs.zip r1 manage 222.4 K 2007-12-06 - 10:35 FelixNikolausEhm Archive containing the test log files
PNGpng schema_simplified.png r1 manage 6.1 K 2007-10-01 - 16:07 FelixNikolausEhm ValueTable, TypeTable and GlueCE simplified schema example
Edit | Attach | Watch | Print version | History: r2 < r1 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r2 - 2010-06-11 - PeterJones
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    EGEE All webs login

This site is powered by the TWiki collaboration platform Powered by Perl This site is powered by the TWiki collaboration platformCopyright &© by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Ask a support question or Send feedback