CORAL mappings between SQL and C++ data types

This page is meant to describe what users should know about the internal mapping of C++ data types and SQL data types in CORAL. It covers questions that are often asked by end users in their support requests (such as bug #69605, sr #114149 or sr #109565).

In particular, some issues in CORAL type conversion are pointed out and some recommendations given to users about dos and donts for their application. [For reference, the way these issues are handled at application-level inside COOL is also briefly described].

Introduction: C++ and SQL data types and CORAL type converters

CORAL supports a large number of C++ client platforms (32/64 bit, linux/osx/windows, gcc/vc7,,,) and several relational back-ends (Oracle, MySQL, SQLite...).

  • In general, the same C++ data types exist and have the same meaning on all client platforms (with some important exceptions, e.g. long has a different precision on 32 and 64 bit platforms).
  • The situation is quite different for SQL data types, where even the simplest numerical types are often best described by rather different SQL data types (e.g a 32 bit signed integer can be described by NUMBER(10) on Oracle, by INT(11) on MySQL and by INT on SQlite).

Each CORAL plugin (e.g. OracleAccess, MySQLAccess, SQLiteAccess) defines a different mapping between C++ and SQL data types though its own implementation of the ITypeConverter interface (see doxygen). The ITypeConverter interface, which can be retrieved from ISessionProxy, allows users to examine and modify the default mappings defined in each plugin.

Type converters are used in CORAL:

  • when creating tables (C++ to SQL conversion);
  • when filling tables, if the rowBuffer method described below is called (SQL to C++ conversion, from the SQL types of the stored table);
  • when executing queries, if the defineOutput method desscribed below is not called (SQL to C++ conversion, from the SQL types appropriate to return the result set of the statement being processed).

Mapping C++ to SQL data types (creating a table)

When a relational table is created through the CORAL API, its schema must be specified through the TableDescription class. In particular, the insertColumn method should be used to specify the name and C++ data type of each column that should be created. In the following example, a column named 'ID' with C++ type 'int' is created:

    // Create a table
    coral::TableDescription description;
    description.setName( "MYTABLE" );
    description.insertColumn( "ID", coral::AttributeSpecification::typeNameForId( typeid(int) ) );
    coral::ITable& table = schema.createTable( description );

Internally, each CORAL plugin is responsible for translating the C++ data type specified by the user into the appropriate SQL data type for the corresponding back-end, through the sqlTypeForCppType method of its own ITypeConverter implementation. For instance, the 'int' above is by default translated to NUMBER(10) in OracleAccess and to INT in SQLiteAccess.

Known issues in C++ to SQL mapping in CORAL (creating a table)

While it is very flexible and easily allows users to change the existing mappings, the mechanism described above has a few well known limitations, for instance:

  1. The same C++ code on the same backend may lead to the creation of tables with different SQL types on different client platforms. This may break the portability of the code if different client platforms are used to store and read data in the same database. For instance, if you create a 'long' column on a 32 bit client, the SQL data type chosen by the specific plugin may allocate storage space only for a 32 bit integer; if you then try from a 64 bit client to store a 'long' which exceeds 32 bit precision, this operation will fail. [This is one of the main reasons why a different API was chosen for storing data in COOL, forcing users to specify at table creation time whether they plan to store a 32bit or 64bit integer type. For more details see the analysis in task #2871 and bug #14671. For more recent examples see bug #67512; another bug #87155 may also be related to this issue.]

Mapping SQL to C++ data types (filling a table)

When a relational table is filled through the CORAL API, the values to be inserted in a new row must be passed as an AttributeList argument. Users may either use the CORAL rowBuffer method to determine the name and C++ types of each Attribute holding a value to be inserted into a new column, or they may pass the values for the new row as an AttributeList they have preallocated with their own preferred C++ data types:

    coral::ITable& table = session->nominalSchema().tableHandle("MYTABLE");
    // Insert a new row (using rowBuffer)
    coral::AttributeList buffer1;
    table.dataEditor().rowBuffer( buffer1 );
    buffer1["ID"].data<int>() = 1;
    table.dataEditor().insertRow( buffer1 );
    // Insert a new row (using a custom buffer)
    coral::AttributeList buffer2; 
    buffer2.extend( "ID", typeid(int) ); 
    buffer2["ID"].data<int>() = 1;
    table.dataEditor().insertRow( buffer2 );

Known issues in SQL to C++ mapping in CORAL (filling a table)

  1. Using the ITableDataEditor::rowBuffer method to insert table rows is strongly discouraged! This mechanism has major well-known limitations which may eventually lead to its removal from the CORAL API. For some C++ data types it is simply impossible to insert data into a table using the AttributeList created by rowBuffer. Users should define their own custom buffers instead. A classical example (see bug #71410) is the 'unsigned int' type. A table description with two columns of 'int' and 'unsigned int' type will result in the creation of an Oracle table with two columns of the same SQL type NUMBER(10). If rowBuffer is used, CORAL will try to use the signed 'int' C++ type to insert values into both columns. [This issue is handled differently in COOL, where a limited set of storage types is uniquely associated to both SQL and C++ data types and the storage type used for each payload column is stored persistently in a table of the COOL database, so that each database column can be uniquely mapped back to a C++ type.]

  2. The CORAL AttributeList class does not internally cross check if a value inserted into a column table exceeds the storage capacity requested at creation time. This is relevant for instance for large objects, for which different backends define SQL types with different maximum capacities, because it may break data copies across different backends. For instance, if you need to store strings of up to 10k characters, OracleAccess will allocate a CLOB type (as such strings do not fit in a VARCHAR2(4000)) and MySQLAccess a TEXT type. If you try to store a much longer string with 100k characters, CORAL will allow you to store it into Oracle (because a CLOB is enough). If you try to read and copy the data to MySQL, however, this operation will fail because a MEDIUMTEXT would be required. [This issue is also handled differently by COOL, where the IField::setValue method internally ensures that a C++ value is within the allowed range for the persistent storage type associated to it.]

Mapping SQL to C++ data types (querying a table)

When a relational table is queried through the CORAL API, users may either let CORAL determine the name and C++ types of each Attribute representing a table column in the query result set, or they may force CORAL to retrieve the result set values into an AttributeList they have preallocated with their own preferred C++ data types. In the latter case, the difference is that users call the defineOutput method of the IQuery before executing it:

    coral::ITable& table = session->nominalSchema().tableHandle("MYTABLE");
    // Query a table (without defineOutput)
    coral::IQuery* query1 = table.newQuery();
    coral::ICursor& cursor1 = query1->execute();
    // Query a table (using a custom buffer with defineOutput)
    coral::IQuery* query2 = table.newQuery();
    coral::AttributeList buffer; 
    buffer.extend( "ID", typeid(int) ); 
    query2->defineOutput( buffer ); 
    coral::ICursor& cursor2 = query2->execute();

Internally, if defineOutput is not called, each CORAL plugin is responsible for translating the SQL data type of the table in the database into the appropriate C++ data type for the corresponding back-end, through the cppTypeForSqlType method of its own ITypeConverter implementation. For instance, the NUMBER(10) column created in Oracle through the example above will by default be translated to an 'int' in OracleAccess.

Known issues in SQL to C++ mapping in CORAL (querying a table)

  1. Executing CORAL queries without calling IQuery::defineOutput is strongly discouraged! This mechanism has major well-known limitations which may eventually lead to its removal from the CORAL API. For some C++ data types it is simply impossible to execute a query without calling defineOutput. A classical example (see bug #71410 especially, or also bug #74867) is the 'unsigned int' type. A table description with two columns of 'int' and 'unsigned int' type will result in the creation of an Oracle table with two columns of the same SQL type NUMBER(10). If defineOutput is not called, CORAL will try to retrieve the values of both columns using the signed 'int' C++ type. [Inside COOL, all queries proceed through calls to defineOutput. This is possible because the storage type used for each payload column is stored persistently in a table of the COOL database, as described above.]

-- AndreaValassi - 11-Aug-2010

Edit | Attach | Watch | Print version | History: r11 < r10 < r9 < r8 < r7 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r11 - 2011-10-05 - AndreaValassi
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    Persistency All webs login

This site is powered by the TWiki collaboration platform Powered by PerlCopyright &© 2008-2023 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
or Ideas, requests, problems regarding TWiki? use Discourse or Send feedback