CORAL mappings between SQL and C++ data types
This page is meant to describe what users should know about the internal mapping of
C++ data types and
SQL data types in CORAL. It covers questions that are often asked by end users in their support requests (such as
bug #69605
,
sr #114149
or
sr #109565
).
In particular, some issues in CORAL type conversion are pointed out and some recommendations given to users about dos and donts for their application. [For reference, the way these issues are handled at application-level inside COOL is also briefly described].
Introduction: C++ and SQL data types and CORAL type converters
CORAL supports a large number of C++ client platforms (32/64 bit, linux/osx/windows, gcc/vc7,,,) and several relational back-ends (Oracle, MySQL, SQLite...).
- In general, the same C++ data types exist and have the same meaning on all client platforms (with some important exceptions, e.g.
long
has a different precision on 32 and 64 bit platforms).
- The situation is quite different for SQL data types, where even the simplest numerical types are often best described by rather different SQL data types (e.g a 32 bit signed integer can be described by
NUMBER(10)
on Oracle, by INT(11)
on MySQL and by INT
on SQlite).
Each CORAL plugin (e.g. OracleAccess, MySQLAccess, SQLiteAccess) defines a different mapping between C++ and SQL data types though its own implementation of the
ITypeConverter
interface (see
doxygen
). The
ITypeConverter
interface, which can be retrieved from
ISessionProxy
, allows users to examine and modify the default mappings defined in each plugin.
Type converters are used in CORAL:
- when creating tables (C++ to SQL conversion);
- when filling tables, if the rowBuffer method described below is called (SQL to C++ conversion, from the SQL types of the stored table);
- when executing queries, if the defineOutput method desscribed below is not called (SQL to C++ conversion, from the SQL types appropriate to return the result set of the statement being processed).
Mapping C++ to SQL data types (creating a table)
When a relational table is created through the CORAL API, its schema must be specified through the
TableDescription
class. In particular, the
insertColumn
method should be used to specify the name and C++ data type of each column that should be created. In the following example, a column named 'ID' with C++ type 'int' is created:
// Create a table
coral::TableDescription description;
description.setName( "MYTABLE" );
description.insertColumn( "ID", coral::AttributeSpecification::typeNameForId( typeid(int) ) );
coral::ITable& table = schema.createTable( description );
Internally, each CORAL plugin is responsible for translating the C++ data type specified by the user into the appropriate SQL data type for the corresponding back-end, through the
sqlTypeForCppType
method of its own
ITypeConverter
implementation. For instance, the 'int' above is by default translated to
NUMBER(10)
in OracleAccess and to
INT
in SQLiteAccess.
Known issues in C++ to SQL mapping in CORAL (creating a table)
While it is very flexible and easily allows users to change the existing mappings, the mechanism described above has a few well known limitations, for instance:
- The same C++ code on the same backend may lead to the creation of tables with different SQL types on different client platforms. This may break the portability of the code if different client platforms are used to store and read data in the same database. For instance, if you create a 'long' column on a 32 bit client, the SQL data type chosen by the specific plugin may allocate storage space only for a 32 bit integer; if you then try from a 64 bit client to store a 'long' which exceeds 32 bit precision, this operation will fail. [This is one of the main reasons why a different API was chosen for storing data in COOL, forcing users to specify at table creation time whether they plan to store a 32bit or 64bit integer type. For more details see the analysis in task #2871
and bug #14671
. For more recent examples see bug #67512
; another bug #87155
may also be related to this issue.]
Mapping SQL to C++ data types (filling a table)
When a relational table is filled through the CORAL API, the values to be inserted in a new row must be passed as an
AttributeList
argument. Users may either use the CORAL
rowBuffer
method to determine the name and C++ types of each
Attribute
holding a value to be inserted into a new column, or they may pass the values for the new row as an
AttributeList
they have preallocated with their own preferred C++ data types:
coral::ITable& table = session->nominalSchema().tableHandle("MYTABLE");
// Insert a new row (using rowBuffer)
coral::AttributeList buffer1;
table.dataEditor().rowBuffer( buffer1 );
buffer1["ID"].data<int>() = 1;
table.dataEditor().insertRow( buffer1 );
// Insert a new row (using a custom buffer)
coral::AttributeList buffer2;
buffer2.extend( "ID", typeid(int) );
buffer2["ID"].data<int>() = 1;
table.dataEditor().insertRow( buffer2 );
Known issues in SQL to C++ mapping in CORAL (filling a table)
- Using the ITableDataEditor::rowBuffer method to insert table rows is strongly discouraged! This mechanism has major well-known limitations which may eventually lead to its removal from the CORAL API. For some C++ data types it is simply impossible to insert data into a table using the AttributeList created by
rowBuffer
. Users should define their own custom buffers instead. A classical example (see bug #71410
) is the 'unsigned int' type. A table description with two columns of 'int' and 'unsigned int' type will result in the creation of an Oracle table with two columns of the same SQL type NUMBER(10)
. If rowBuffer
is used, CORAL will try to use the signed 'int' C++ type to insert values into both columns. [This issue is handled differently in COOL, where a limited set of storage types is uniquely associated to both SQL and C++ data types and the storage type used for each payload column is stored persistently in a table of the COOL database, so that each database column can be uniquely mapped back to a C++ type.]
- The CORAL AttributeList class does not internally cross check if a value inserted into a column table exceeds the storage capacity requested at creation time. This is relevant for instance for large objects, for which different backends define SQL types with different maximum capacities, because it may break data copies across different backends. For instance, if you need to store strings of up to 10k characters, OracleAccess will allocate a
CLOB
type (as such strings do not fit in a VARCHAR2(4000)
) and MySQLAccess a TEXT
type. If you try to store a much longer string with 100k characters, CORAL will allow you to store it into Oracle (because a CLOB is enough). If you try to read and copy the data to MySQL, however, this operation will fail because a MEDIUMTEXT
would be required. [This issue is also handled differently by COOL, where the IField::setValue
method internally ensures that a C++ value is within the allowed range for the persistent storage type associated to it.]
Mapping SQL to C++ data types (querying a table)
When a relational table is queried through the CORAL API, users may either let CORAL determine the name and C++ types of each
Attribute
representing a table column in the query result set, or they may force CORAL to retrieve the result set values into an
AttributeList
they have preallocated with their own preferred C++ data types. In the latter case, the difference is
that users call the
defineOutput
method of the
IQuery
before executing it:
coral::ITable& table = session->nominalSchema().tableHandle("MYTABLE");
// Query a table (without defineOutput)
coral::IQuery* query1 = table.newQuery();
coral::ICursor& cursor1 = query1->execute();
// Query a table (using a custom buffer with defineOutput)
coral::IQuery* query2 = table.newQuery();
coral::AttributeList buffer;
buffer.extend( "ID", typeid(int) );
query2->defineOutput( buffer );
coral::ICursor& cursor2 = query2->execute();
Internally, if
defineOutput
is not called, each CORAL plugin is responsible for translating the SQL data type of the table in the database into the appropriate C++ data type for the corresponding back-end, through the
cppTypeForSqlType
method of its own
ITypeConverter
implementation. For instance, the
NUMBER(10)
column created in Oracle through the example above will by default be translated to an 'int' in OracleAccess.
Known issues in SQL to C++ mapping in CORAL (querying a table)
- Executing CORAL queries without calling IQuery::defineOutput is strongly discouraged! This mechanism has major well-known limitations which may eventually lead to its removal from the CORAL API. For some C++ data types it is simply impossible to execute a query without calling
defineOutput
. A classical example (see bug #71410
especially, or also bug #74867
) is the 'unsigned int' type. A table description with two columns of 'int' and 'unsigned int' type will result in the creation of an Oracle table with two columns of the same SQL type NUMBER(10)
. If defineOutput
is not called, CORAL will try to retrieve the values of both columns using the signed 'int' C++ type. [Inside COOL, all queries proceed through calls to defineOutput
. This is possible because the storage type used for each payload column is stored persistently in a table of the COOL database, as described above.]
--
AndreaValassi - 11-Aug-2010