SQL Dialects

SQL dialect configuration requires administrative privileges and is available in web GUI under Administer / SQL Dialects. You can use the SQL dialect configuration to tweak the existing dialect definitions or create a new dialect.

Data sources we connect to support different SQL languages, for example, Oracle provider supports PL/SQL that differs in many areas from TSQL that Lyftron supports. Because of this differences, Lyftron needs to be taught of how to convert queries from one SQL language dialect to another. SQL Dialects provide the functionality to map TSQL language elements between Lyftron's SQL and provider SQL.

While Lyftron has a lot of dialects configured out-of-the-box, there are ADO.net, JDBC or ODBC drivers that are less popular and are not available in Lyftron. Users might also decide not to use a specific provider functionality or just to do things differently.

For example, provider SQL can be missing an aggregate function like SUM(). In such case, Lyftron has to provide the implementation of SUM(). Even if function parity exists, thanks to the mapping configuration, we can decide what SQL functionality we'd like to push-down to the provider.

If the feature-set of a provider SQL is different than TSQL, Lyftron fills the gaps using its in-memory function implementations or using a default federator.

SQL dialect in action

Let's assume that we have connected Lyftron to MySql and use a single table called Shop.Sales. We want to execute in Lyftron the following SQL statement, which gives us a sum of all sales:

    SELECT SUM(Total)
      FROM Shop.Sales

Let's assume that MySql dialect has a defined mapping of the SUM() function. The resulting SQL executed in MySql would look like this:

    SELECT SUM(`Total`)
      FROM `Shop`.`Sales`

Notice that the SQL statement was rewritten to SQL format of MySql database server, along with the SUM() function. It means that the whole statement will execute in one step in MySql database and only results of that query will be transferred from MySql to Lyftron.

Now let's assume that MySql dialect has not defined the mapping of the SUM() function. In such case, the execution of the statement will be different. Due to lack of SUM() function mapping, the execution plan will be a two-step. The first step is to get the required data from MySql and transfer it to Lyftron for processing of SUM() function.

The first SQL statement will look like that:

    SELECT `Total`  /* notice the lack of SUM function */
      FROM `Shop`.`Sales`

The second step will consist of processing the data obtained in the first step, using a default federator - usually the built-in Apache Spark.

The SQL statement will be similar to the following:

    SELECT sum('Total')
      FROM <temporary_object_name_holding_data_from_step_one>

SQL dialect configuration is an essential part of the query execution and optimization engine. It enables a lot of flexibility in many scenarios, such as:

  • Configuring a new provider without coding, even for less popular exotic data sources that Lyftron vendor does not have access or built-in support.
  • Configuring a dialect that does not depend on a given technology (ADO.net, JDBC, ODBC) or dialect for each technology.
  • Control over optimizer behavior of what SQL features should be pushed down to the provider. For example, a given functionality in provider may be buggy or perform poorly. In such case, we can disable it and rely on a better option provided by Lyftron.

Features of SQL dialects

Lyftron supports many data-access technologies, such as ADO.Net, JDBC, ODBC, REST, and others. Each technology has its requirements, may support the same features in different ways, or support features not present anywhere else.

To find the common denominator, we introduced the following features to SQL dialects:

  • Connection string definition customization,
  • Schema analyzers with configuration,
  • Query features configuration,
  • SQL features configuration,
  • Type mappings,
  • Function mappings.

We will describe each of them in turn.

Connection definition customization

Connection definition provides means to set a connection string template, which will be used by a new connection wizard. Identifier quote character is also configurable.

Query features configuration

Query feature name Description
CROSS JOIN True if data provider's SQL supports CROSS JOIN clause.
EXCEPT True if data provider's SQL supports EXCEPT operator.
FROM with a subquery True if data provider's SQL FROM clause supports subqueries.
FULL JOIN True if data provider's SQL supports a FULL JOIN operator.
GROUP BY True if data provider's SQL supports GROUP BY clause.
GROUP BY CUBE True if data provider's SQL supports GROUP BY CUBE clause.
GROUP BY GROUPING SETS True if data provider's SQL supports GROUP BY GROUPING SETS clause.
GROUP BY ROLLUP True if data provider's SQL supports GROUP BY ROLLUP clause.
HAVING True if data provider's SQL supports HAVING clause.
INNER JOIN True if data provider's SQL supports INNER JOIN operator.
INTERSECT True if data provider's SQL supports INTERSECT operator.
LEFT JOIN True if data provider's SQL supports LEFT JOIN clause.
Multisource FROM True if data provider's SQL supports a multisource FROM clause.
ORDER BY True if data provider's SQL supports ORDER BY clause.
OVER True if data provider's SQL supports OVER clause.
Paging True if data provider's SQL supports paging.
Parametrized queries True if data provider's SQL supports parametrized queries.
PARTITION BY inside OVER True if data provider's SQL supports PARTITION BY inside an OVER clause.
RIGHT JOIN True if data provider's SQL supports RIGHT JOIN clause.
SELECT...INTO True if data provider's SQL supports SELECT...INTO clause.
TOP True if data provider's SQL supports TOP clause and which variant: TOP(<n>), LIMIT <n> or ROWS <n>. To allow TOP clause to be pushed-down to a source provider Literal value and Column reference SQL features must be enabled.
UNION True if data provider's SQL supports UNION operator.
UNION ALL True if data provider's SQL supports UNION ALL operator.

SQL features configuration

SQL features configuration section allows you to map SQL from source to SQL features in Lyftron.

SQL feature name Operator Description
Arithmetic: And & True if data provider's SQL supports arithmetic AND operator.
Arithmetic: Divide / True if data provider's SQL supports arithmetic DIVISION operator.
Arithmetic: Subtract - True if data provider's SQL supports arithmetic SUBTRACTION and unary NEGATIVE operators. The unary NEGATIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values.
Arithmetic: Modulo % True if data provider's SQL supports arithmetic MODULUS operator.
Arithmetic: Multiply * True if data provider's SQL supports arithmetic MULTIPLICATION operator.
Arithmetic: Not ~ True if data provider's SQL supports arithmetic NEGATION operator.
Arithmetic: Or | True if data provider's SQL supports arithmetic OR operator.
Arithmetic: Plus + True if data provider's SQL supports arithmetic ADDITION and UNARY POSITIVE operator. The unary POSITIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values.
Arithmetic: Xor ^ True if data provider's SQL supports bitwise XOR operator.
Bitwise NOT expr. ~ True if data provider's SQL supports bitwise NOT operator.
CAST True if data provider's SQL supports a CAST ( expression AS data_type [ ( length ) ] ) expression.
Column reference True if data provider's SQL supports column reference in expression.
Comparison: Equal = True if data provider's SQL supports comparison EQUAL operator.
Comparison: Greater > True if data provider's SQL supports comparison GREATER THAN operator.
Comparison: Greater or equal >= True if data provider's SQL supports comparison GREATER THAN OR EQUAL operator.
Comparison: Less < True if data provider's SQL supports comparison LESS THAN operator.
Comparison: Less or equal <= True if data provider's SQL supports comparison LESS OR EQUAL THAN operator.
Comparison: Not equal <> True if data provider's SQL supports comparison NOT EQUAL operator.
Concatenate strings + True if data provider's SQL supports a string concatenation using + operator.
CONVERT True if data provider's SQL supports a CONVERT ( data_type [ ( length ) ] , expression [ , style ] ) expression.
Date and time predefined value expr. True if data provider's SQL supports a predefined date value like in "DATEPART(month, $1)".
Function call True if data provider's SQL supports functions.
Literal value True if data provider's SQL supports literals.
Literal value in projection True if data provider's SQL supports literals in projection.
Negative expr. True if data provider's SQL supports unary negative operator.
Parameter reference True if data provider's SQL supports parameter references in expressions.
Predicate: AND True if data provider's SQL supports AND predicate.
Predicate: BETWEEN True if data provider's SQL supports BETWEEN predicate.
Predicate: EXISTS True if data provider's SQL supports EXISTS predicate.
Predicate: IN (expression) True if data provider's SQL supports a composite predicate IN <values>.
Predicate: IN (subquery) True if data provider's SQL supports a composite predicate IN (subquery).
Predicate: IS NULL True if data provider's SQL supports IS NULL predicate.
Predicate: LIKE True if data provider's SQL supports LIKE predicate.
Predicate: NOT True if data provider's SQL supports NOT predicate.
Predicate: OR True if data provider's SQL supports OR predicate.
Searched CASE True if data provider's SQL supports a Searched 'CASE' expression, for example CASE WHEN case_operand = when_operand1 THEN result1 WHEN case_operand = when_operand2 THEN result2 ... END.
SELECT scalar expr. True if data provider's SQL supports a select scalar expression, which is a nested SELECT statement that returns one column and one row.
Simple CASE True if data provider's SQL supports a simple 'CASE' expression.

Microsoft documentation

Type mappings

Type mappings section allows you to map data types from source to data types in Lyftron.

SQL type mapping column name Scope Description
Lyftron type Lyftron Data type in Lyftron, read only. Type system is compatible with Microsoft SQL Server 2012.
Target storage type Lyftron Data storage type in Lyftron.
Cast as Provider Data type that will be used to cast values to when preparing a query for provider.
Provider import types (CSV) Provider Comma-separated list of provider type names, including aliases.
Variable length Lyftron True if a provider data type has variable length.
Variable precision Lyftron True if a provider data type has variable precision.
Variable scale Lyftron True if a provider data type has variable scale.

For example, Teradata provider reports its supported binary data types are VARBYTE and BLOB. We store such data in Lyftron as VarBinary type. When we format a query to Teradata that uses any of types (VARBYTE,BLOB) we cast it to VARBYTE. The type mappings for the scanario above look like this:

SQL type mapping column name Value
Lyftron type VarBinary
Target storage type VarBinary
Cast as VARBYTE
Provider import types (CSV) VARBYTE,BLOB
Variable length True
Variable precision False
Variable scale False

Function mappings

Function mappings section allows you to map provider functions to Lyftron functions. Left table column represents the Lyftron function declaration and is read-only. Right table column holds the function declaration supported by provider and is user-editable.

To map a function declaration find its name in the left column and enter the provider function declaration to the right column. Function parameters are represented by a dollar sign followed by one-based parameter ordinal number, for example, $3 represents the third parameter of Lyftron's function.

SQL Server 2012 built-in functions on Microsoft TechNet.

Example, representative variants of function mappings:

Function name Mapping to provider function Comment
CEILING($1) CEIL($1) Lyftron function CEILING(parameter) has a different name in provider and maps to CEIL(parameter).
CHARINDEX($1,??$2) CHARINDEX($2, $1) Lyftron function declaration maps to the same function name, but with reversed parameters ($1,$2)->($2,$1).
COUNT($1) COUNT($1) Lyftron function declaration maps to exactly the same function in provider.
COUNT_BIG($1) COUNT_BIG($1) function is not supported by provider and will be simulated by Lyftron.
YEAR($1) DATEPART(year, $1) YEAR($1) function does not directly exist in provider, but is substituted with DATEPART(year, $1).
QUOTENAME($1) CONCAT (''', REPLACE($1, ''', CONCAT(''',''')), ''') QUOTENAME($1) function does not exist in provider, but can be simulated with a compound expression.

Date/time function mappings

Date/time function mappings section allows you to map all variants of the following functions:

Example variants of function mappings:

Function name Mapping to provider function Comment
DATEADD(day, $1, $2) DATE_ADD($2, INTERVAL $1 day) Lyftron function variant DATEADD(day, $1, $2) mapped to provider function DATE_ADD($2, INTERVAL $1 day).
DATEADD(nanosecond, $1, $2) NANOSECOND($1) Lyftron function variant DATEADD(nanosecond, $1, $2) mapped to provider function NANOSECOND($1).

Schema analyzers configuration

Schema analyzers are built-in, configurable adapters that read metadata from a specific provider and perform necessary steps to make the metadata importable to Lyftron.

We supported the following analyzers:

  • Custom with restrictions,
  • Custom with Select,
  • JDBC.

Standard settings for all providers are:

Setting Name Type of value Description
Length field name text Column name in metadata table returned by the provider that holds length information. Usually a COLUMN_SIZE, COLUMNSIZE, CHARACTER_MAXIMUM_LENGTH.
Name field name text Column name in metadata table returned by the provider that holds column name. Usually a COLUMN_NAME.
Nullability check field return type list of values Type of value returned by the provider in the column designated in 'Nullable field name' setting. Supported values are Bool, Int, Text:Yes/No, Text:Y/N.
Nullable field name text Column name in metadata table returned by the provider that holds column nullability. Usually a NULLABLE, IS_NULLABLE.
Precision field name text Column name in metadata table returned by the provider that holds precision. Usually a NUM_PREC_RADIX, NUMERIC_PRECISION.
Scale field name text Column name in metadata table returned by the provider that holds scale. Usually a DECIMAL_DIGITS, NUMERIC_SCALE.
Type field name text Column name in metadata table returned by the provider that holds type name. Usually a TYPE_NAME, DATA_TYPE.
Field type retrieved by list of values Designates a method to extract type information from column designated in 'Type field name'. Supported methods are Odbc code, Type.
Table's metadata loading method list of values Supported methods are Simple object name restriction, Fully qualified object name restriction.
Table list loading method list of values Supported methods are With restrictions, Without restrictions.
Custom query for table's metadata text Query to use to extract table's metadata from the provider.
Divide Unicode column length by 2 true/false True when column length returned from provider holds the length doubled for Unicode characters.
Use database name in restrictions true/false True when database name has to be used in restrictions. False otherwise.
Use schema name in restrictions true/false True when schema name has to be used in restrictions. False otherwise.
Use table name in restrictions true/false True when table name has to be used in restrictions. False otherwise.