SQL Dialects

SQL dialect configuration requires administrative privileges and is available in web GUI under Administer / SQL Dialects. You can use the SQL dialect configuration to tweak the existing dialect definitions or create a new dialect.

Data sources we connect to support different SQL languages, for example, Oracle provider supports PL/SQL that differs in many areas from TSQL that Lyftron supports. Because of this differences, Lyftron needs to be taught of how to convert queries from one SQL language dialect to another. SQL Dialects provide the functionality to map TSQL language elements between Lyftron's SQL and provider SQL.

While Lyftron has a lot of dialects configured out-of-the-box, there are ADO.net, JDBC or ODBC drivers that are less popular and are not available in Lyftron. Users might also decide not to use a specific provider functionality or just to do things differently.

For example, provider SQL can be missing an aggregate function like SUM(). In such case, Lyftron has to provide the implementation of SUM(). Even if function parity exists, thanks to the mapping configuration, we can decide what SQL functionality we'd like to push-down to the provider.

If the feature-set of a provider SQL is different than TSQL, Lyftron fills the gaps using its in-memory function implementations or using a default federator.

SQL dialect in action

Let's assume that we have connected Lyftron to MySql and use a single table called Shop.Sales. We want to execute in Lyftron the following SQL statement, which gives us a sum of all sales:

    SELECT SUM(Total)
      FROM Shop.Sales

Let's assume that MySql dialect has a defined mapping of the SUM() function. The resulting SQL executed in MySql would look like this:

    SELECT SUM(`Total`)
      FROM `Shop`.`Sales`

Notice that the SQL statement was rewritten to SQL format of MySql database server, along with the SUM() function. It means that the whole statement will execute in one step in MySql database and only results of that query will be transferred from MySql to Lyftron.

Now let's assume that MySql dialect has not defined the mapping of the SUM() function. In such case, the execution of the statement will be different. Due to lack of SUM() function mapping, the execution plan will be a two-step. The first step is to get the required data from MySql and transfer it to Lyftron for processing of SUM() function.

The first SQL statement will look like that:

    SELECT `Total`  /* notice the lack of SUM function */
      FROM `Shop`.`Sales`

The second step will consist of processing the data obtained in the first step, using a default federator - usually the built-in Apache Spark.

The SQL statement will be similar to the following:

    SELECT sum('Total')
      FROM <temporary_object_name_holding_data_from_step_one>

SQL dialect configuration is an essential part of the query execution and optimization engine. It enables a lot of flexibility in many scenarios, such as:

Configuring a new provider without coding, even for less popular exotic data sources that Lyftron vendor does not have access or built-in support.
Configuring a dialect that does not depend on a given technology (ADO.net, JDBC, ODBC) or dialect for each technology.
Control over optimizer behavior of what SQL features should be pushed down to the provider. For example, a given functionality in provider may be buggy or perform poorly. In such case, we can disable it and rely on a better option provided by Lyftron.

Features of SQL dialects

Lyftron supports many data-access technologies, such as ADO.Net, JDBC, ODBC, REST, and others. Each technology has its requirements, may support the same features in different ways, or support features not present anywhere else.

To find the common denominator, we introduced the following features to SQL dialects:

Connection string definition customization,
Schema analyzers with configuration,
Query features configuration,
SQL features configuration,
Type mappings,
Function mappings.

We will describe each of them in turn.

Connection definition customization

Connection definition provides means to set a connection string template, which will be used by a new connection wizard. Identifier quote character is also configurable.

Query features configuration

Query feature name	Description
CROSS JOIN	True if data provider's SQL supports `CROSS JOIN` clause.
EXCEPT	True if data provider's SQL supports `EXCEPT` operator.
FROM with a subquery	True if data provider's SQL `FROM` clause supports subqueries.
FULL JOIN	True if data provider's SQL supports a `FULL JOIN` operator.
GROUP BY	True if data provider's SQL supports `GROUP BY` clause.
GROUP BY CUBE	True if data provider's SQL supports `GROUP BY CUBE` clause.
GROUP BY GROUPING SETS	True if data provider's SQL supports `GROUP BY GROUPING SETS` clause.
GROUP BY ROLLUP	True if data provider's SQL supports `GROUP BY ROLLUP` clause.
HAVING	True if data provider's SQL supports `HAVING` clause.
INNER JOIN	True if data provider's SQL supports `INNER JOIN` operator.
INTERSECT	True if data provider's SQL supports `INTERSECT` operator.
LEFT JOIN	True if data provider's SQL supports `LEFT JOIN` clause.
Multisource FROM	True if data provider's SQL supports a multisource `FROM` clause.
ORDER BY	True if data provider's SQL supports `ORDER BY` clause.
OVER	True if data provider's SQL supports `OVER` clause.
Paging	True if data provider's SQL supports paging.
Parametrized queries	True if data provider's SQL supports parametrized queries.
PARTITION BY inside OVER	True if data provider's SQL supports `PARTITION BY` inside an `OVER` clause.
RIGHT JOIN	True if data provider's SQL supports `RIGHT JOIN` clause.
SELECT...INTO	True if data provider's SQL supports `SELECT...INTO` clause.
TOP	True if data provider's SQL supports `TOP` clause and which variant: `TOP(<n>)`, `LIMIT <n>` or `ROWS <n>`. To allow `TOP` clause to be pushed-down to a source provider `Literal value` and `Column reference` SQL features must be enabled.
UNION	True if data provider's SQL supports `UNION` operator.
UNION ALL	True if data provider's SQL supports `UNION ALL` operator.

SQL features configuration

SQL features configuration section allows you to map SQL from source to SQL features in Lyftron.

SQL feature name	Operator	Description
Arithmetic: And	&	True if data provider's SQL supports arithmetic AND operator.
Arithmetic: Divide	/	True if data provider's SQL supports arithmetic DIVISION operator.
Arithmetic: Subtract	-	True if data provider's SQL supports arithmetic SUBTRACTION and unary NEGATIVE operators. The unary NEGATIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values.
Arithmetic: Modulo	%	True if data provider's SQL supports arithmetic MODULUS operator.
Arithmetic: Multiply	*	True if data provider's SQL supports arithmetic MULTIPLICATION operator.
Arithmetic: Not	~	True if data provider's SQL supports arithmetic NEGATION operator.
Arithmetic: Or	\|	True if data provider's SQL supports arithmetic OR operator.
Arithmetic: Plus	+	True if data provider's SQL supports arithmetic ADDITION and UNARY POSITIVE operator. The unary POSITIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values.
Arithmetic: Xor	^	True if data provider's SQL supports bitwise XOR operator.
Bitwise NOT expr.	~	True if data provider's SQL supports bitwise NOT operator.
CAST		True if data provider's SQL supports a `CAST ( expression AS data_type [ ( length ) ] )` expression.
Column reference		True if data provider's SQL supports column reference in expression.
Comparison: Equal	=	True if data provider's SQL supports comparison EQUAL operator.
Comparison: Greater	>	True if data provider's SQL supports comparison GREATER THAN operator.
Comparison: Greater or equal	>=	True if data provider's SQL supports comparison GREATER THAN OR EQUAL operator.
Comparison: Less	<	True if data provider's SQL supports comparison LESS THAN operator.
Comparison: Less or equal	<=	True if data provider's SQL supports comparison LESS OR EQUAL THAN operator.
Comparison: Not equal	<>	True if data provider's SQL supports comparison NOT EQUAL operator.
Concatenate strings	+	True if data provider's SQL supports a string concatenation using `+` operator.
CONVERT		True if data provider's SQL supports a `CONVERT ( data_type [ ( length ) ] , expression [ , style ] )` expression.
Date and time predefined value expr.		True if data provider's SQL supports a predefined date value like in "DATEPART(`month`, $1)".
Function call		True if data provider's SQL supports functions.
Literal value		True if data provider's SQL supports literals.
Literal value in projection		True if data provider's SQL supports literals in projection.
Negative expr.		True if data provider's SQL supports unary negative operator.
Parameter reference		True if data provider's SQL supports parameter references in expressions.
Predicate: AND		True if data provider's SQL supports `AND` predicate.
Predicate: BETWEEN		True if data provider's SQL supports `BETWEEN` predicate.
Predicate: EXISTS		True if data provider's SQL supports `EXISTS` predicate.
Predicate: IN (expression)		True if data provider's SQL supports a composite predicate `IN <values>`.
Predicate: IN (subquery)		True if data provider's SQL supports a composite predicate `IN (subquery)`.
Predicate: IS NULL		True if data provider's SQL supports `IS NULL` predicate.
Predicate: LIKE		True if data provider's SQL supports `LIKE` predicate.
Predicate: NOT		True if data provider's SQL supports `NOT` predicate.
Predicate: OR		True if data provider's SQL supports `OR` predicate.
Searched CASE		True if data provider's SQL supports a Searched 'CASE' expression, for example `CASE WHEN case_operand = when_operand1 THEN result1 WHEN case_operand = when_operand2 THEN result2 ... END`.
SELECT scalar expr.		True if data provider's SQL supports a select scalar expression, which is a nested SELECT statement that returns one column and one row.
Simple CASE		True if data provider's SQL supports a simple 'CASE' expression.

Microsoft documentation

Type mappings

Type mappings section allows you to map data types from source to data types in Lyftron.

SQL type mapping column name	Scope	Description
Lyftron type	Lyftron	Data type in Lyftron, read only. Type system is compatible with Microsoft SQL Server 2012.
Target storage type	Lyftron	Data storage type in Lyftron.
Cast as	Provider	Data type that will be used to cast values to when preparing a query for provider.
Provider import types (CSV)	Provider	Comma-separated list of provider type names, including aliases.
Variable length	Lyftron	True if a provider data type has variable length.
Variable precision	Lyftron	True if a provider data type has variable precision.
Variable scale	Lyftron	True if a provider data type has variable scale.

For example, Teradata provider reports its supported binary data types are VARBYTE and BLOB. We store such data in Lyftron as VarBinary type. When we format a query to Teradata that uses any of types (VARBYTE,BLOB) we cast it to VARBYTE. The type mappings for the scanario above look like this:

SQL type mapping column name	Value
Lyftron type	VarBinary
Target storage type	VarBinary
Cast as	VARBYTE
Provider import types (CSV)	VARBYTE,BLOB
Variable length	True
Variable precision	False
Variable scale	False

Function mappings

Function mappings section allows you to map provider functions to Lyftron functions. Left table column represents the Lyftron function declaration and is read-only. Right table column holds the function declaration supported by provider and is user-editable.

To map a function declaration find its name in the left column and enter the provider function declaration to the right column. Function parameters are represented by a dollar sign followed by one-based parameter ordinal number, for example, $3 represents the third parameter of Lyftron's function.

SQL Server 2012 built-in functions on Microsoft TechNet.

Example, representative variants of function mappings:

Function name	Mapping to provider function	Comment
CEILING($1)	CEIL($1)	Lyftron function `CEILING(parameter)` has a different name in provider and maps to `CEIL(parameter)`.
CHARINDEX($1,??$2)	CHARINDEX($2, $1)	Lyftron function declaration maps to the same function name, but with reversed parameters ($1,$2)->($2,$1).
COUNT($1)	COUNT($1)	Lyftron function declaration maps to exactly the same function in provider.
COUNT_BIG($1)		`COUNT_BIG($1)` function is not supported by provider and will be simulated by Lyftron.
YEAR($1)	DATEPART(year, $1)	`YEAR($1)` function does not directly exist in provider, but is substituted with `DATEPART(year, $1)`.
QUOTENAME($1)	CONCAT (''', REPLACE($1, ''', CONCAT(''',''')), ''')	`QUOTENAME($1)` function does not exist in provider, but can be simulated with a compound expression.

Date/time function mappings

Date/time function mappings section allows you to map all variants of the following functions:

Example variants of function mappings:

Function name	Mapping to provider function	Comment
DATEADD(day, $1, $2)	DATE_ADD($2, INTERVAL $1 day)	Lyftron function variant `DATEADD(day, $1, $2)` mapped to provider function `DATE_ADD($2, INTERVAL $1 day)`.
DATEADD(nanosecond, $1, $2)	NANOSECOND($1)	Lyftron function variant `DATEADD(nanosecond, $1, $2)` mapped to provider function `NANOSECOND($1)`.

Schema analyzers configuration

Schema analyzers are built-in, configurable adapters that read metadata from a specific provider and perform necessary steps to make the metadata importable to Lyftron.

We supported the following analyzers:

Custom with restrictions,
Custom with Select,
JDBC.

Standard settings for all providers are:

Setting Name	Type of value	Description
Length field name	text	Column name in metadata table returned by the provider that holds length information. Usually a COLUMN_SIZE, COLUMNSIZE, CHARACTER_MAXIMUM_LENGTH.
Name field name	text	Column name in metadata table returned by the provider that holds column name. Usually a COLUMN_NAME.
Nullability check field return type	list of values	Type of value returned by the provider in the column designated in 'Nullable field name' setting. Supported values are Bool, Int, Text:Yes/No, Text:Y/N.
Nullable field name	text	Column name in metadata table returned by the provider that holds column nullability. Usually a NULLABLE, IS_NULLABLE.
Precision field name	text	Column name in metadata table returned by the provider that holds precision. Usually a NUM_PREC_RADIX, NUMERIC_PRECISION.
Scale field name	text	Column name in metadata table returned by the provider that holds scale. Usually a DECIMAL_DIGITS, NUMERIC_SCALE.
Type field name	text	Column name in metadata table returned by the provider that holds type name. Usually a TYPE_NAME, DATA_TYPE.
Field type retrieved by	list of values	Designates a method to extract type information from column designated in 'Type field name'. Supported methods are Odbc code, Type.
Table's metadata loading method	list of values	Supported methods are Simple object name restriction, Fully qualified object name restriction.
Table list loading method	list of values	Supported methods are With restrictions, Without restrictions.
Custom query for table's metadata	text	Query to use to extract table's metadata from the provider.
Divide Unicode column length by 2	true/false	True when column length returned from provider holds the length doubled for Unicode characters.
Use database name in restrictions	true/false	True when database name has to be used in restrictions. False otherwise.
Use schema name in restrictions	true/false	True when schema name has to be used in restrictions. False otherwise.
Use table name in restrictions	true/false	True when table name has to be used in restrictions. False otherwise.