Why does the industry no longer need traditional ETL/ELT?
Business-critical decisions, future expansion plans, investment and divestment decisions, and much else depend on complex reports and massive amounts of data. The required data existed in different silos and was not easily accessible for visualization or analysis without a great deal of preparation, and generating real-time reports was only a dream.
The complexities of data management quickly led to the evolution of ETL systems for generating meaningful insights out of Big Data. An ETL system is a broad collection of tools that collect data from different sources, transform it to the schema of the target data warehouse, and load it into that warehouse.
What enterprises need from their data is straightforward
Connect with any data
Don’t have the right connectors, need to write complex custom logic
Shorten time to insights
My reports crawl, I can’t see my data, I need to wait for weeks to run my reports.
Query with SQL
I have no time to learn new technology, can’t find the resources we need, too expensive to hire an expert
Eliminate data silos
I don’t know where to start, I can’t share my data, I need real-time data, I’m tired of data inconsistency
Challenge with the traditional way: the ETL/ELT process is the bottleneck
The challenges of traditional ETL/ELT are defined in more detail below
The evolution of modern data hub & logical data warehouse
90% faster to implement
Higher performance
No ETL/ELT programming is required
Modern cloud compute
Easier modification
Zero latency
Modern Data Governance & Data Catalog for enterprises
Data collection is a systematic approach to collecting and measuring information from a variety of sources to get a complete and accurate picture of an area of interest. Data collection allows an organization to answer relevant questions, evaluate results and make predictions about future trends and probabilities.
Correct and systematic data collection is essential to maintaining the integrity of the information used for data-based decision-making. There is much talk about being “data driven” and similar concepts, but the right approach is to talk about “a culture of data.” We may collect thousands of data points from mobile applications, website visits, loyalty programs, and online surveys to get to know our clients better, but for all of them we need a system that lets us manage this data securely, applying accurate data governance and complying with laws and regulations.
When we talk about Big Data, we mean the voluminous amounts of structured, semi-structured and unstructured data that organizations collect. Because loading large data sets into a traditional relational database for analysis takes a lot of time and money, new approaches to collecting and analyzing this data are emerging. Raw data is collected together with extended metadata and aggregated into a data lake, from which information is later extracted. From there, machine learning and artificial intelligence programs use sophisticated algorithms to search for repeatable patterns.
The problem arises when these data lakes are managed without any kind of control or hierarchy, so that they sometimes grow steadily without really providing value.
Data Collection vs. Entropy
Logical Data Warehouse intermediate layer
The LDW technology that resides in Lyftron allows a new approach and new flexibility in managing a data lake. It is even possible to dispense with the data lake altogether, because an LDW can contact the source directly, in real time and without intermediaries. Why build a data lake if we can develop and model directly against the data sources?
Lyftron allows a two-way connection: data flows directly from different sources to different targets. Lyftron exposes each data source through SQL, generating views of all data sources so that they can be combined, joined, mixed and compared. And that’s not all: these views can be materialized in a data warehouse on-premises or in the cloud, or even in Apache Spark (which may be the most powerful and economical data warehouse in the world, even though it was not built for this use).
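The idea of exposing separate sources as joinable SQL views can be sketched with Python's built-in sqlite3 module. This is only an illustration of the concept, not Lyftron's implementation: the two attached in-memory databases stand in for independent source systems, and the names crm, sales, customers, orders and customer_revenue are hypothetical.

```python
import sqlite3


def build_federated_view() -> sqlite3.Connection:
    """Attach two stand-in sources and expose them through one SQL view."""
    conn = sqlite3.connect(":memory:")
    # Each attached database plays the role of a separate source system.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS sales")

    conn.execute("CREATE TABLE crm.customers (customer_id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE sales.orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO sales.orders VALUES (?, ?, ?)",
        [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0)],
    )

    # A TEMP view is used because, in SQLite, only temporary views may
    # reference tables that live in different attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW customer_revenue AS
        SELECT c.name AS name, SUM(o.amount) AS revenue
        FROM crm.customers AS c
        JOIN sales.orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.name
        """
    )
    return conn


conn = build_federated_view()
rows = conn.execute(
    "SELECT name, revenue FROM customer_revenue ORDER BY name"
).fetchall()
print(rows)
```

The consumer queries customer_revenue with ordinary SQL and does not need to know that the rows come from two different sources.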
In a traditional setup, the integration of new data sources is a lengthy and costly process requiring data modeling, custom ETL development work and complete regression testing.
Traditional data models are often biased toward rigid questions and are unable to accommodate dynamic and ad-hoc data analysis processes. Unstructured and semi-structured data cannot be easily integrated. For this reason, the new technology that drives Lyftron was born: the Logical Data Warehouse.
Tags: good stuff for data collection
Lyftron lets you classify data and build a data catalog with tags, which enables simple data discovery and avoids repeated collection of the same data. Here are some other features that make Lyftron the perfect tool for data classification:
With Lyftron, access to the collected data is instantaneous and real-time. LDW technology eliminates the need to copy and move data. Copying remains possible as an option if we need to correct the data or link it to other sources. In this way, any customer data can be linked to a customer profile in a CRM.
Lyftron unifies data in a single format that can be used by any tool, because it exposes all data through SQL queries. You will have a single data format at all times, without an ETL process.
For governance, Lyftron provides a single, unified security model with access rights, dynamic row-level security, and data masking, and, most importantly, GDPR compliance.
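To make row-level security and data masking concrete, here is a minimal Python sketch of the two ideas. It does not reflect how Lyftron configures or enforces them; the User type, the region rule, and the mask_email and secure_rows helpers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    region: str        # the only region whose rows this user may see
    can_see_pii: bool  # whether personal data is shown unmasked


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def secure_rows(rows: list, user: User) -> list:
    """Apply a row-level filter, then mask sensitive columns."""
    visible = []
    for row in rows:
        # Row-level security: drop rows outside the user's region.
        if row["region"] != user.region:
            continue
        out = dict(row)
        # Data masking: hide the e-mail unless the user may see PII.
        if not user.can_see_pii:
            out["email"] = mask_email(out["email"])
        visible.append(out)
    return visible


customers = [
    {"id": 1, "region": "EU", "email": "ann@example.com"},
    {"id": 2, "region": "US", "email": "bob@example.com"},
]
analyst = User(name="analyst", region="EU", can_see_pii=False)
print(secure_rows(customers, analyst))
```

The same query therefore returns different rows and different column contents depending on who runs it, which is the point of a single security model applied in one place.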
Data collection & self service BI
Collecting data with Lyftron
Data lake or logical data warehouse?
The characteristics of the Lyftron LDW are:
Provides a logical data warehouse with a high-performance columnar data engine
Emulates the most popular and successful enterprise-grade database engine at the protocol level (SQL), enabling broad and fast adoption without additional changes to existing tools and infrastructure
Provides a pure logical data warehouse tool that is fully integrated with Microsoft technology (it can be a perfect tool for migrating different sources or other databases to Azure)
Benefits of Lyftron modern data hub
Advent of cloud-based technologies
Cloud-based technologies have completely transformed the modern enterprise landscape. By adopting cloud technologies, enterprises have shifted their focus from long analytical cycles to self-service applications. Enterprise-level SaaS has gained wide momentum because of the flexibility and agility it offers over traditional methodologies.
Enterprises can now harness the power of large volumes of structured, semi-structured and unstructured data to generate meaningful business insights and focus on adding more value for their customers.
Real time data for decision support
Cost-effective solutions
Storage and maintenance costs turned out to be a major concern with legacy systems. With the advent of cloud data warehouse services like Amazon Redshift and Snowflake, these rising costs are no longer a concern. Modern cloud data warehouse services are highly scalable and can automatically scale up or down based on surges in data volume. They also offer pay-as-you-go pricing and charge based on usage.
Access to all data points
One of the biggest advantages of modern data pipelines is access to all your data points and a holistic view of your business. Modern data pipelines help you bring all your data together, without technical limitations, making analysis simple for all stakeholders.
Comparison of some leading ETL/ELT tools
Modern enterprises require solutions that are highly flexible and agile, but legacy ETL tools have failed to address these needs. Here is a quick comparison of some of the leading ETL tools:
Limited support for modern data platforms
Does not support modern data platforms
Does not support modern data platforms
No change data capture (CDC) facilities; must rely on the database for CDC. Limited scalability. Severe performance impact when transforming large data sets, and additional resources are required to transform data that is rich in variety. No automatic scheduling facility; scheduling must happen manually or through scripts.
BI users can prototype data sets for analytics on real-time data and replicate the data to a modern data warehouse once the dashboards are finalized.
Iterative migration of a legacy data warehouse to a modern cloud data warehouse may be started in 5 days
Whole database migration (lyft-and-shift approach) is possible in 10 days, independent of the database size (number of tables)
Large table scan queries on a modern cloud data warehouse may be 1000x faster
Lyftron can be used as an SQL proxy that translates SQL queries on the fly. This feature is used when a cube is defined in Lyftron. Aggregated queries with GROUP BY and joins are rewritten on the fly to use smaller, preaggregated materialized views. Lyftron maintains materialized views for frequently queried combinations of joins and grouping conditions. For example, the main dashboard may always show a graph of revenue per business unit, and without preaggregation that query would have to perform a full scan of a huge fact table whenever a user opens the dashboard. With Lyftron, it is possible to define a preaggregated view with a GROUP BY bu_id that has only 20 rows. Such queries execute almost instantly because they are redirected to the smaller materialized view.
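The revenue-per-business-unit example can be sketched with Python's built-in sqlite3 module. This is a simplified illustration of routing an aggregate query to a preaggregated table, not Lyftron's actual rewriting engine: SQLite has no materialized views, so an ordinary table stands in for one, and the names sales, revenue_by_bu, REWRITES and rewrite are hypothetical. Unlike a maintained materialized view, the stand-in table is not refreshed when the fact table changes.

```python
import sqlite3

# The dashboard query against the large fact table...
FACT_QUERY = (
    "SELECT bu_id, SUM(amount) AS revenue FROM sales GROUP BY bu_id ORDER BY bu_id"
)
# ...and the equivalent query against the small preaggregated table.
SUMMARY_QUERY = "SELECT bu_id, revenue FROM revenue_by_bu ORDER BY bu_id"

# A real proxy would match queries structurally; this sketch uses exact text.
REWRITES = {FACT_QUERY: SUMMARY_QUERY}


def rewrite(sql: str) -> str:
    """Redirect a known aggregate query to its preaggregated equivalent."""
    normalized = " ".join(sql.split())
    return REWRITES.get(normalized, sql)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (bu_id INTEGER, amount REAL)")
# 10,000 fact rows spread evenly over 20 business units.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(i % 20, 1.0) for i in range(10_000)],
)
# Stand-in for the materialized view: one row per business unit.
conn.execute(
    "CREATE TABLE revenue_by_bu AS "
    "SELECT bu_id, SUM(amount) AS revenue FROM sales GROUP BY bu_id"
)

direct = conn.execute(FACT_QUERY).fetchall()            # scans 10,000 rows
routed = conn.execute(rewrite(FACT_QUERY)).fetchall()   # reads 20 rows
print(len(routed), direct == routed)
```

Both queries return the same result, but the routed one reads only the 20 preaggregated rows instead of scanning the whole fact table.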
Modern cloud database is usable from ALL BI tools
Lyftron works as an SQL proxy (a semantic data layer) between BI tools and the modern data warehouse. Lyftron is wire-compatible with Microsoft SQL Server and uses the Transact-SQL dialect. MSSQL drivers are widely supported by BI tools on the market. Business users could even connect directly from Microsoft Excel to the modern data warehouse through Lyftron, without installing any ODBC drivers, because SQL Server drivers are already preinstalled on Windows.
Windows Active Directory authentication for a modern cloud data warehouse is possible from any BI tool
The Lyftron SQL proxy interface is compatible with Microsoft SQL Server and benefits from the heritage of the Windows ecosystem. Lyftron may be installed on a computer that is integrated with the customer’s Active Directory domain. Lyftron will accept connections from ODBC, JDBC and ADO.NET drivers using Active Directory SSO authentication and translate Windows credentials to effective roles for accessing the modern data warehouse. As a result, the customer does not need to manage user logins and passwords to give business users access to the modern data warehouse.
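As a rough sketch of what such a connection could look like from a client, the helper below builds a standard SQL Server ODBC connection string that uses Windows integrated authentication (Trusted_Connection=yes) instead of a login and password. The host name, database name and driver version are placeholders, and whether a given Lyftron installation accepts exactly this string is an assumption rather than something the text above confirms.

```python
def build_connection_string(
    host: str,
    database: str,
    driver: str = "ODBC Driver 17 for SQL Server",  # assumed driver name
) -> str:
    """Build an ODBC connection string that relies on Windows SSO."""
    parts = {
        "Driver": "{" + driver + "}",
        "Server": host,
        "Database": database,
        # Integrated authentication: no user name or password in the string.
        "Trusted_Connection": "yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical host and database names, for illustration only.
conn_str = build_connection_string("lyftron.example.internal", "analytics")
print(conn_str)
```

A client library such as pyodbc could then open the connection with pyodbc.connect(conn_str), with the current Windows user's credentials supplied by the operating system.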