Snowflake CI CD Pipeline using FlyWay and Azure DevOps

Before building the Snowflake CI/CD pipeline, where CI stands for Continuous Integration and CD for Continuous Delivery, let's first understand what these terms mean, why they apply to Snowflake, and what Flyway and Azure DevOps contribute to the pipeline.

Prerequisites

  • Hands-on experience with Git.
  • Active Snowflake account.
  • Active Azure DevOps Services account.

Snowflake

Snowflake is a simple and convenient Cloud Data Warehouse without the storage-size limitations of traditional systems. It is a hassle-free and worthwhile choice for organizations that want substantial cloud capacity at a minimal cost of acquisition. It requires no special assistance for installation or maintenance and no in-house servers: you can compute on data in near real time, and the results are stored automatically in the cloud data warehouse.

Features of Snowflake Cloud Data Warehouse

Snowflake is well-protected and secure: Security features such as Two-Factor Authentication and SSO via Federated Authentication make it safe and trustworthy. Snowflake offers security along with real-time data accessibility, and access can be restricted to whitelisted IP addresses via network policies.

It scales and supports data in real time: Snowflake's multi-cluster architecture lets compute scale independently of storage, so you can process data in real time while storing a virtually endless supply of data efficiently.

It supports semi-structured data: Snowflake supports structured and semi-structured data in the same place through the VARIANT data type. Semi-structured data (such as JSON) is loaded into a VARIANT column in the cloud data warehouse, and its attributes can be extracted and queried directly alongside relational columns.
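As a brief illustration of the VARIANT type (the table name and JSON content below are made up for this example), semi-structured data can be loaded with PARSE_JSON and queried next to relational columns:

```sql
-- Hypothetical table holding raw JSON events in a VARIANT column.
CREATE TABLE events (payload VARIANT);

INSERT INTO events
  SELECT PARSE_JSON('{"user": "alice", "action": "login", "ts": "2023-01-01"}');

-- Attributes are extracted with the colon path syntax and cast as needed.
SELECT payload:user::STRING   AS user_name,
       payload:action::STRING AS action
FROM events;
```

This is what lets structured and semi-structured data live side by side: the JSON keeps its flexible shape in the VARIANT column, while queries project typed columns out of it on demand.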

Does not require maintenance: The Snowflake cloud data warehouse is simplified and user-friendly; it does not require expert guidance or assistance for installation or future maintenance, which also makes it cost-effective.

What is the CI-CD pipeline?

CI (Continuous Integration) and CD (Continuous Delivery) are complementary software practices, with Continuous Delivery acting as the broad umbrella of which Continuous Integration is a component. A Continuous Delivery pipeline extends CI so that code changes are automatically built, tested, and prepared for release to an environment. Continuous Integration, a core DevOps and Site Reliability Engineering practice, is the systematic merging of developers' code changes into a shared branch, with builds and tests run automatically on every change.

What is Azure DevOps?

Azure DevOps is a platform introduced by Microsoft that offers an end-to-end SaaS (Software as a Service) toolchain for developing software more efficiently. It makes developing and deploying easier and faster than traditional approaches. Azure DevOps is an excellent choice and an effective toolchain for blending code development, application building, and project management.

What is Flyway?

Flyway is an open-source database migration tool, licensed under the Apache License 2.0, that gives developers automated, version-based control over database schemas. Migrations can be written as SQL scripts or Java code, and Flyway is effective for detecting and correcting migration errors and applying essential upgrades. In a Snowflake CI/CD pipeline, Flyway is used as the migration tool; it supports both password and key-pair authentication, and it provides commands to migrate data, clean a schema, collect info, validate migrations, undo changes, baseline an existing database, and repair its history for the cloud data warehouse.
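For example, Flyway applies versioned SQL migration files in order, using the naming convention V&lt;version&gt;__&lt;description&gt;.sql (the table below is hypothetical):

```sql
-- V1__create_customers.sql
-- Flyway runs this file once and records it in the
-- flyway_schema_history table it creates in the target schema.
CREATE TABLE customers (
    id         INTEGER       NOT NULL,
    name       VARCHAR(100)  NOT NULL,
    created_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
);
```

A later change would go in a new file such as V2__add_email_column.sql; on each run, Flyway compares the file versions against flyway_schema_history and applies only the pending migrations.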

Supercharge Snowflake ETL and Analysis Using Lyftrondata's Low/No-Code Data Pipeline

Lyftrondata supports 300+ integrations to SaaS platforms and aims to let you Lyft, Shift, and Load any type of data onto Snowflake instantly. It is a low/no-code, automatic ANSI SQL data pipeline. Lyftrondata lets you choose your most valuable data and pulls it from all your connected data sources in just a few clicks. It is easy to set up and running in minutes without any assistance from IT developers.

Let’s look at some of the characteristics of Lyftrondata:

  • Fully Automated: You do not need any professional assistance because Lyftrondata is a completely automated platform.
  • Connectors Support: Lyftrondata supports 300+ integrations to SaaS platforms, including FTP/SFTP, files, databases, BI tools, and native REST API & Webhooks connectors. It supports various destinations including the Google BigQuery, Amazon Redshift, Snowflake, and Firebolt data warehouses; Amazon S3 data lakes; Databricks; and the MySQL, SQL Server, TokuDB, DynamoDB, and PostgreSQL databases, among many others.
  • Secure: Lyftrondata has an effortless architecture that ensures data is handled in a secure, consistent manner with zero data loss.
  • Data Analysis: Analyze massive volumes of real-time data in visualization tools and get instant answers about performance.
  • Live Monitoring: Advanced monitoring gives you a one-stop view to watch all the activities that occur within Data Pipelines.
  • Live Support: Lyftrondata team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Real-Time: Lyftrondata offers real-time data migration. So, your data is always ready for analysis.

A four-step procedure for setting up Snowflake CI-CD using Azure DevOps and Flyway:

Step 1: Create a Demo Project: This is the first step in building the Snowflake CI/CD pipeline, which requires creating a demo project using Azure DevOps:
  • Start by creating the databases and a deploy user with the following script:
-- Create Databases
CREATE DATABASE FLYWAY_DEMO COMMENT = 'Azure DevOps deployment test';
CREATE DATABASE FLYWAY_DEMO_DEV COMMENT = 'Azure DevOps deployment test';
CREATE DATABASE FLYWAY_DEMO_QA COMMENT = 'Azure DevOps deployment test';

-- Create a Deploy User
CREATE USER devopsuser PASSWORD = '<mypassword>' DEFAULT_ROLE = sysadmin;
  • Create an account in Azure DevOps
  • Open your organization and click the blue + New Project button.

  • Now select an appropriate name like Snowflake_Flyway for your project. Keep it self-explanatory and simple at the same time.
  • Select the Visibility option for your project and click on the Create button.

Step 2: Create a Production Environment:
  • Go back to the home page.
  • In the menu on the left, select the Environments option.
  • Give the environment a name and click Create.
  • To require approval for the Production environment, click the three vertical dots next to the Add Resource button.
  • Click the Approvals and Checks option to add a list of approvers.
Step 3: Create a Library Variable Group:
  • Under Pipelines, select the Library option.
  • On the Library page, go to the Variable Groups tab.
  • To create a new variable group, click the + Variable Group button.
  • Give the group a unique name for identification and later reference, and add the following variables (fill in the <placeholders> with your own account values):
SNOWFLAKE_JDBC_URL=jdbc:snowflake://<account>.snowflakecomputing.com
SNOWFLAKE_ACCOUNT_NAME=<account>.snowflakecomputing.com
SNOWFLAKE_WAREHOUSE=<warehouse>
SNOWFLAKE_ROLENAME=sysadmin
SNOWFLAKE_DEVOPS_USERNAME=<username>
# mark as a secret variable type
SNOWFLAKE_DEVOPS_SECRET=<password>
SNOWFLAKE_AUTHENTICATOR=snowflake
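One caveat worth noting: in Snowflake, setting DEFAULT_ROLE on a user in the CREATE USER statement does not by itself grant that role; the role must also be granted explicitly, e.g.:

```sql
-- Grant the role to the deploy user; without this grant,
-- the default role cannot actually be assumed at login.
GRANT ROLE SYSADMIN TO USER devopsuser;
```

Skipping this grant is a common cause of "role not granted" errors when the pipeline first connects.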
  • Click save to finalize.
Step 4: Create and Run the Snowflake CI/CD Pipeline:
  • Go to the Pipelines option.
  • If this is your first pipeline, click Create Pipeline; if you have created pipelines before, click the New Pipeline button.
  • On the Connect tab, click the Azure Repos Git button, then on the next screen select the preferred repository, Snowflake_Flyway.
  • Continue by selecting the Starter Pipeline option on the Configure your Pipeline page.
  • Finally, before saving and running, paste the following code into the editor on the Review your pipeline YAML page:
variables:
- group: Snowflake.Database
- name: DBNAME
  value: flyway_demo
- name: flywayartifactName
  value: DatabaseArtifacts  
- name: flywayVmImage
  value: 'ubuntu-16.04'  
- name: flywayContainerImage
  value: 'kulmam92/flyway-azure:6.2.3'  
trigger:
- master

stages:
- stage: Build
  variables:
  - name: DBNAME_POSTFIX
    value: _DEV
  jobs:
  - template: templates/snowflakeFlywayBuild.yml
    parameters:
      jobName: 'BuildDatabase'
      databaseName: $(DBNAME)
      databasePostfix: $(DBNAME_POSTFIX)
      artifactName: $(flywayartifactName)
      vmImage: $(flywayVmImage)
      containerImage: $(flywayContainerImage)

- stage: DEV
  variables:
  - name: DBNAME_POSTFIX
    value: _DEV
  jobs:
  - template: templates/snowflakeFlywayDeploy.yml
    parameters:
      jobName: DEV
      databaseName: $(DBNAME)
      databasePostfix: $(DBNAME_POSTFIX)
      artifactName: $(flywayartifactName)
      vmImage: $(flywayVmImage)
      containerImage: $(flywayContainerImage)
      environmentName: DEV

- stage: QA
  variables:
  - name: DBNAME_POSTFIX
    value: _QA
  jobs:
  - template: templates/snowflakeFlywayDeploy.yml
    parameters:
      jobName: QA
      databaseName: $(DBNAME)
      databasePostfix: $(DBNAME_POSTFIX)
      artifactName: $(flywayartifactName)
      vmImage: $(flywayVmImage)
      containerImage: $(flywayContainerImage)
      environmentName: QA

- stage: PROD
  variables:
  - name: DBNAME_POSTFIX
    value: '' # Empty string for PROD
  jobs:
  - template: templates/snowflakeFlywayDeploy.yml
    parameters:
      jobName: PROD
      databaseName: $(DBNAME)
      databasePostfix: $(DBNAME_POSTFIX)
      artifactName: $(flywayartifactName)
      vmImage: $(flywayVmImage)
      containerImage: $(flywayContainerImage)
      environmentName: PROD
  • Congratulations, you have finished adding the code to the editor. Finish by clicking the Save and Run button.

Follow every step thoroughly and you will be able to build the Snowflake CI/CD pipeline using Azure DevOps and Flyway from scratch.

Conclusion

This article explains a simplified four-step process for setting up a Snowflake CI/CD pipeline using Azure DevOps and Flyway, enabling hassle-free and sustainable management of database changes. Read the article above for a detailed understanding and guidelines. Our team at Lyftrondata is always there to provide you with the best guidance and the most valuable data-related information.

Managing large volumes of data can be challenging at times, especially when an organization is still growing or the business is already huge. A massive amount of structured and unstructured data piles up and becomes difficult to store and analyze. However, you no longer need to worry, because storing and analyzing your data can be handled by a cloud-based ELT tool for Snowflake such as Lyftrondata.

Lyftrondata is a low/no-code, automatic ANSI SQL data pipeline. It offers seamless data integration and enables you to store and manage your data in Snowflake within a few clicks and a couple of minutes. It has 300+ integrations (including Snowflake, Google BigQuery, Amazon Redshift, etc.). Lyftrondata offers an entirely automated interface that does not require professional assistance, which makes it affordable for growing businesses.

Sign Up Today with Lyftrondata and experience the new ways of handling your database. Do let us know your experience by commenting down below.

FAQs

What is Git?

Git is a version control system: locally running DevOps software used for tracking changes in source code and helping developers coordinate. In short, it is a code management system.

What is Dual-Factor Authentication?

Dual-Factor Authentication is a form of Multi-Factor Authentication that secures a website or application by granting access only after the user passes two authentication checks. It is a popular and effective type of digital security.

What is Federated Authentication?

Federated Authentication is a security measure in which a user's identity is verified by an external, trusted identity provider rather than by the application itself.

What is SSO?

SSO means Single Sign-On, an authentication scheme that lets a user log in to multiple independent software applications or systems with a single set of ID credentials.

What is the difference between structured, semi-structured, and unstructured data?

Structured data is organized by means of relational databases. Semi-structured data is partially organized by means of formats such as XML or RDF, while unstructured data consists of plain character and binary data with no predefined model.

What is the VARIANT data type?

VARIANT is a tagged universal type that can hold up to 16 MB of any data type supported by Snowflake. Variants are stored as columns in relational tables. An ARRAY is a list-like, indexed data type that consists of variant values.

What are Snowflake's numeric data types?

Snowflake's fixed-point numeric data types include DECIMAL, NUMERIC, INT, INTEGER, BIGINT, SMALLINT, TINYINT, and BYTEINT.

What is SRE?

Site Reliability Engineering (SRE) is what you get when you treat operations as a software problem; its mission is to protect, provide for, and progress software and systems.

What is SaaS?

Software as a Service (SaaS) is a way of delivering applications over the Internet as a service. Instead of installing and maintaining software, you simply access it via the Internet, freeing yourself from complex software and hardware management.

What does open source mean?

Open source means code that is designed to be publicly accessible: anyone can see, modify, and distribute the code as they see fit.

What is the Apache License 2.0?

The Apache License 2.0 ensures that users do not have to worry about infringing any patents by using the software: the user is granted a license to any patent that covers the software.

What is Azure Repos?

Azure Repos is a set of version control tools that you can use to manage your code. If you are entirely new to version control, note that version control enables you to track the changes you make to your code over time.

What is Java?

Java is an object-oriented programming language that produces software for multiple platforms. When a programmer writes a Java application, the compiled code (known as bytecode) runs on most operating systems, including Windows, Linux, and macOS.

What is a SQL script?

A SQL script is a set of SQL commands saved as a file. SQL scripts can be used to create, edit, view, run, and delete database objects.

What is a low-code/no-code platform?

Low-code/no-code refers to visual software development environments that allow enterprise developers and citizen developers to drag and drop application components, connect them together, and create mobile or web apps.

What is an ANSI SQL data pipeline?

An ANSI SQL data pipeline allows organizations to extract data at its source, transform it, integrate it with other sources, and fuel business applications and data analytics.

What is YAML?

YAML is a data serialization language that is often used for writing configuration files.

What is a CI/CD pipeline?

A CI/CD (Continuous Integration and Continuous Delivery) pipeline helps build code, test it, and deploy it as a single ongoing process, ensuring that the main branch's code is always releasable to Snowflake.

How do you set up a Snowflake CI/CD pipeline using Azure DevOps and Flyway?

To set up a Snowflake CI/CD pipeline using Azure DevOps and Flyway, follow this 4-step process:

Step 1: Create a Demo Project
Step 2: Set up a Production Environment
Step 3: Create a Library Variable Group
Step 4: Create and Run the Snowflake CI/CD Pipeline

For detailed guidance, read the blog above.

What is the Snowflake Cloud Data Warehouse?

Snowflake Cloud Data Warehouse is built on a multi-cluster shared data architecture that enables developers to build limitless data-intensive applications with high concurrency and performance. It delivers fast response times regardless of the size of the database, and it scales both vertically and horizontally as required.

How do you create a CI/CD pipeline for Snowflake using Jenkins and Sqitch?

  1. Add a webhook: if you use Jenkins and want an automatic trigger on push, add a webhook for the Jenkins server to the GitHub repository.
  2. Create a repository for the Sqitch SQL scripts and add that repository to your Jenkins installation.

How do you build a Snowflake CI/CD pipeline using schemachange and GitHub Actions?

  • Step 1: Set up Database Change Management
  • Step 2: Select the required tools
  • Step 3: Create a new workspace
  • Step 4: Set up the code in GitHub Actions
  • Step 5: Add schemachange
  • Step 6: Version the script names
  • Step 7: Create a migration

How do you set up a Snowflake CI/CD pipeline using Jenkins and schemachange?

  • Step 1: Start with the setup
  • Step 2: Set up the workload
  • Step 3: Make the initial commit
  • Step 4: Run schemachange
