Snowflake
Module 1: Introduction
- What is ETL?
- ETL Vs ELT.
- Data warehouse basics
- Star Model
- Snowflake Model
- SCD Types
Module 2: Snowflake Setup
- Introduction to Snowflake
- Snowflake Architecture
- Overview of Snowflake special features.
- Trial account setup.
- User interface walkthrough.
Module3: Snowflake Administration
- Object Hierarchy (Database, Schema, Warehouse creation)
- Role Hierarchy (Accountadmin, sysadmin, securityadmin)
- Role-based access control (RBAC)
- Custom role creation
- Object Privileges
- Bulk grants
- Account-level Privileges.
- Parent–child role hierarchy.
- Database roles.
- User-based access control (UBAC)
Module 4: Data Ingestion in Snowflake
- Bulk loading using S3 location.
- Bulk loading using External stage.
- COPY INTO options
- Continuous loading using snowpipe.
- Setting up failure notification for snowpipe.
- Monitoring cost of Snowpipe.
- Loading data from table stage
- Loading data from user stage
- Loading data from internal named stage.
- Exporting the data from snowflake to S3
- Exporting data from snowflake to table stage, user stage and internal named stage.
- Data export options.
Module 5: Data Transformation in snowflake
- Streams in Snowflake
- Implementing SCD type 1 using streams
- Implementing SCD type 2 using streams
- Tasks in snowflake
- Serverless task in Snowflake
- Monitoring Tasks
- Setting up failure notification for tasks.
- JavaScript stored procedures
- SQL based stored procedures
- Javascript User defined functions.
- SQL User-defined functions.
- Dynamic tables in snowflake.
- Generating the ddl of object and upload to s3 (real-time usecase)
- Dynamic generation of merge statement (real-time usecase)
- Dynamic generation of copy command (real-time usecase)
Module 6: Snowflake special features.
- Introduction of Time Travel
- Data Life Cycle and Time Travel operations
- Time Travel SQL Extensions, Parameters, Offset
- Timestamp, Statement, UNDROP
- Fail-Safe Storage
- Transient tables.
- Retrieving the Historical Data
- Introduction of Cloning
- Cloning of Tables
- Cloning Database and Schemas
- Cloning considerations
- Querying the Space for Table, Time Travel and Fail Safe
- Introduction to Data Sharing, Reader Account
- Inbound and Outbound Shares
- Data Sharing with External snowflake Accounts
- Data Marketplace (Inbound, Outbound)
- Secure Views
- Introduction of Materialized Views
- Refreshing the Materialized views
- Create and Query the Data from the External Tables
- Querying Parquet Data.
- Querying JSON Data.
- Querying XML Data.
- Data masking in snowflake.
- Metadata tables in snowflake.
- Virtual Warehouses in snowflake.
- Scale-up Vs Scale-out.
- Caching in snowflake.
- Result cache
- VW cache
- Metadata Cache
- Micro-partitions in snowflake
- Cluster keys in Snowflake.
- Resource Monitors in snowflake.
- Hybrid tables.
Data Build Tool
Module 7:
- DBT Introduction
- DBT Usage
- Setup DBT core
- Set up DBT cloud
- profiles.yml file
- dbt_projects.yml file
Module 8:
- Models Overview
- View Materialization
- Table Materialization
- Incremental Materializations
- Ephemeral Materializations
- Table Materialization Vs Project Materialization
- Pros and Cons of Materialization
- Pre-hooks & post-hooks
- Configurations
- Alias
- Tag
- query_tag
- schema
Module 9:
- Jinja templates overview
- Jinja templates use cases
- Ref jinja function
- Source jinja function
Module 10:
- Predefined Macros
- Custom schema definition macro.
- Macros custom use cases
- Audit macro use case
- Duplicate records removal use case
Module 11:
- Snapshot’s introduction
- Snapshots with timestamp strategy
- Snapshots with check columns strategy
- Seeds
- Best Practices for Seeds
- Sources
- Analyses
- Exposures
Module 12:
- Testing Overview
- Generic Testing
- Custom Generic Testing
- Singular Testing
- Models.yml file
- Sources.yml file
Module 13:
- Deployment
- Debugging
- Best Practices
Python
Module 14: Python integration with snowflake.
- Python set-up
- Data structures in Python.
- Loops and branch control statements.
- Exception handling.
- Boto3 module.
- Snowflake connector module.
- Psycopg2 module
- Snowpipe failure notification (real-time use case)
- Task failure notification (real-time use case)
- Data validation between postgres and snowflake (real-time use case)
AWS
Module 15: AWS integration with snowflake.
- What are AWS
- Trial account set-up
- Identity and Access Management (IAM)
- Storage services: S3
- Athena
- Compute service: EC2
- Relational Database Service (RDS)
- Simple Notification Service (SNS)
- Data Migration Service (DMS).
- Lambda
AIRFLOW
Module 16: Airflow integration with snowflake.
- What is Airflow
- What is Directed Acyclic Graph
- WebServer in Airflow
- Scheduler in Airflow
- Scheduling a sample job in Airflow.
- Python Operator
- Bash Operator
- Monitoring and logging the Jobs
- Graph view of the DAG.
Projects
Module 17: Projects
- Building data pipeline from postgres using snowflake, dbt and Airflow
- Building data pipeline from API using snowflake, dbt and Airflow
- interview preparation
- Resume preparation
- SnowPro Core certification preparation.
