This project simulates a real-world enterprise data migration and modernization strategy. It extracts transactional data from a simulated "on-premise" environment (hosted on AWS EC2), performs heavy distributed processing on a Hadoop/Spark cluster, and then serves the data through a cloud-native, serverless architecture to reduce costs.
A data warehousing project for retail sales that applies dimensional modeling best practices with SCD Type 2 on AWS Redshift. It uses AWS Lambda, Glue Workflows, and Python Shell jobs to create and automate an ELT pipeline in which batch data arriving in S3 is loaded into Redshift and transformed to meet requirements.
End-to-end Azure Databricks retail data engineering project using Medallion Architecture (Bronze, Silver, Gold). Implements Auto Loader, Unity Catalog, Delta Lake, SCD Type 1 & 2 dimensions, and Fact Orders for analytics-ready star schema modeling.
In this project we'll create a real-time healthcare patient data pipeline using various services and tools such as Azure Event Hubs, Azure Databricks, Delta Lake, and Synapse Analytics. We'll also implement medallion architecture and schema evolution, create fact and dimension tables, and connect the cleaned and transformed data to Power BI.
End-to-end data engineering project using AWS S3, Snowflake, and dbt to implement Medallion Architecture with SCD Type 1 & Type 2 logic on Walmart sales data, followed by analytical visualizations using Seaborn and Plotly.
Design and implement a full ELT data pipeline using Snowflake and S3, featuring star schema modelling, SCD Type 1 & 2 handling, and incremental load automation.
Production-style Slowly Changing Dimension (SCD Type 2) pipeline built with Snowflake, dbt, and AWS S3. Demonstrates secure S3 ingestion, layered bronze/silver/gold modeling, dbt snapshots for historical tracking, and analytics-ready views identifying active vs historical records.
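Several of the projects above rely on SCD Type 2, where a changed attribute closes the current dimension row and opens a new one, so that both active and historical records remain queryable. The following is a minimal, library-free sketch of that idea in Python; it is not taken from any of the listed repositories. The function name `apply_scd2`, the column names `valid_from`, `valid_to`, and `is_current`, and the sample customer data are all assumptions chosen for illustration.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, as_of):
    """Return a new list of dimension rows after applying SCD Type 2 logic.

    dimension: existing rows (dicts) with valid_from / valid_to / is_current
    incoming:  latest source records (dicts) holding the key and tracked columns
    key:       name of the business key column (hypothetical, e.g. "customer_id")
    tracked:   columns whose changes should create a new version
    as_of:     effective date for closing old rows and opening new ones
    """
    # Copy rows so the caller's dimension is left untouched.
    result = [dict(row) for row in dimension]
    current = {row[key]: row for row in result if row["is_current"]}

    for record in incoming:
        existing = current.get(record[key])
        if existing is not None and all(existing[c] == record[c] for c in tracked):
            continue  # nothing changed, keep the current version as-is

        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = as_of
            existing["is_current"] = False

        # Open a new current version (also covers brand-new keys).
        new_row = {key: record[key]}
        new_row.update({c: record[c] for c in tracked})
        new_row.update({"valid_from": as_of, "valid_to": None, "is_current": True})
        result.append(new_row)
        current[record[key]] = new_row

    return result


# Illustrative data only.
dim = [
    {"customer_id": 1, "city": "Austin", "valid_from": date(2024, 1, 1),
     "valid_to": None, "is_current": True},
]
updates = [
    {"customer_id": 1, "city": "Denver"},  # changed attribute
    {"customer_id": 2, "city": "Boston"},  # new business key
]

rows = apply_scd2(dim, updates, key="customer_id", tracked=["city"],
                  as_of=date(2024, 6, 1))
active = [r for r in rows if r["is_current"]]
historical = [r for r in rows if not r["is_current"]]
```

In a warehouse, the same pattern is usually expressed as a merge statement or handled by a tool such as dbt snapshots; the in-memory version here is only meant to show the row-versioning rule.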