All Projects

A complete archive of data engineering work — pipelines, batch systems, cloud architectures, and analytics platforms.

8 Projects

Retail Payment Intelligence Pipeline

Production-grade retail analytics pipeline using synthetic e-commerce data and a full Medallion architecture — from raw ingestion to Looker Studio dashboards.

  • dlt
  • dbt
  • BigQuery
  • Airflow
  • Looker Studio
View on GitHub →

E-Commerce Sales Analytics Pipeline

Fully automated pipeline extracting Kaggle sales data via dlt, loading into BigQuery, and transforming to a business-ready Gold layer with dbt — orchestrated by Airflow.

  • dlt
  • dbt
  • BigQuery
  • Airflow
  • Python
View on GitHub →

AWS Spark Batch

Automated batch pipeline that ingests S&P 500 ETF data from yfinance into AWS S3, organizes it in Medallion layers, transforms it with Apache Spark, and visualizes it in Superset.

  • Apache Spark
  • Airflow
  • AWS S3
  • PostgreSQL
  • Apache Superset
View on GitHub →

ETF Batch Pipeline — Apache Spark, Airflow & Superset

Daily batch pipeline that ingests sector ETF data, computes financial KPIs (moving averages, returns) with Spark, stores the results on S3 in Medallion layers, and visualizes them live in Superset.

  • Apache Spark
  • Airflow
  • AWS S3
  • Apache Superset
  • yfinance
View on GitHub →
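
For readers unfamiliar with the KPIs named above, here is a minimal sketch of a daily simple return and a trailing moving average over closing prices. It is plain Python for illustration only, not the project's Spark code; the function names, the window length, and the sample prices are all assumptions.

```python
from typing import List, Optional


def simple_returns(closes: List[float]) -> List[Optional[float]]:
    """Day-over-day simple return: close_t / close_(t-1) - 1.

    The first day has no previous close, so its return is None.
    """
    if not closes:
        return []
    out: List[Optional[float]] = [None]
    for prev, cur in zip(closes, closes[1:]):
        out.append(cur / prev - 1.0)
    return out


def moving_average(closes: List[float], window: int) -> List[Optional[float]]:
    """Trailing moving average over `window` days.

    Positions with fewer than `window` observations are None.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(closes):
        running += price
        if i >= window:
            # Drop the price that just left the trailing window.
            running -= closes[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


if __name__ == "__main__":
    # Hypothetical closing prices, not real market data.
    sample = [100.0, 110.0, 99.0, 108.9]
    print(simple_returns(sample))
    print(moving_average(sample, window=2))
```

In a Spark job the same quantities would typically be expressed with window functions partitioned by ticker and ordered by date; the sketch only pins down the arithmetic.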

Big Data Pipeline — Startup Funding Analytics

Big data ETL pipeline over Kaggle startup funding datasets, loaded into PostgreSQL and surfaced via an interactive Streamlit dashboard — fully containerized with Docker Compose.

  • Python
  • PostgreSQL
  • Streamlit
  • Docker
  • Kaggle
View on GitHub →

Live CRM Pipeline

Real-time CRM data pipeline integrating the Affinity API to ingest deals and interactions, compute relationship warmth scores, and surface actionable CRM insights.

  • Python
  • Affinity API
  • PostgreSQL
View on GitHub →

YouTube Content Intelligence Pipeline

Data pipeline that ingests YouTube channel and video metadata, processes content signals, and surfaces intelligence for content strategy and audience growth analysis.

  • Python
  • YouTube Data API
  • Pandas
View on GitHub →

GCP Weather Data Pipeline

Cloud-native weather data pipeline on Google Cloud Platform, streaming live meteorological data through automated ingestion and transformation workflows.

  • Python
  • GCP
  • Cloud Storage
  • BigQuery
View on GitHub →