AWS & GCP Certified Data Engineer

Hi, I'm Mesum.

With 2+ years of professional experience, I design and build production-grade data pipelines, data models, and transformation workflows. I deliver clean, analytics-ready data through warehouses, dashboards, and APIs that power real business decisions.

Services & Expertise

End-to-end data engineering: from raw ingestion to dashboards, APIs, and ML-ready data products.

ETL & ELT Pipelines

Production-grade ingestion pipelines with data validation, incremental loads, backfills, Reverse ETL, and schema evolution. From raw sources to warehouse-ready tables, reliably. A minimal incremental-load sketch follows the tags.

  • Python
  • dlt
  • Airflow
  • Kafka
  • Reverse ETL
  • CDC
  • Incremental Loads
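
A flavor of what this looks like in practice: a minimal sketch of the incremental-load pattern with dlt. The endpoint, cursor field, and dataset names below are hypothetical placeholders, not a real source.

```python
# Minimal dlt incremental-load sketch. The endpoint, field names, and
# dataset are hypothetical placeholders.
import dlt
from dlt.sources.helpers import requests


@dlt.resource(primary_key="order_id", write_disposition="merge")
def orders(
    updated_at=dlt.sources.incremental("updated_at", initial_value="2024-01-01T00:00:00Z")
):
    # Fetch only rows newer than the last stored cursor value.
    response = requests.get(
        "https://api.example.com/orders",
        params={"updated_since": updated_at.last_value},
    )
    response.raise_for_status()
    yield response.json()


pipeline = dlt.pipeline(
    pipeline_name="orders_ingest",
    destination="bigquery",
    dataset_name="raw",
)
print(pipeline.run(orders))
```

Because dlt persists the cursor in pipeline state and merges on the primary key, reruns pick up where the last load stopped and stay idempotent.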

Data Modeling & Transformation

Dimensional models, star schemas, and Medallion-layered transformations (Bronze to Silver to Gold), converting raw data into query-optimized, analytics-ready assets with full lineage tracking. A layered-run sketch follows the tags.

  • dbt
  • Ad-Hoc SQL
  • Dimensional Modeling
  • Star Schema
  • Data Quality
  • Data Lineage
  • Data Governance
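
Driving those layered runs from Python can look roughly like this: a minimal sketch using dbt's documented programmatic interface (dbt-core 1.5+). The medallion tags and the selector are hypothetical project conventions, not from a specific repo.

```python
# Programmatic dbt invocation (dbt-core >= 1.5). Assumes a dbt project in
# the working directory; the medallion tags are a hypothetical convention.
from dbt.cli.main import dbtRunner, dbtRunnerResult

dbt = dbtRunner()

# Run the silver layer and everything downstream of it.
res: dbtRunnerResult = dbt.invoke(["run", "--select", "tag:silver+"])

# Inspect per-model outcomes.
for r in res.result:
    print(f"{r.node.name}: {r.status}")
```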

Cloud Data Warehousing & Dashboards

Scalable cloud warehouses with partitioning, clustering, and row-level security. Connected to live dashboards and self-serve BI layers built for non-technical stakeholders. A partitioning sketch follows the tags.

  • BigQuery
  • Snowflake
  • Redshift
  • Looker Studio
  • Apache Superset
  • Dashboards
  • Query Optimization
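
As an illustration, here is a minimal sketch of creating a day-partitioned, clustered table with the official google-cloud-bigquery client; the project, dataset, and column names are placeholders.

```python
# Partitioned + clustered BigQuery table, minimal sketch.
# "my-project.analytics.fct_orders" and all fields are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

schema = [
    bigquery.SchemaField("order_id", "STRING"),
    bigquery.SchemaField("customer_id", "STRING"),
    bigquery.SchemaField("channel", "STRING"),
    bigquery.SchemaField("order_date", "DATE"),
    bigquery.SchemaField("amount", "NUMERIC"),
]

table = bigquery.Table("my-project.analytics.fct_orders", schema=schema)
# Partition by day so queries scan only the dates they actually touch.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="order_date",
)
# Cluster within each partition to cut scanned bytes on common filters.
table.clustering_fields = ["customer_id", "channel"]

client.create_table(table)
```

Partitioning bounds what a query scans; clustering sorts rows within each partition so filters on customer_id or channel prune even further.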

Automation, APIs & ML Integration

REST and webhook APIs, autonomous agents, and ML model integration layers that operationalize machine learning outputs into downstream data pipelines and business systems. A webhook sketch follows the tags.

  • REST APIs
  • FastAPI
  • ML Models
  • LangChain
  • OpenAI APIs
  • Docker
  • Webhooks
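
A minimal webhook receiver sketched with FastAPI; the event schema and route are hypothetical, and a production version would verify the sender's signature before trusting the payload.

```python
# Hypothetical webhook receiver. In production, verify a signature header
# and hand the event to a queue, warehouse load, or model-scoring step.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class PaymentEvent(BaseModel):
    event_id: str
    customer_id: str
    amount: float
    status: str


@app.post("/webhooks/payments")
async def receive_payment_event(event: PaymentEvent) -> dict:
    # Pydantic has already validated the payload shape at this point.
    return {"received": event.event_id}
```

Served locally with `uvicorn main:app --reload` (assuming the file is named main.py), it accepts a JSON POST and rejects malformed payloads with a 422.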

Skills & Tools

The full stack I use to build production data systems, from ingestion to dashboards.

Languages

Python, SQL, JavaScript, TypeScript, Java, C

Data Engineering

ETL / ELT, Reverse ETL, Data Modeling, Dimensional Modeling, Transformation, dbt, dlt, Airflow, Kafka, Apache Spark, CDC, Orchestration

Warehouses & Analytics

BigQuery, Snowflake, Redshift, PostgreSQL, Star Schema, Data Lineage, Data Quality, Query Optimization, Partitioning, Clustering

Dashboards & APIs

Looker Studio, Apache Superset, Streamlit, REST APIs, FastAPI, Webhooks, ML Models, LangChain, OpenAI

Cloud & Governance

GCP, AWS S3, Docker, Data Governance, Data Observability, Data Catalog, CI/CD, Row-level Security

Featured Work

Real-world projects spanning end-to-end pipelines, data modeling, transformation layers, dashboards, and cloud architectures.

Retail Payment Intelligence Pipeline

Production-grade retail analytics pipeline using synthetic e-commerce data and a full Medallion architecture — from raw ingestion to Looker Studio dashboards.

  • dlt
  • dbt
  • BigQuery
  • Airflow
  • Looker Studio
View on GitHub →

E-Commerce Sales Analytics Pipeline

Fully automated pipeline extracting Kaggle sales data via dlt, loading it into BigQuery, and transforming it to a business-ready Gold layer with dbt, orchestrated end to end by Airflow.

  • dlt
  • dbt
  • BigQuery
  • Airflow
  • Python
View on GitHub →

AWS Spark Batch

Automated batch data pipeline ingesting S&P 500 ETF data from yfinance into AWS S3 using a Medallion architecture, transformed with Apache Spark and visualized in Superset.

  • Apache Spark
  • Airflow
  • AWS S3
  • PostgreSQL
  • Superset
View on GitHub →

ETF Batch Pipeline — Apache Spark, Airflow & Superset

Daily batch pipeline ingesting sector ETF data, computing financial KPIs (moving averages, returns) with Spark, stored on S3 in Medallion layers and visualized live in Superset. A sketch of the KPI step follows.

  • Apache Spark
  • Airflow
  • AWS S3
  • Apache Superset
  • yfinance
View on GitHub →
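
Not lifted from the repo, but a minimal sketch of how the KPI step can work: daily returns and a trailing 20-day moving average with Spark window functions. The S3 paths and column names are placeholders.

```python
# Hypothetical sketch: daily returns and a 20-day moving average with
# Spark window functions. S3 paths and column names are placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("etf_kpis").getOrCreate()

prices = spark.read.parquet("s3a://example-bucket/silver/etf_prices/")

by_ticker = Window.partitionBy("ticker").orderBy("trade_date")
last_20 = by_ticker.rowsBetween(-19, 0)  # current row plus 19 preceding

kpis = (
    prices
    # Daily return: today's close vs. the prior trading day's close.
    .withColumn("daily_return", F.col("close") / F.lag("close").over(by_ticker) - 1)
    # Trailing 20-row moving average of the close within each ticker.
    .withColumn("ma_20", F.avg("close").over(last_20))
)

kpis.write.mode("overwrite").parquet("s3a://example-bucket/gold/etf_kpis/")
```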

Ready to build a reliable data foundation?

Let's discuss how a customized data architecture can accelerate your business.