Toronto, Canada · Open to opportunities

Bipina Poudel

Senior Data Engineer · CAPM®

Senior Data Engineer with 7+ years of experience designing and scaling cloud-based data platforms across marketing, healthcare, and financial services. Specialized in ETL/ELT pipelines, real-time data processing, data warehousing, and data modeling on AWS and Azure. Focused on data quality, governance, and cost-optimized solutions that power analytics, reporting, and machine learning.

7+ Years Experience

3 Industries

2 Cloud Platforms

3 Certifications

Experience

Where I've worked and what I've built

Senior Data Engineer

Skill Squirrel · Toronto, Canada

Feb 2024 – Present

Engineered scalable ETL workflows to process large volumes of marketing and operational data, supporting daily campaign execution in enterprise data warehouse and ODS environments.

▸ Automated ingestion of structured and semi-structured data (JSON, flat files, APIs, FTP/SFTP) to eliminate manual file management
▸ Built SQL-based transformation workflows to produce campaign-ready audience and segmentation data
▸ Optimized warehouse queries using indexing, partitioning, and dimensional modeling to improve reporting responsiveness
▸ Developed reusable, curated tables and views to standardize business logic for marketing analytics teams
▸ Built structured logging and reconciliation processes to eliminate recurring production pipeline failures
▸ Maintained Hive and relational warehouse tables to support daily campaign execution and operational reporting

Environment: AWS, S3, Glue, Spark, Kafka, Snowflake/Redshift, Python, SQL, Airflow, Jenkins

Data Engineer

Cedar Health Group · Greenwich, CT

Jun 2021 – Jan 2024

Designed and built scalable ETL/ELT pipelines in Databricks using PySpark, Spark SQL, and Delta Lake to process high-volume healthcare datasets in a HIPAA-compliant environment.

▸ Implemented a medallion architecture (bronze, silver, and gold layers) for claims, pricing, and membership data
▸ Applied Delta Lake features including ACID transactions, schema evolution, CDC, and optimized file compaction
▸ Enforced PHI-compliant data architecture using Unity Catalog, access controls, and governance guardrails
▸ Optimized Spark jobs with partition strategies, broadcast joins, caching, and Z-order indexing
▸ Developed slowly changing dimensions to preserve the history of claims and member records
▸ Supported cost-efficient lakehouse operations through cluster right-sizing and workload management

Environment: Azure Data Factory, ADLS Gen2, Databricks, Spark, PySpark, SQL, Snowflake, Airflow, Power BI
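
To illustrate the slowly changing dimension idea mentioned above, here is a minimal Type 2 sketch in plain Python. It is an assumption-laden illustration, not the production Delta Lake implementation: the MemberVersion fields and the apply_scd2 helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class MemberVersion:
    """One version of a member record (hypothetical schema for illustration)."""
    member_id: str
    plan: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: List[MemberVersion], member_id: str,
               new_plan: str, effective: date) -> List[MemberVersion]:
    """Return a new history with the change applied as an SCD Type 2 update."""
    # Find the open (current) version for this member, if any.
    current = next(
        (r for r in history if r.member_id == member_id and r.valid_to is None),
        None,
    )
    # Nothing changed: keep the history as it is.
    if current is not None and current.plan == new_plan:
        return list(history)

    updated = []
    for row in history:
        if row is current:
            # Close the old version instead of overwriting it.
            updated.append(replace(row, valid_to=effective))
        else:
            updated.append(row)
    # Open a new current version starting at the effective date.
    updated.append(MemberVersion(member_id, new_plan, effective))
    return updated


history = [MemberVersion("M1", "basic", date(2022, 1, 1))]
history = apply_scd2(history, "M1", "premium", date(2023, 1, 1))
for version in history:
    print(version)
```

The old row is closed rather than updated in place, so earlier claims can still be joined to the plan that was active at the time.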

Data Engineer

Queue · San Francisco, CA

Oct 2018 – May 2021

Implemented ETL processes and dimensional data models for financial reporting, investment analysis, and regulatory compliance in the fintech domain.

▸ Developed dimensional data models (star and snowflake schemas) for financial reporting and investment analysis
▸ Built and optimized data pipelines on Snowflake and AWS Redshift
▸ Authored complex SQL for transformations, views, and stored procedures used in regulatory and risk reporting
▸ Supported migration of on-premise data pipelines to AWS S3 and Redshift
▸ Implemented role-based access controls and encryption standards to secure data and support SOX compliance
▸ Conducted data reconciliation and validation to ensure the accuracy of financial data

Environment: Informatica, AWS S3, Redshift, Oracle, SQL Server, SQL, Python, Tableau

Skills & Technologies

Tools I work with day to day

Programming & Scripting

Python · PySpark · SQL · Scala · Shell Scripting

Big Data & Processing

Apache Spark · Spark SQL · Hadoop · Hive · Kafka · Airflow

Cloud Platforms

AWS (S3, Glue, Redshift, Lambda, EMR, Athena, IAM) · Azure (ADF, ADLS Gen2, Synapse Analytics, Databricks)

Databases & Warehousing

Snowflake · Redshift · Azure Synapse · PostgreSQL · MySQL · Oracle · SQL Server

ETL / Data Integration

AWS Glue · Azure Data Factory · Informatica · Databricks · Talend

Streaming & Messaging

Apache Kafka · AWS Kinesis

DevOps & CI/CD

Git · Jenkins · Azure DevOps · Terraform · Docker

Data Modeling & Governance

Dimensional Modeling · Star/Snowflake Schema · Data Quality · Data Lineage · HIPAA · SOX

Visualization & Analytics

Power BI · Tableau · Looker

Projects

Things I've built and problems I've solved

Featured

Healthcare Data Lakehouse (Medallion Architecture)

Built a PHI-compliant medallion lakehouse (bronze/silver/gold) on Azure Databricks for Cedar Health Group, processing high-volume claim, pricing, and membership datasets with Delta Lake ACID transactions.

Databricks · Delta Lake · PySpark · Azure · Unity Catalog

Featured

Marketing Data Pipeline Platform

Engineered scalable ETL workflows at Skill Squirrel for processing marketing and campaign operational data. Automated ingestion of JSON, flat files, and API sources into Snowflake/Redshift.

AWS Glue · Kafka · Snowflake · Airflow · Python

Financial Data Warehouse on AWS

Migrated on-premise financial data pipelines to AWS S3 and Redshift at Queue. Built star/snowflake schema models for regulatory reporting and investment analysis with SOX compliance.

AWS Redshift · S3 · Informatica · SQL · Tableau

Real-time Campaign Audience Segmentation

Developed a Kafka-based streaming pipeline for real-time audience segmentation and targeting data, enabling same-day campaign execution across enterprise marketing systems.

Kafka · Spark Streaming · Python · Redshift

Spark Performance Optimization Framework

Systematically optimized PySpark jobs on Azure Databricks using partition strategies, broadcast joins, caching, and Z-order indexing, reducing runtimes for the heaviest jobs.

PySpark · Spark SQL · Databricks · Delta Lake

Data Quality & Reconciliation System

Built structured logging and reconciliation pipelines to track data accuracy end-to-end, eliminating recurring production incidents and improving SLA adherence for downstream teams.

Python · SQL · Jenkins · Airflow
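
As a rough illustration of the reconciliation idea described above, the sketch below compares the keys loaded into a target against the keys present in the source and logs one structured line per check. The names ReconResult and reconcile, the logger name, and the log format are assumptions made for this example, not the actual system.

```python
import logging
from dataclasses import dataclass
from typing import FrozenSet, Hashable, Iterable

logger = logging.getLogger("reconciliation")  # hypothetical logger name


@dataclass(frozen=True)
class ReconResult:
    """Outcome of one source-versus-target check (illustrative structure)."""
    table: str
    source_count: int   # number of distinct keys in the source
    target_count: int   # number of distinct keys in the target
    missing_keys: FrozenSet[Hashable]     # in source, absent from target
    unexpected_keys: FrozenSet[Hashable]  # in target, absent from source

    @property
    def ok(self) -> bool:
        return not self.missing_keys and not self.unexpected_keys


def reconcile(table: str, source_keys: Iterable[Hashable],
              target_keys: Iterable[Hashable]) -> ReconResult:
    """Compare distinct keys on both sides and log a key=value summary."""
    source, target = set(source_keys), set(target_keys)
    result = ReconResult(
        table=table,
        source_count=len(source),
        target_count=len(target),
        missing_keys=frozenset(source - target),
        unexpected_keys=frozenset(target - source),
    )
    # Mismatches are logged at ERROR so alerting can key off the level.
    log = logger.info if result.ok else logger.error
    log(
        "recon table=%s source=%d target=%d missing=%d unexpected=%d",
        result.table, result.source_count, result.target_count,
        len(result.missing_keys), len(result.unexpected_keys),
    )
    return result


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    check = reconcile("campaign_audience", [1, 2, 3], [2, 3, 4])
    print(check.ok, sorted(check.missing_keys), sorted(check.unexpected_keys))
```

Returning a result object as well as logging keeps the check usable both for alerting and for failing a pipeline task when a mismatch is found.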

Certifications

Professional credentials and qualifications

☁️

AWS Certified Data Engineer – Associate

Amazon Web Services

🔷

Microsoft Certified: Azure Fundamentals

Microsoft

📋

Certified Associate in Project Management (CAPM)®

Project Management Institute

Education

Academic background

P.G. in Project Management – IT

Seneca College

Toronto, Canada

P.G. in Cyber Security and Threat Management

Seneca College

Toronto, Canada

B.S. in Computing (Honours) – Information Technology

Leeds Beckett University

Kathmandu, Nepal

Let's work together

I'm open to data engineering roles, freelance projects, and collaborations. Whether it's a pipeline problem or a full platform build, let's talk.

GitHub · poudelbipina11@gmail.com · LinkedIn