
Bipina Poudel
Senior Data Engineer · CAPM®
Toronto, Canada · Open to opportunities
Senior Data Engineer with 7+ years of experience designing and scaling cloud-based data platforms across marketing, healthcare, and financial services. Specialized in ETL/ELT pipelines, real-time data processing, data warehousing, and data modeling on AWS and Azure. Focused on data quality, governance, and cost-optimized solutions that power analytics, reporting, and machine learning.
Experience
Where I've worked and what I've built
Senior Data Engineer
Skill Squirrel · Toronto, Canada
Engineered scalable ETL workflows to process large volumes of marketing and operational data, supporting daily campaign execution in enterprise data warehouse and ODS environments.
- Automated ingestion of structured and semi-structured data (JSON, flat files, APIs, FTP/SFTP), eliminating manual file management
- Built SQL-based transformation workflows to produce campaign-ready audience and segmentation data
- Optimized warehouse queries using indexing, partitioning, and dimensional modeling to improve reporting responsiveness
- Developed reusable, curated tables and views to standardize business logic for marketing analytics teams
- Implemented structured logging and reconciliation processes to eliminate recurring production pipeline failures
- Maintained Hive and relational warehouse tables to support daily campaign execution and operational reporting
Environment: AWS, S3, Glue, Spark, Kafka, Snowflake/Redshift, Python, SQL, Airflow, Jenkins
Data Engineer
Cedar Health Group · Greenwich, CT
Designed and built scalable ETL/ELT pipelines in Databricks using PySpark, Spark SQL, and Delta Lake to process high-volume healthcare datasets in a HIPAA-compliant environment.
- Implemented medallion architecture (bronze, silver, gold layers) for claim, pricing, and membership policy data
- Applied Delta Lake features including ACID transactions, schema evolution, CDC, and optimized file compaction
- Enforced PHI-compliant data architecture using Unity Catalog, access controls, and governance guardrails
- Optimized Spark jobs with partitioning strategies, broadcast joins, caching, and Z-order indexing
- Developed slowly changing dimensions (SCD) to preserve historical claim and member records
- Supported cost-efficient lakehouse operations through cluster sizing and workload management
Environment: Azure Data Factory, ADLS Gen2, Databricks, Spark, PySpark, SQL, Snowflake, Airflow, Power BI
Data Engineer
Queue · San Francisco, CA
Implemented ETL processes and dimensional data models for financial reporting, investment analysis, and regulatory compliance in the fintech domain.
- Developed dimensional data models (star and snowflake schemas) for financial reporting and investment analysis
- Built and optimized data pipelines on Snowflake and AWS Redshift
- Authored complex SQL transformations, views, and stored procedures for regulatory and risk reporting
- Supported migration of on-premises data pipelines to AWS S3 and Redshift
- Implemented role-based access controls and encryption standards to ensure SOX-compliant data security
- Conducted data reconciliation and validation to ensure accuracy of financial data
Environment: Informatica, AWS S3, Redshift, Oracle, SQL Server, SQL, Python, Tableau
Skills & Technologies
Tools I work with day to day
Programming & Scripting
Big Data & Processing
Cloud Platforms
Databases & Warehousing
ETL / Data Integration
Streaming & Messaging
DevOps & CI/CD
Data Modeling & Governance
Visualization & Analytics
Projects
Things I've built and problems I've solved
Healthcare Data Lakehouse (Medallion Architecture)
Built a PHI-compliant medallion lakehouse (bronze/silver/gold) on Azure Databricks for Cedar Health Group, processing high-volume claim, pricing, and membership datasets with Delta Lake ACID transactions.
Marketing Data Pipeline Platform
Engineered scalable ETL workflows at Skill Squirrel for processing marketing and campaign operational data. Automated ingestion of JSON, flat files, and API sources into Snowflake/Redshift.
Financial Data Warehouse on AWS
Migrated on-premises financial data pipelines to AWS S3 and Redshift at Queue. Built star/snowflake schema models for regulatory reporting and investment analysis with SOX compliance.
Real-time Campaign Audience Segmentation
Developed Kafka-based streaming pipeline for real-time audience segmentation and targeting data, enabling same-day campaign execution across enterprise marketing systems.
Spark Performance Optimization Framework
Systematically optimized PySpark jobs using partition strategies, broadcast joins, caching, and Z-order indexing — significantly reducing heavy job runtimes on Azure Databricks.
Data Quality & Reconciliation System
Built structured logging and reconciliation pipelines to track data accuracy end-to-end, eliminating recurring production incidents and improving SLA adherence for downstream teams.
More on github.com/bipinapoudel
Certifications
Professional credentials and qualifications
AWS Certified Data Engineer – Associate
Amazon Web Services
Microsoft Certified: Azure Fundamentals
Microsoft
Certified Associate in Project Management (CAPM)®
Project Management Institute
Education
Academic background
P.G. in Project Management – IT
Seneca College
P.G. in Cyber Security and Threat Management
Seneca College
B.S. in Computing (Honours) – Information Technology
Leeds Beckett University
Let's work together
I'm open to data engineering roles, freelance projects, and collaborations. Whether it's a pipeline problem or a full platform build — let's talk.