Available for new opportunities

Abul Hasan Fahad

Grid AI · Electric Utilities IT/OT · Data Engineering

Building AI solutions for Electric Utilities IT/OT. Power systems engineer turned production AI/data leader — from IEEE-published grid research at Waterloo to deploying LLM pipelines at Hydro One. Ex–Big 5 banks.

4 · Peer-reviewed publications
10+ · Years in data & power systems
UW · University of Waterloo ECE
ON · Based in Mississauga, Canada

Power systems roots.
AI engineering present.

I'm a Data & AI Engineer with a rare combination: a formal electrical engineering background in power systems (BUET, Waterloo ECE) and deep hands-on experience building production ML and data pipelines in industry.

My research spans power quality analysis, EV wireless charging, connected autonomous vehicles, and machine learning for AC optimal power flow, with peer-reviewed work published in IEEE venues, Springer, and Elsevier.

My career spans electric utilities (Hydro One), Big 5 banking (RBC), and academic research — giving me an unusually broad perspective on production AI systems, regulated data environments, and critical infrastructure.

Power Systems / EE (domain)
🧠 LLM & ML Pipelines (AI)
❄️ Snowflake · dbt · Azure (data platform)
🐍 Python · SQL · REST APIs (engineering)
📊 Power BI · Analytics (BI)
🔬 MATLAB · Simulink (research)
01

Grid AI & ML

Applying machine learning to power system problems — from ML for AC optimal power flow (IEEE-9 bus) to EV charging infrastructure design. Bridging domain physics with data-driven models.

Optimal Power Flow · EV Charging · Power Quality
02

Production LLM Engineering

Built and deployed LLM-based NLP pipelines using locally-run Mistral 7B for intent classification on enterprise contact center data. Full lifecycle: prompt engineering, GGUF optimization, Parquet I/O.

Mistral 7B · llama.cpp · NLP Classification
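
A minimal sketch of the prompt-and-parse step in an intent classification pipeline like the one described above. It is illustrative only and not the production code: the label set `INTENTS`, the helper names, and the `fake_generate` stub are all hypothetical, and the real model call (for example through a llama.cpp binding) is passed in as a plain callable so the sketch runs without a model. Parquet I/O is left out.

```python
from typing import Callable, Sequence

# Hypothetical label set; the real intent taxonomy is not given on this page.
INTENTS = ("billing", "outage", "move_request", "other")


def build_prompt(text: str, labels: Sequence[str] = INTENTS) -> str:
    """Build a prompt that restricts the model to a single allowed label."""
    options = ", ".join(labels)
    return (
        "Classify the customer message into exactly one intent.\n"
        f"Allowed intents: {options}\n"
        f"Message: {text.strip()}\n"
        "Intent:"
    )


def parse_intent(raw: str, labels: Sequence[str] = INTENTS,
                 fallback: str = "other") -> str:
    """Map free-form model output onto the allowed label set."""
    cleaned = raw.strip().lower()
    for label in labels:
        if cleaned.startswith(label):
            return label
    # Anything the parser cannot match is routed to the fallback label.
    return fallback


def classify(text: str, generate: Callable[[str], str]) -> str:
    """`generate` stands in for the local model call (an assumption here)."""
    return parse_intent(generate(build_prompt(text)))


def fake_generate(prompt: str) -> str:
    """Stub generator so the sketch runs without a model."""
    return " Outage\n" if "power is out" in prompt.lower() else "unsure"


print(classify("My power is out since 6 am", fake_generate))  # outage
print(classify("Hello there", fake_generate))                 # other
```

Keeping the parser strict and falling back to a catch-all label is one way to keep downstream tables clean when the model answers outside the allowed set.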
03

Data Platform Engineering

Led cloud data migration and platform engineering on Azure + Snowflake, including medallion architecture, RBAC design, and ingestion pipelines from multiple API sources into a modern data warehouse.

Snowflake · Azure · dbt · Airflow
Python
Fault-Ride-Through & Islanding Detection
Fault ride-through, islanding detection, and synchronization requirements for grid-connected inverters.
View on GitHub →
Python
data_centre_load_modeling
Power load modeling and analysis for data centre infrastructure planning.
View on GitHub →
Python
Power-Grid-Operation-Technology
Tools and scripts for power grid operations, monitoring, and technology integration.
View on GitHub →
MATLAB
Optimal-Charging-under-Vehicle-to-Grid
Optimal EV charging strategy under Vehicle-to-Grid (V2G) constraints and grid signals.
View on GitHub →
Python
LMP_Forecasting
Locational marginal price (LMP) forecasting for day-ahead electricity markets.
View on GitHub →
Python
Ontario-grid-mapping
Ontario electricity grid mapping and visualization using open-source geospatial data.
View on GitHub →
Python
GIS Data Centre Tracking (Canada)
GIS and satellite-based tracking of data centre locations and power footprints across Canada.
View on GitHub →
Python
Email_intent_categorization_project
Production LLM pipeline using Mistral 7B (llama.cpp) for email intent classification on enterprise contact center data.
View on GitHub →
PowerShell
Llama-cpp-python-installation-guide
Automated installation guide and scripts for llama.cpp + Python bindings on Windows environments.
View on GitHub →
Python
setup_azure_and_foundry
Setup and configuration scripts for Azure AI Foundry, including model deployment and endpoint wiring.
View on GitHub →
Python
feature_store
ML feature store implementation for managing, versioning, and serving features across model pipelines.
View on GitHub →
Python
extracting_data_from_freeform_text
NLP pipeline for structured data extraction from unstructured freeform text sources.
View on GitHub →
Python
cority_data_lake
End-to-end data lake pipeline ingesting EHS data from the Cority platform into a cloud data warehouse.
View on GitHub →
Python
Snowflake_api_ingester
REST API ingestion pipeline into Snowflake with batching, retry logic, and schema management.
View on GitHub →
Python
banking_etl
Banking data ETL pipeline for ingesting, transforming, and loading financial records into data platforms.
View on GitHub →
Python
spark_session_optimizer
Utility for tuning and optimizing Apache Spark session configurations based on workload profiles.
View on GitHub →
Python
ingest_query_optimization
Query optimization tooling for high-throughput data ingestion pipelines on distributed platforms.
View on GitHub →
Python
pyspark_etl_0001
PySpark ETL framework template for scalable batch data processing on Hadoop/cloud clusters.
View on GitHub →
Python
Chunked-sas-reader
Memory-efficient SAS7BDAT file reader that processes large datasets in configurable chunks.
View on GitHub →
Python
Sas7bdat_to_parquet_converter
Bulk converter from SAS7BDAT format to columnar Parquet for modern data lake ingestion.
View on GitHub →
Python
sas_streamer
Streaming processor for SAS datasets enabling low-memory pipeline integration without full file loads.
View on GitHub →
Python
sas_to_python_mlops_etl
Migration framework for converting legacy SAS ETL workflows into Python-based MLOps pipelines.
View on GitHub →
Python
sas_etl_to_hive
Tooling to port SAS ETL logic to Apache Hive / HiveQL for Hadoop-based enterprise data platforms.
View on GitHub →
Python
sas_2_py_migration
Framework for migrating SAS programs to Python, preserving logic and data transformations.
View on GitHub →
SAS
sas_data_checkers
SAS-based data validation and quality check scripts for pre-migration auditing of datasets.
View on GitHub →
Python
sas_file_similarity_checker
Compares SAS datasets for structural and content similarity to support deduplication during migration.
View on GitHub →
Python
sas_refactoring
Tools for refactoring and modernizing legacy SAS codebases before migration to Python or cloud platforms.
View on GitHub →
Python
residential-load-forecast
Day-ahead hourly load forecasting for residential electricity customers using ML models.
View on GitHub →
Jupyter
ELectricty-Market-Load-Forecaster
Day-ahead hourly load forecasting and data processing for electricity market operations.
View on GitHub →
Python
snl-quest
Energy storage simulation and analysis suite from Sandia National Laboratories — used for storage project evaluation.
View on GitHub →
Python
Egret
Open-source power systems optimization library for building and solving OPF and unit commitment problems.
View on GitHub →
Python
OpenTUMFlex
Open-source flexibility quantification model for distributed energy resources and smart grid applications.
View on GitHub →
Python
TESS
Transactive Energy Service System — market-based coordination of distributed energy resources.
View on GitHub →
Python
LEM_simulation
Local electricity market simulation integrating GridLAB-D with Python for transactive energy research.
View on GitHub →
JavaScript
REopt_Lite_API
NREL's REopt energy optimization API client for sizing and siting distributed energy systems.
View on GitHub →
Python
Power-system-tools-from-national-labs
Curated power system simulation and analysis tools from US national laboratories (NREL, PNNL, Sandia).
View on GitHub →
Python
Electro-magnetic-Transient-analysis-tools
EMT analysis tooling for simulating transient events in power systems — switching surges, faults, and disturbances.
View on GitHub →
Python
Resources-of-protection-coordination
Reference resources for power system protection coordination studies — relay settings, grading, and selectivity.
View on GitHub →
Python
WaterHeaterPythonModel
Thermal model of residential water heaters for demand response and grid flexibility studies.
View on GitHub →
Python
Power-and-Energy-tools-by-BigTech
Collection of power and energy tools released by major tech companies for grid and sustainability research.
View on GitHub →
C#
mslearn-openai
Azure OpenAI service integration labs — prompt engineering, embeddings, and LLM API usage patterns.
View on GitHub →
Python
Llama.cpp-demo-app
Demo application showcasing local LLM inference using llama.cpp on CPU-only Windows environments.
View on GitHub →
Python
Microsoft-copilot-studio-architecture
Internal architecture breakdown of Microsoft Copilot Studio — agent flows, connectors, and integration patterns.
View on GitHub →
Python
ottomator-agents
Open-source AI agents hosted on the oTTomator Live Agent Studio platform for production agent workflows.
View on GitHub →
Python
langchain-course
Hands-on LangChain course covering chains, agents, memory, RAG pipelines, and tool-use patterns.
View on GitHub →
Python
NLP_project
NLP pipeline for text classification, entity extraction, and structured output from enterprise text data.
View on GitHub →
Python
Aws_ml_platform_design
Architecture and design reference for an end-to-end ML platform on AWS — training, serving, and monitoring.
View on GitHub →
Python
Onprem_to_snowflake_via_azure
End-to-end migration pipeline moving on-premises data into Snowflake through Azure Data Factory.
View on GitHub →
Python
sybase_to_snowflake
Migration tooling for moving Sybase relational data into Snowflake with schema mapping and validation.
View on GitHub →
Python
S3_multipart_copy_sm_to_s3
High-throughput multipart S3-to-S3 copy utility using concurrent transfers for large dataset migrations.
View on GitHub →
Python
spark_hive
PySpark + Hive integration scripts for reading, writing, and managing Hive tables from Spark jobs.
View on GitHub →
Python
spark_etl_join_case
Optimized Spark ETL patterns for complex join scenarios — broadcast joins, skew handling, and partitioning.
View on GitHub →
Python
spark_udf_test
Testing framework for PySpark UDFs — unit test patterns, mocking, and validation against expected outputs.
View on GitHub →
Python
hdfs_merge
Utility for merging small HDFS files into larger Parquet files to address the small-file problem at scale.
View on GitHub →
Python
hdfs_path_checker
HDFS directory and file path validation tool for pipeline pre-flight checks and data availability monitoring.
View on GitHub →
Shell
parquet_merge
Shell scripts for merging Parquet partition files on HDFS or local filesystems.
View on GitHub →
Shell
Parquet_count
Lightweight shell tool for counting rows in Parquet files across HDFS paths without loading into Spark.
View on GitHub →
Python
file_cdc_linux
File-based Change Data Capture (CDC) implementation on Linux for detecting and propagating data changes.
View on GitHub →
Python
Etl_for_call_data
ETL pipeline for ingesting and processing enterprise call center data for downstream analytics.
View on GitHub →
Python
cassandra-scripts
Operational scripts for Apache Cassandra — schema management, data migration, and query tooling.
View on GitHub →
JavaScript
platform_inventory
Platform inventory tracker for cataloguing data infrastructure assets, services, and dependencies.
View on GitHub →
Python
data_pipeline_disaster_recovery_test
Test suite for validating disaster recovery procedures and failover behaviour in data pipeline infrastructure.
View on GitHub →
Shell
old_hdfs_file_detector
Detects stale or orphaned files on HDFS based on age thresholds — used for storage housekeeping.
View on GitHub →
Shell
file_matching_tool
File matching and reconciliation tool for comparing datasets across source and target systems.
View on GitHub →
Python
indicator_adder
Adds derived indicator columns to datasets during ETL — flag generation, null markers, and business rules.
View on GitHub →
HiveQL
handle_columns_in_json_in_hive
HiveQL patterns for parsing, flattening, and querying nested JSON columns in Hive tables.
View on GitHub →
Python
analyze_json_files
Schema inspection and analysis tool for JSON files — detects types, nesting depth, and field coverage.
View on GitHub →
Python
markdown_to_doc_and_ppt
Converts Markdown documents into Word and PowerPoint files — useful for automated reporting pipelines.
View on GitHub →
Python
pdf_in_dataiku
PDF ingestion and text extraction plugin for Dataiku DSS data science pipelines.
View on GitHub →
Python
optimization_thesis
Optimization models and code from graduate thesis work — stochastic and deterministic formulations for power systems.
View on GitHub →
Python
Diebold-Mariano-Test
Python implementation of the Diebold-Mariano (1995) test with Harvey et al. (1997) corrections for forecast accuracy comparison.
View on GitHub →
Jupyter
fpp3-python-readalong
Python-centered implementation of "Forecasting: Principles and Practice" (3rd ed.) — time series models and evaluation.
View on GitHub →
Python
stochastic-optimization
Sequential decision problem modeling library from Princeton's Castle Lab — stochastic DP and adaptive algorithms.
View on GitHub →
Python
Multiple-Criteria-Decision-Aid
MCDA methods (TOPSIS, ELECTRE, PROMETHEE) for multi-criteria evaluation in engineering and policy contexts.
View on GitHub →
Python
Test-stock-prediction-algorithms
Comparative evaluation of deep learning and genetic programming approaches for stock market movement prediction.
View on GitHub →
Python
TVGL-1
Time-Varying Graphical Lasso implementation for learning dynamic sparse covariance structures in time series data.
View on GitHub →
Python
Optimal-Income-Splitting
Optimization model for income splitting strategies to minimize household tax liability under Canadian tax rules.
View on GitHub →
Jupyter
practical-machine-learning-with-python
End-to-end ML with Python covering classification, regression, NLP, and deep learning on real-world datasets.
View on GitHub →
Jupyter
Predictive-Analytics-with-TensorFlow
TensorFlow-based predictive analytics covering regression, classification, and sequence modeling tasks.
View on GitHub →
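
For the Diebold-Mariano-Test entry in the project list above, here is a minimal standard-library sketch of the statistic with the Harvey et al. small-sample correction. It is not the repository's code: the function name, its arguments, and the toy error series are assumptions made for illustration. The loss differential is `|e1|^power - |e2|^power`, so a positive value means the first forecast had the larger loss.

```python
import math
from typing import Sequence


def diebold_mariano(e1: Sequence[float], e2: Sequence[float],
                    h: int = 1, power: int = 2) -> float:
    """Harvey-corrected Diebold-Mariano statistic for h-step-ahead errors."""
    if len(e1) != len(e2):
        raise ValueError("error series must have equal length")
    n = len(e1)
    if not 1 <= h < n:
        raise ValueError("horizon h must satisfy 1 <= h < len(errors)")

    # Loss differential between the two forecasts at each time step.
    d = [abs(a) ** power - abs(b) ** power for a, b in zip(e1, e2)]
    d_bar = sum(d) / n

    def autocov(k: int) -> float:
        # Sample autocovariance of the loss differential at lag k.
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance using autocovariances up to lag h - 1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    if long_run_var <= 0:
        raise ValueError("non-positive variance estimate; test is undefined")

    dm = d_bar / math.sqrt(long_run_var / n)
    # Small-sample correction from Harvey, Leybourne and Newbold (1997).
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return dm * correction


# Toy forecast errors, invented for this example.
stat = diebold_mariano([2, 1, 2, 1], [1, 1, 1, 0])
print(round(stat, 4))  # 2.3333
```

The corrected statistic is compared against a Student t distribution with n - 1 degrees of freedom; computing the p-value is omitted here because the standard library has no t-distribution CDF.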
2021
Future of Connected Autonomous Vehicles in Smart Cities
Book Chapter · ScienceDirect / Elsevier · with H. Gaber, A. Othman

2019
Design of Test Platform of Connected-Autonomous Vehicles and Transportation Electrification
Advances in Intelligent Systems and Computing · Springer · with H. A. Gabbar, A. M. Othman

2019
Machine Learning for Optimal Power Flow
ECE 662 · University of Waterloo · AC OPF on IEEE-9 Bus Benchmark

2018
Wireless Flywheel-Based Fast Charging Station (WFFCS)
IEEE REPE 2018 · DOI: 10.1109/REPE.2018.8657482 · with H. A. Gabbar · 3 citations

2014
A Voltage Flicker Severity Analysis Module for Multiple Electric Arc Furnace Operation
8th ICECE · IEEE · with P. P. Dutta, A. H. Chowdhury · 6 citations

Let's build the intelligent grid together.

I'm actively exploring Grid AI, ML Engineering, and Data Engineering roles at energy software companies, utilities, and grid tech startups. If your team sits at the intersection of power systems and data — let's talk.

Open to
Grid AI / ML Engineer roles
Energy Data Platform Engineer
Power Systems Software Engineer
Research collaborations in grid ML