Education

KU Leuven (Belgium)

Master of Science, Statistics • 2017 — 2019

Completed an interdisciplinary program accredited by the Royal Statistical Society (UK), graduating with Cum Laude distinction.

Master's Thesis: Reliability Analysis of Mechanical Equipment in a Cement Production Plant • 2018 — 2019

  • Developed an integrated predictive model (accounting for event history, condition monitoring, and production throughput), used it to assess the reliability of the mechanical equipment, and made maintenance-specific recommendations to improve reliability.
  • Used R and Python to perform extensive preprocessing, cleaning, transformation, and integration of 3 years of stoppage records, vibration measurements, and monthly production totals drawn from flat files in various formats (HTML, Excel).
  • Performed extensive feature extraction, including text mining, natural language processing, and part-of-speech tagging, to identify the failure mechanism, maintenance action, and repair status for each failure event.
  • Performed a criticality analysis (identifying cement mills and fans as the most critical equipment), disproved the presence of a trend in inter-failure durations, and identified the potential for event clustering.
  • Estimated semi-parametric (stratified extended Cox) and fully parametric (accelerated failure time) models and determined that decreasing production load and replacing (rather than repairing) broken components significantly decreased the risk of failures in cement mills and fans.

University of Michigan

Bachelor of Science, Informatics • 2009 — 2014

Through the Informatics program, I explored a cross-disciplinary approach to the intersection of technology and human interaction. I learned concepts including complex networks (technical and relational), statistics, mathematics, data analysis, usability, and object-oriented programming.

Experience

Freelance Data Scientist • 2019 — Present

  • I provide statistical consultations and develop data science tools and other analytical products according to client needs.

ADP

Infrastructure and Operations Analyst • 2014 — 2017

  • Remotely managed all enterprise-level storage arrays and Fibre Channel infrastructure within two Tier-4 data centers, supporting ADP’s entire suite of internal and externally hosted products.
  • Coordinated with vendors to troubleshoot issues and ensure that storage hardware and firmware were supported in accordance with established maintenance cycles, repair practices, and change control.
  • Collaborated with Service Management to design and implement purpose-driven performance and capacity reports in order to protect Service Level Agreements and prevent outages of business-critical and client-facing applications.

ADP

Enterprise Storage Intern • 2013 — 2014

  • Developed an internal performance-monitoring application by aggregating data streams from 37 proprietary Network Attached Storage (NAS) arrays across multiple data centers, performing statistical analysis in R, and distributing the results via an interactive web interface.
  • Created unique performance profiles detailing latency, CPU utilization, and network throughput of ADP products by aggregating data from the underlying storage arrays dispersed across multiple data centers.
  • Collected performance and capacity data from NAS and SAN storage arrays for long-term benchmarking, historical investigation, and real-time management.

Skills

International experience

Through working with international counterparts, pursuing graduate education abroad, and maintaining a personal and working life across the globe, I am able to work and thrive in any locale.

Data Science

Through the work of my master's program, I have gained extensive experience in the following areas.

  • Machine Learning
  • Multivariate Analysis
  • Mixed Models
  • Experimental Design
  • Generalized Linear Models
  • Survival Analysis
  • Time Series Analysis
  • Optimization
  • Artificial Neural Networks
  • Bayesian Analysis
  • Natural Language Processing
  • Image Recognition

Development

I perform a significant amount of development work in order to build and deploy data science solutions, using tools such as the following.

  • Python
  • R
  • TensorFlow
  • Docker
  • Shiny
  • Excel
  • Markdown
  • YAML
  • HTML
  • LaTeX
  • Git
  • APIs
  • SQL