Josiah Davis


A large organization wanted to forecast the behavior of individual customers. The forecasting tool was motivated by the following applications:

  • Company planning and resource allocation
  • Measurement and evaluation of organizational initiatives
  • Customer care and outreach

As the lead data scientist on the project, I contributed several ideas, which I tested and refined iteratively, combining techniques from classical forecasting, machine learning, and modern portfolio theory:

  • Individual schedule effect: A customer-specific index that captured each customer's daily and hourly schedule and was used to adjust their time series
  • Supervised segmentation: I used classification and regression trees, built on demographic data and typical behavior, to assign customers to segments
  • Spline regression: I fit Multivariate Adaptive Regression Splines (MARS) that regressed customer behavior on hourly covariates, with terms selected stepwise by generalized cross-validation
  • Increase at Risk: One of the key ideas of modern portfolio theory is value-at-risk. I adapted this idea to calculate the increase in behavior for each individual customer.
  • Model evaluation: I compared the models against a naive baseline under a variety of scenarios to provide empirical evidence that the results were consistent.
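
To make two of these ideas concrete, here is a minimal sketch in Python (the project itself was done in R). It is an illustration under stated assumptions, not the code used on the project: the function names, the (weekday, hour, value) input layout, the ratio-to-mean index, and the nearest-rank quantile are all hypothetical choices made for this example. The first two functions compute a customer-specific schedule index and use it to adjust a series; the third borrows the historical-simulation flavor of value-at-risk and reports an upper quantile of a customer's period-over-period increases.

```python
import math
from collections import defaultdict
from statistics import mean


def schedule_index(observations):
    """Return {(weekday, hour): slot mean / overall mean} for one customer.

    `observations` is a list of (weekday, hour, value) tuples. This layout is
    an assumption for the sketch. Values are assumed positive so that neither
    the overall mean nor any slot mean is zero.
    """
    by_slot = defaultdict(list)
    for weekday, hour, value in observations:
        by_slot[(weekday, hour)].append(value)
    overall = mean(value for _, _, value in observations)
    return {slot: mean(values) / overall for slot, values in by_slot.items()}


def adjust_for_schedule(observations, index):
    """Divide each value by its slot index to remove the schedule effect."""
    return [value / index[(weekday, hour)] for weekday, hour, value in observations]


def increase_at_risk(series, level=0.95):
    """Nearest-rank empirical quantile of period-over-period changes.

    Assumed definition: the smallest observed change such that at least
    `level` of all observed changes are less than or equal to it.
    """
    changes = sorted(b - a for a, b in zip(series, series[1:]))
    if not changes:
        raise ValueError("need at least two observations")
    rank = max(1, math.ceil(level * len(changes)))
    return changes[rank - 1]


if __name__ == "__main__":
    # Toy data: one weekday, a busy morning slot and a quiet evening slot.
    obs = [(0, 9, 10.0), (0, 9, 14.0), (0, 21, 2.0), (0, 21, 6.0)]
    idx = schedule_index(obs)
    print(idx)
    print(adjust_for_schedule(obs, idx))
    print(increase_at_risk([10, 12, 11, 15, 15, 18]))
```

For a series such as [10, 12, 11, 15, 15, 18], the successive changes are 2, -1, 4, 0, and 3, so the default 95% level returns 4, the largest observed increase. With only a handful of observations the upper quantile collapses to the maximum, so a real application would need a longer history per customer.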

I spent 95% of my time in R, using the following packages: purrr, tidyr, earth, rpart, randomForest, data.table, ggplot2.