Random Forest Confidence and Prediction Intervals

Project

Random forests are known to provide excellent point predictions in a variety of contexts, but methods for accurately assessing the uncertainty associated with random forest predictions are lacking. Baker Center personnel (Zimmerman, Nordman, and Nettleton) are working to develop a new approach for constructing intervals that accurately reflect the uncertainty of random forest predictions. A random forest confidence interval provides a range of values that contains (with a specified probability) the mean of the response variable for given values of the predictor variables. Likewise, a random forest prediction interval provides a range of values that covers (with a specified probability) the response variable value associated with given values of the predictor variables. Preliminary work shows that these intervals achieve the advertised coverage probability and are narrower (more precise) than those produced by competing methods.
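
The distinction between the two kinds of interval can be written formally. The following is a minimal sketch in notation that does not appear in the description above and is assumed here for illustration: Y is the response, X the vector of predictor variables, x the given predictor values, mu(x) the mean response at x, 1 - alpha the specified probability, and L and U the interval endpoints computed from the training data. Whether coverage holds for each fixed x or only on average over predictor values depends on the method and is left open here.

```latex
% Assumed notation (not from the original text):
% mean of the response at predictor values x
\mu(x) = \mathbb{E}\left[\, Y \mid X = x \,\right]

% Confidence interval: covers the mean response at x
\Pr\left\{ L_{C}(x) \le \mu(x) \le U_{C}(x) \right\} = 1 - \alpha

% Prediction interval: covers a new response Y observed at X = x
\Pr\left\{ L_{P}(x) \le Y \le U_{P}(x) \right\} = 1 - \alpha
```

Because an individual response varies around its mean, a prediction interval at a given level is typically wider than the confidence interval at the same level and the same x.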