Sabermetrics - Statistical Modeling of Run Creation and Prevention in Baseball (Thesis)

(2018). Sabermetrics - Statistical Modeling of Run Creation and Prevention in Baseball. 10.25148/etd.FIDC006540

authors

  • Chernoff, Parker

abstract

  • The focus of this thesis was to investigate which baseball metrics are most conducive to run creation and prevention. Stepwise regression and Liu estimation were used both to formulate a model for each of the two dependent variables and for cross validation. Finally, the predicted values were fed into the Pythagorean Expectation formula to predict a team’s most important goal: winning.

    Each model fit strongly, and collinearity among the offensive predictors was assessed using variance inflation factors. Hits, walks, and home runs allowed, infield putouts, errors, defense-independent earned run average ratio, defensive efficiency ratio, saves, runners left on base, shutouts, and walks per nine innings were significant defensive predictors. Doubles, home runs, walks, batting average, and runners left on base were significant offensive regressors. Both models produced error rates below 3% for run prediction, and together they did an excellent job of estimating a team’s per-season win ratio.
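
The final step described in the abstract, feeding predicted runs into the Pythagorean Expectation formula, can be illustrated with a minimal Python sketch. It assumes the classic form of the formula with an exponent of 2; the abstract does not state which exponent the thesis uses, and the function name, parameter names, and the run totals in the usage line are hypothetical, chosen only for illustration.

```python
def pythagorean_expectation(runs_scored: float,
                            runs_allowed: float,
                            exponent: float = 2.0) -> float:
    """Estimate a team's win ratio from runs scored and runs allowed.

    Uses win_ratio = RS^k / (RS^k + RA^k). The default k = 2 is the
    classic form of the formula and is an assumption here, not a value
    taken from the thesis.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    scored_term = runs_scored ** exponent
    allowed_term = runs_allowed ** exponent
    return scored_term / (scored_term + allowed_term)


# Hypothetical season: 800 predicted runs scored, 700 predicted runs allowed.
predicted_win_ratio = pythagorean_expectation(800, 700)
print(f"Predicted win ratio: {predicted_win_ratio:.3f}")
```

In a workflow like the one the abstract outlines, the two arguments would be the outputs of the offensive and defensive run models rather than observed totals.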

publication date

  • March 30, 2018

keywords

  • Baseball
  • Defense
  • Estimation
  • Offense
  • Regression
  • Runs
  • Sabermetrics
  • Sports
  • Statistics

Digital Object Identifier (DOI)

  • 10.25148/etd.FIDC006540