Image credit: © Matt Marton-Imagn Images
System updates this year were modest. On top of last year’s accuracy jump of hundreds of runs in predicting pitchers, we checked whether switching from our StuffPro metrics to our new ArsenalPro metrics would offer any further improvement. It did not. This isn’t surprising, as Arsenal metrics incorporate context, and the aggregate context of any pitcher’s season is hard to anticipate. So, pitcher projections will stick with last year’s approach.
Batter projections incorporated a modest improvement to our home run forecasts but otherwise are unchanged.
Considering the bigger picture going forward, two ongoing challenges jump out.
The first is making sense of performances at the Double-A and High-A minor-league levels. Both levels are incredibly important in separating out true major-league talents, but the data publicly available from them remains limited. We certainly appreciate the public access to play-by-play and pitch result data (swing, take, etc.), but the gulf between the quality of data for these levels and the Triple-A and Low-A levels that sandwich them is staggering.
The Triple-A and Low-A levels offer us not only data on the individual pitches being made, but also Statcast measurements of batted balls. None of this is publicly available for Double-A or High-A batters or pitchers, which puts public analysts at an enormous disadvantage as compared to teams and their vendors. Although there is a robust trade in off-the-books exchanges of such information, particularly among the scouting community, this information needs to be standardized and made public, with MLB’s seal of quality approval.
MLB recently decided to harmonize and limit the technology used by major-league clubs across their minor-league affiliates. Although this move is controversial, it hopefully will also enable equivalent Statcast and pitch data to finally become publicly available for Double-A and High-A. Here’s to hoping this happens soon.
The second challenge is not solvable, and stems from basic differences in MLB economics. You don’t have to look hard at the teams on which projection systems were more accurate (such as the Phillies) and those that shocked those systems (the Brewers) to see what distinguishes these organizations: clubs featuring older, expensive players, typically signed to long-term contracts, are easier to forecast than teams that eschew long-term commitments and favor players with limited and/or negative-quality major-league histories.
It is much easier to forecast established players than fledgling ones, especially with the data challenges described above. Teams that spend heavily on free agents, typically in large markets, not only feature players with long track records but often tend to give the projection community the courtesy of the sunk-cost fallacy: even if a signing was a mistake, having consumed eight or nine figures of an owner’s money, that player is going to be playing whether you like it or not. And while such teams present unique collapse (and of course injury) risks, the collapses are unlikely to strike the entire team (although, of course, sometimes they do), which makes our job easier (and, in fairness, also gives these teams a solid likelihood of success).
Smaller markets extend the projection community none of these courtesies. By rarely signing players to long-term contracts, small-market clubs make few signings they regret and are not obligated to any player once a better option becomes available. Failing players are easily benched for promising minor-league talents, with teams having far more information about those players than the rest of us. If a small-market team has a terrible farm system, this is not a big deal, because it makes little difference which replacement player is doing a poor job. But small markets with high-quality farm systems have plenty of options, and over the course of a major-league season, they may try out many of them, because they can.
At Baseball Prospectus, we are fond of saying that projection accuracy is driven primarily by playing time, not player measurement. When certain clubs become unpredictable on playing time, it becomes doubly difficult to forecast their results, because we now have two problems: (1) players with limited track records who (2) could play most of the season or very little at all.
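To make the two-problem point concrete, here is a minimal sketch with invented numbers. It is not how our projections are actually built; it assumes, purely for illustration, that a player's value is a per-600-PA run rate scaled by plate appearances, and the function names and both example players are hypothetical. The forecast miss is then split into a part caused by misjudging the rate and a part caused by misjudging playing time.

```python
def contribution(rate_per_600_pa: float, pa: float) -> float:
    """Runs of value: a per-600-PA rate scaled by plate appearances (assumed model)."""
    return rate_per_600_pa * pa / 600


def projection_error(proj_rate, proj_pa, actual_rate, actual_pa):
    """Return (total miss, rate component, playing-time component).

    Uses the identity
        actual_rate*actual_pa - proj_rate*proj_pa
          = (actual_rate - proj_rate)*proj_pa + actual_rate*(actual_pa - proj_pa)
    """
    total = contribution(actual_rate, actual_pa) - contribution(proj_rate, proj_pa)
    rate_part = contribution(actual_rate - proj_rate, proj_pa)
    playing_time_part = contribution(actual_rate, actual_pa - proj_pa)
    return total, rate_part, playing_time_part


# Hypothetical veteran: long track record, guaranteed a full-time job.
veteran = projection_error(proj_rate=20, proj_pa=600, actual_rate=15, actual_pa=570)

# Hypothetical prospect: thin track record, projected as a part-timer,
# but he wins a starting job and hits well.
prospect = projection_error(proj_rate=10, proj_pa=150, actual_rate=30, actual_pa=550)

for name, (total, rate_part, pt_part) in [("veteran", veteran), ("prospect", prospect)]:
    print(f"{name}: total miss {total:+.2f} runs "
          f"(rate {rate_part:+.2f}, playing time {pt_part:+.2f})")
```

In this toy setup the veteran's miss is almost entirely a rate miss, while most of the prospect's much larger miss comes from the playing-time term, which is itself inflated by the rate surprise.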
No one is shedding a tear for the projection community, and that includes us: contrary to popular belief, we do watch the games ourselves, and we enjoy surprises as much as the next fan. But as our minor-league data sources improve, we look forward to having a better idea of what to expect from unexpected players, even if we are inevitably still surprised by the contents of team lineup cards.