The Z Files: Just a Little Patience

This article is part of our The Z Files series.

OK everyone, let's pump the brakes. New-fangled stats are great, but still more descriptive than predictive. Meanwhile players have started slowly for 141 years and ended up fine.

If you follow me on Twitter or Facebook, you may have seen that earlier this week. By the looks of those who liked or favorited the post, I'm not alone in my frustration.

Part of truly understanding a metric is not only knowing when to use it, but also when not to. You need to comprehend its limitations as well as applications.

Primarily a result of all the glorious Statcast data, there's a plethora of fantastic new tools at our disposal. However, these tools are still in the embryonic stage of development. Presently, the data is excellent for reverse engineering what happened. In some cases, regression candidates can be identified if there's a disconnect between skills and outcomes.

Here's the problem. In this stat-based era, everyone wants to be the first to the finish line. Forget patience, Statcast can tell me who is fact and who is fluke in 20 games. By trying to be the smartest person in the room, too many are outsmarting themselves. Barrels, spin rate, exit velocity, tunneling, it doesn't matter. Knowing what a player is doing the last week in April isn't an indication -- yet -- of what will transpire over the next five-plus months.

Something I've talked about for several years is incorporating stability rates into rest-of-season player evaluation. Work done by Russell A. Carleton (aka Pizza Cutter) and my friend and colleague Derek Carty, amongst others, has provided elegant research on when metrics stabilize. A recent posting on Baseball Prospectus (subscription required) by Carleton takes me and a lot of folks to task for misapplying that research.

Well, I don't know for sure I'm among the guilty parties. For what it's worth, the author includes himself in that group. The primary point in the piece is the misinterpretation of the term 'stability' and the misuse of the data in terms of evaluating future performance.

To be honest, I've always been uncomfortable with the label stability rate. Stabilize is too stringent a word. To explain that, I need to take a step back and talk about projections in a more general sense.

There's a lot of ways to look at a projection. Some think of it as a weighted average of plausible outcomes. Others see it as the most common result if the season is simulated a gazillion times. I like to distill performance down to skills and project what I call the 50/50 point. That is, there's a 50 percent chance the player ends up over or under the mark. Instead of considering stability rate, I look at it as the 50/50 point of some metrics moving sooner than others. Let's take another step back and talk about rest-of-season projections.
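The 50/50 point can be made concrete with a minimal sketch (the skill level and at-bat count here are illustrative, not drawn from any actual projection system): simulate a full season many times and take the median outcome, the mark the player is equally likely to finish above or below.

```python
import random

random.seed(1)

# Illustrative assumptions, not real projections.
TRUE_AVG = 0.280   # assumed underlying hit skill
AT_BATS = 550      # assumed full-season at-bats

def simulate_season() -> float:
    """Simulate one season and return the batting average."""
    hits = sum(1 for _ in range(AT_BATS) if random.random() < TRUE_AVG)
    return hits / AT_BATS

# Sort many simulated seasons; the middle value is the 50/50 point.
seasons = sorted(simulate_season() for _ in range(10001))
median = seasons[len(seasons) // 2]
print(round(median, 3))   # lands near the underlying skill level
```

With a symmetric skill distribution the 50/50 point sits near the true talent level, but the spread of `seasons` shows how far a single season can stray from it.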

Hopefully, you understand the gambler's fallacy and how it applies to player performance. If a flipped coin lands heads four times in a row, there's still a 50/50 chance it lands heads on the next flip. Over time, the split will regress towards 50/50, but each individual flip is equally likely to be heads or tails.
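A quick simulation illustrates the point (the streak length and flip count are arbitrary): the early run of heads never gets "paid back." It simply gets diluted as the sample grows.

```python
import random

random.seed(7)

# Start with four straight heads already in the books.
heads = 4
flips = 4

# Flip 10,000 more fair coins; each flip is an independent 50/50.
for _ in range(10_000):
    flips += 1
    heads += random.random() < 0.5

# The proportion regresses toward 0.5 even though nothing "evened out."
proportion = heads / flips
print(round(proportion, 3))
```

The absolute excess of heads from the streak persists; only the percentage shrinks, which is exactly why past luck says nothing about the next flip.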

A player's luck will not necessarily even out. Luck should always be considered neutral going forward.

Skills, on the other hand, don't even out, but they can balance. Over a small sample, a hitter can face a stretch of better or worse pitchers, which is reflected in his skills over that span. Later in the season, this will reverse. In theory, this is captured by the stability point.

As alluded to, I project a player's skills then build out the numbers we see on the back of a baseball card. Let's use batter's contact rate as an example. My rest-of-season contact rate projection is a weighted average of what I initially expected and what the hitter has done to date. However, it's not a linear relationship. If my initial projection was 80 percent, and after 81 games the hitter sits at 84 percent, my projection isn't 82 percent.

The coefficients used are in accordance with the stability rates. Metrics with faster stability rates pull in more season-to-date skills than those with slower stability rates. Admittedly, someone like Carleton, who has forgotten more about statistical analysis than I'll ever know, could refine the coefficients more than I can.

That said, I'm confident my system does what's intended: after all, regression allows for some wiggle room. Contact rate has the earliest stability rate of the hitting metrics germane to projections, so I move the rest-of-season 50/50 line for contact rate sooner, and regress it closer to the current level than other metrics.
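One simple way to sketch this blend is shrinkage toward the preseason baseline, where a stabilization constant sets how quickly the season-to-date sample takes over. The `rest_of_season` helper and the constants below are hypothetical illustrations, not the actual coefficients in my system.

```python
def rest_of_season(prior: float, observed: float, pa: int, stab_pa: int) -> float:
    """Shrink the observed rate toward the preseason projection.

    stab_pa is an assumed stabilization constant: the sample size at which
    the observed performance and the prior are weighted 50/50.
    """
    weight = pa / (pa + stab_pa)
    return weight * observed + (1 - weight) * prior

# A fast-stabilizing metric (like contact rate) pulls the projection most
# of the way toward the observed 84 percent after 300 PA...
fast = rest_of_season(prior=0.80, observed=0.84, pa=300, stab_pa=100)

# ...while a slow-stabilizing metric with the same gap barely moves.
slow = rest_of_season(prior=0.80, observed=0.84, pa=300, stab_pa=2000)

print(round(fast, 3), round(slow, 3))   # 0.83 vs roughly 0.805
```

Note the weight is not a fixed percentage; it grows with playing time, which is why an 80 percent projection and an 84 percent observed rate don't simply split the difference at 82.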

The key is regardless of the new projected skill level, there's still a 50/50 chance the player ends up above or below the new mark. That doesn't prevent me from using this to make player decisions. For instance, I like to identify players whiffing at a lower or higher rate than expected early on. According to my philosophy, if you feel the improved player will return to career levels, I'm higher on him than you for the rest of the campaign. It really helps if there's a luck metric out of whack, clouding the analysis. Perhaps the hitter is not only fanning less, he's been lucky with hit rate. You're expecting regression, as am I. The difference is our respective landing points. When the batting average on balls in play normalizes, my expectation is more favorable since I plan on better contact.

Circling back to the opening salvo, what bugs me about my process is this. What if my initial expectation is way off? Eventually, the in-season performance will be suitably captured and reflected in the remaining expectation. But until then, I'm off. Hopefully, the new wave of stats will not only help refine stability points (become predictive) but also fine-tune the original expectations, adjusting the initial baseline to better reflect what the player's current skill levels may be.

The gray area for me, with respect to pitching, is velocity. I honestly don't know what to do with hurlers exhibiting a velocity drop. Over time, the lost mph will result in skills degradation, which in turn is captured by the algorithm. The threat of injury aside, the original projection of a pitcher working with less velocity is likely optimistic. At some point, the more advanced metrics like spin rate and tunneling will help in this regard, but until then, all I can do is compare previous seasons' trends with respect to velocity and make subjective determinations based on the data.

We'll soon be at a point where I can begin highlighting hitters to target and others to avoid. That will be the focus of next week's Z Files.

ABOUT THE AUTHOR
Todd Zola
Todd has been writing about fantasy baseball since 1997. He won NL Tout Wars and Mixed LABR in 2016 and is a multi-time league winner in the National Fantasy Baseball Championship. Todd is now setting his sights even higher: the RotoWire Staff League. Lord Zola, as he's known in the industry, won the 2013 FSWA Fantasy Baseball Article of the Year award and was named the 2017 FSWA Fantasy Baseball Writer of the Year. Todd is a five-time FSWA awards finalist.