The Z Files: The Problem With Early Projedictions

Todd Zola is resigned to having to generate next year's projections far earlier than he'd like, but that won't stop him from digging into the factors that fueled Scooter Gennett's power surge.

Updated on October 12, 2017 8:43PM EST

By:

Todd Zola

I blame the National Fantasy Baseball Championship (NFBC). There was a time I launched projections in February. Now, I need to have the initial set ready by Nov. 1 to get out in front of the NFBC Draft Championship competition, a 15-team, 50-round, draft-and-hold format kicking off in November.

The earlier rollout presents a conundrum, because, quite frankly, there isn't enough time in a month to do a deep dive on over a thousand players. Okay, no one dissects that many players, but looking in depth at even the top 250 or so isn't feasible in October in conjunction with early roster and playing time estimates. I try to incorporate as many objective factors as I can into my system, but a combination of emerging data and a change in philosophy has added more subjectivity to my process than in past seasons.

PROJECTION VS. PREDICTION

For many, these words are synonymous. I feel there's a subtle, yet significant distinction. A projection is a prediction, but a prediction isn't always a projection.

A projection is numerically based. It's, "An estimate of future possibilities based on a current trend." In terms of fantasy baseball, it's using back-tested algorithms to estimate future results based on past performance. It's looking at aging and fleshing out luck versus skill. There are several other aspects, with the common denominator being an unbiased, formulaic approach.

A prediction incorporates more abstract elements. Its definition is, "Declaration or indication in advance based on observation, experience, or scientific reason." It's the first two that delineate a prediction from a projection. Some examples could be specific player splits, faster maturation of a young player or belief in a change in approach. There are others, but the glue is factors that aren't directly programmed into the engine.

Perhaps I'm forcing definitions to suit my narrative, but I see a projection as a prediction with scientific reasoning. For years, I stubbornly eschewed including experience and observation in my model. This has waned greatly over recent seasons as I've often overridden Excel, miraculously being spared a lightning strike as I depress the Enter key.

There are some professing they don't use projections. By my definitions, they likely don't. However, whether they admit it or not, they use predictions. How we feel about player performance is a prediction. I suppose I don't use projections either, since I add in observations and experience. But I'm adamant the foundation should be a projection. You could say I generate and use projedictions.

The point being, 30 days after the regular season ends isn't sufficient to projedict the player pool. To be honest, I'm not sure there's ample time before Opening Day. That said, the NFBC isn't going to wait for me to dissect 700-plus players, so once I get finished typing this, I'll return to adjusting projected strikeout rates in accord with swinging strike rates.

There are a couple of research studies I want to undertake which could influence my projedictions. Unfortunately, neither will be finished before Nov. 1, which means my initial set will be subject to more change than normal. Those using them in a November draft will be frustrated when I alter some players come December or January, but hopefully they'll understand the process and not follow them blindly. They'll do their own research, landing on their own projedictions.

MORE POWER, MORE STRIKEOUTS

The statistical landscape is always changing. League-wide totals are never the same year to year. If something other than variance is driving the difference, the change is usually gradual. This hasn't been the case the past two campaigns.

The heart of a projection is a three-year average, though some models utilize more seasons. Most, if not all, use a weighted average with the more recent data carrying more significance. This helps account for the current skill level, as well as the current environment, being most relevant. There isn't a standard weighting, but most assign at least half of the total weight to the most recent season.

There's elegance in the simplicity. If a stat is trending up or down, the chance it reverses is baked into the weighted average.
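
To make the arithmetic concrete, here is a minimal Python sketch of a weighted three-year average. The 50/30/20 split and the sample home run rates are assumptions chosen for illustration, not the weights or data of any actual projection engine; the only constraint taken from the text is that the most recent season carries at least half the weight.

```python
def weighted_three_year(rates, weights=(0.5, 0.3, 0.2)):
    """Blend three seasons of a rate stat, most recent season first.

    The default weights are an illustrative assumption: half the weight
    on the most recent season, the rest split across the two prior years.
    """
    if len(rates) != len(weights):
        raise ValueError("need one weight per season")
    total = sum(weights)
    # Divide by the total so the weights need not sum to exactly 1.
    return sum(r * w for r, w in zip(rates, weights)) / total


# Hypothetical home runs per plate appearance: 2017, 2016, 2015.
hr_rates = [0.040, 0.030, 0.030]
baseline = weighted_three_year(hr_rates)
print(round(baseline, 4))
```

With these made-up inputs, a jump from .030 to .040 in the latest season blends to .035. The average credits half of the spike and hedges the rest, which is how the chance of a reversal ends up baked in.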

Everyone knows of the recent trends, but how do they affect projedictions? Many have expressed concern over setting 2018 category targets. This isn't my focus as I oppose target drafting. I'm curious if we're still going to see more homers and more whiffs, a leveling off, or a reduction. More importantly, if either grows or lessens, will the distribution be uniform or will some players be influenced more than others?

The chance a trend reverses must always be accounted for in the foundation, so a weighted three-year average is obligatory. However, the coefficients aren't set in stone. The weightings are based on back-testing data. If the landscape isn't the same as the testing, the results may not translate to the present. As such, if I determine power will continue to swell, the coefficient for the most recent season should reflect that, allocating even more than the 50 percent weight of my present model. Similarly, if I sense a decline, I may opt to lower the weighting.
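
One way to picture that adjustment is the sketch below, which moves the most recent season's coefficient up or down and lets the older seasons share whatever is left in their original proportions. The base weights, the 60 and 40 percent alternatives, the sample rates and the proportional rescaling are all assumptions made for illustration; the text only says the most recent weight would move above or below 50 percent.

```python
def shift_recent_weight(weights, recent_weight):
    """Return new weights with the most recent season set to recent_weight.

    weights: per-season weights, most recent season first.
    The older seasons keep their relative proportions and split the
    remainder, so the result always sums to 1.
    """
    if not 0 < recent_weight < 1:
        raise ValueError("recent_weight must be between 0 and 1")
    older = weights[1:]
    older_total = sum(older)
    remaining = 1.0 - recent_weight
    return (recent_weight, *(remaining * w / older_total for w in older))


def blend(rates, weights):
    """Weighted average of rates, most recent season first."""
    return sum(r * w for r, w in zip(rates, weights)) / sum(weights)


base = (0.5, 0.3, 0.2)            # assumed starting weights
hr_rates = [0.040, 0.030, 0.030]  # hypothetical HR per plate appearance

power_keeps_rising = shift_recent_weight(base, 0.6)
power_recedes = shift_recent_weight(base, 0.4)

for label, w in [("base", base),
                 ("more recent weight", power_keeps_rising),
                 ("less recent weight", power_recedes)]:
    print(label, [round(x, 2) for x in w], round(blend(hr_rates, w), 4))
```

With these inputs the blended rate moves from .035 at the base weights to .036 when the recent season gets 60 percent and .034 when it gets 40 percent. The shift is small per player, but it applies across the whole pool.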

To make this deduction, the reasons for the elevated power and strikeouts must be unearthed. We all know the possible explanations, ranging from steroids sneaking through the testing process to a different ball to uppercut swings.

Perhaps I'm too utopian, but I'm discounting the chemical advantages. Maybe it's because I don't want to believe, but I just have difficulty presuming players are ahead of testers and/or MLB is again choosing to ignore it if they are.

That leaves the ball and approach. I'm of the mind the two are intertwined. The reason players are altering their swing plane is that the springier ball further rewards the change. Fly balls not leaving the yard are generally a bad thing. Sure, some result in doubles or triples, but more grounders get through the infield than fly balls land safely in the outfield.

As an aside, I don't believe MLB sent out a double-secret directive to change the ball. This could have happened organically. Right around the time of the spike, the factory supplying the balls changed. Different equipment, different workers, etc., could have easily produced a ball still meeting MLB specifications, but having different characteristics.

Although it should be intuitively obvious that lofting more batted balls renders more homers, Statcast provides a parachute of sorts. If I swing harder, with more of an uppercut, I have a better chance of hitting a homer. Now that exit velocity and launch angle have been added to the vernacular, players are more confident. I know this seems weird. There's likely a clinical term, some type of cognitive bias, to describe it. For me, it's akin to trying to lose weight. You work out, you eat better. But, until you step on the scale and see the number dropping, there's a mental barrier against seeing that what you're doing is working. And, once you tangibly see the results, you're motivated to keep going.

Folded into this is that there's no longer shame in striking out. Not only are players swinging from their heels early in the count, they're not changing their approach with two strikes, continuing to swing for the fences. If they whiff, it's because so many relievers are throwing more heat or umpires have no clue. There's a built-in excuse. Batters no longer take the walk of shame back to the dugout; they're greeted with, "Nice cut, you'll get him next time."

Obviously, I could be wrong, but it seems to me a harder, uppercut swing, with the ball jumping off seemingly less powerful bats without the embarrassment of missing, has fueled the power increase. Okay, not everyone is doing this, but enough are to make a difference.

From a projection perspective, it's the distribution of the added homers I care about, for both hitters and pitchers. Were Scooter Gennett's 27 long balls more influenced by a juiced ball than the 59 clubbed by Giancarlo Stanton? Are there pitchers more susceptible to the change in approach than others?

Sadly, there isn't enough time to crunch the numbers to glean this information by Nov. 1. Whether the answers are more objective, thus suited for my projection engine, or subjective on a player-by-player basis remains to be seen. I am, however, looking forward to digging into this.

INCREASED INJURIES

While I don't feel the frequency of DL stints will relate directly to projedictions, it certainly will shape draft strategy and team construction. Some are blaming the rise in injuries on the 10-day DL. OK, maybe that accounts for some of the increase. Initial research suggests there's more to it; players just got hurt more often in 2017.

My plan is to categorize the injuries, then compare to previous seasons. The hope is to uncover something to help discern if this season was a fluke, or if we should prepare for the same or even a greater influx of DL visits next summer.

Perhaps this should be part of a plan regardless, but having more multiple-eligibility players helps combat the roster holes derived from losing hitters to injury. The question is how much are players such as Jose Ramirez, Cody Bellinger and Alex Bregman boosted in rankings? How high should their bids increase in an auction? It's not just the front-end players, but also the likes of Ryon Healy, Brandon Phillips and Asdrubal Cabrera. Roster flexibility can come from anywhere.

With respect to pitching, for several seasons, ace starters were one of the more reliable subsets of the entire inventory, not just pitching. On the surface, this wasn't the case in 2017. What's the way to deal with this if stud arms aren't as reliable, as a group? Is it more imperative to grab Max Scherzer, Chris Sale or Corey Kluber? What's better, a starter called up as an injury replacement or a high-strikeout middle reliever?

Again, the injury research won't matter with projedictions. However, not only am I launching projedictions on Nov. 1, I'll also be expected to provide drafting advice, not to mention drafting my own teams, be it early leagues or industry mocks. Some of my recommendations, not to mention decisions, will be blind to what I ultimately learn.

WRAPPING IT UP

Reading between the lines, I anticipate seasoning my initial projections more than ever this season, and there will be more projedictions than before. Even with that concession, I still want to be rooted in science. There won't be any changes "just because." There will be a reason. Does this rule out some gut feel? Eh, no, not entirely. There won't be enough to satisfy those saying, "you never go out on a limb," or "all you do is project something in the middle," but I'm comfortable that when I do override the engine, there'll be no life-altering repercussions.