# The Z Files: Are All Fly Balls Created Equal?

Let's jump ahead to the end. I don't have the answer. Someone with access to the data and a deeper understanding of statistics will have to unearth it. However, I do have a question:

Are all fly balls created equal?

Remember when Voros McCracken first unveiled DIPS? For those not familiar, DIPS stands for Defense Independent Pitching Statistics, where BABIP (batting average on balls in play) was born. The initial notion was that the result of batted balls in play was out of the pitcher's control. Then, it was suggested BABIP is a function of line drive rate, following this archaic formula: BABIP = line drive rate + .12. So, a 20 percent line drive rate equated to a BABIP of .320 (.200 + .12).
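For anyone who wants to see that old rule of thumb spelled out, here is a minimal sketch. The function name is mine, purely for illustration; the only thing taken from the formula above is the .120 constant.

```python
def babip_from_line_drive_rate(ld_rate: float) -> float:
    """Old rule of thumb: expected BABIP = line drive rate + .120.

    ld_rate is a fraction (0.20 means a 20 percent line drive rate).
    """
    return ld_rate + 0.120


# The example from the text: a 20 percent line drive rate.
print(f"{babip_from_line_drive_rate(0.20):.3f}")  # prints 0.320
```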

Obviously, that's since been expanded, first to include grounders and fly balls, and eventually pop-ups. A subjective determination of how hard the ball was struck was added to the mix, which is now being determined using Statcast data, namely exit velocity and launch angle. Quality of defense as well as shift data also joined the BABIP party.

We've come a long way since the days when we thought it didn't matter if it was Pedro Martinez or Pedro Astacio delivering the pitch; once contact is made, the outcome is left to fate.

Like BABIP, home run per fly ball (HR/FB) was assumed to be out of the control of the pitcher, save for the park factors. That is, much like the assumption that each pitcher's BABIP would regress to about .300, every hurler's HR/FB would regress to 1.0 (now 1.1) plus a home park adjustment. The best practical illustration of this is that FIP doesn't regress home run rate while xFIP does.
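To make the FIP versus xFIP distinction concrete, here is a rough sketch using the commonly published forms of both metrics. The additive constant changes every season, so the 3.10 default is a placeholder assumption, as are the league HR/FB rate and the stat line in the example; none of those numbers come from this article.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """FIP uses the home runs the pitcher actually allowed.

    constant: league-specific scaling constant (3.10 is an assumed placeholder).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, league_hr_per_fb, constant=3.10):
    """xFIP swaps actual homers for fly balls times the league HR/FB rate."""
    expected_hr = fb * league_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical stat line: 100 IP, 90 K, 20 BB, 2 HBP, 100 fly balls, 18 HR.
# The 0.12 league HR/FB rate is an assumption for illustration only.
actual = fip(hr=18, bb=20, hbp=2, k=90, ip=100)
regressed = xfip(fb=100, bb=20, hbp=2, k=90, ip=100, league_hr_per_fb=0.12)
print(f"FIP {actual:.2f}, xFIP {regressed:.2f}")
```

The gap between the two numbers is entirely the home run assumption: same strikeouts, same walks, different belief about how many fly balls should have left the yard.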

I'm beginning to wonder if there are different kinds of fly balls, some more likely to be homers than others. If this is the case, it follows that pitchers yielding more of the fly balls that leave the yard will possess a higher baseline HR/FB. This is somewhat akin to ground balls having a better chance of becoming a hit than fly balls, which is why the BABIP of a groundball pitcher is organically higher than that of a flyball one.

The bottom line is, I'm not convinced those of us regressing HR/FB to park-adjusted league average should be doing this for every pitcher. Just as we learned all pitchers don't regress to the same BABIP, the supposition is not all pitchers regress to the same HR/FB.

Let's step away from fantasy baseball and think about why a batted ball becomes a fly ball. One way is the hitter makes contact with the lower half of the ball. Another is solid contact is made, but with an uppercut stroke, elevating the trajectory. What have we been hearing about so much lately? Hitters are consciously altering their swing path to add loft to the contact.

Now let's take this a step further. Conventionally, a pitcher working up in the zone is a flyball guy, while one who keeps the ball down induces grounders, since hitters top the ball, in part because low pitches have more sink. This isn't an earth-shattering contention, but more pitches down in the zone are being elevated. While the data doesn't show this directly, fly balls were up last season and are up even more in 2017. The rate was higher six or seven years ago than it is now, but changes to the strike zone have forced pitchers to throw pitches lower in the zone. My hypothesis is hurlers are still keeping the ball down, but hitters are doing a better job elevating these pitches.

The final piece to this hypothetical puzzle is that it makes intuitive sense that fly balls resulting from solid contact on an uppercut swing have a better chance of clearing the fence than those emanating from hitting the lower half of the ball on a higher pitch. There's a little physics involved, since the maximum energy transferred to the horsehide occurs when the swing is on the same plane as the ball's movement, which is of course down. Not only are the uppercut swings harder, they're creating even more exit velocity since the plane is in better sync with the ball's direction.

Turning back to fantasy baseball, there's a subset of pitchers causing me fits when I do the weekly pitching rankings. There are a number of hurlers with normal strikeout and walk rates, but bloated home run numbers. My model sees this and assumes a regressed HR/FB, rendering a guy sporting much better ratios than he's currently recording. He gets ranked accordingly, and I get torn a new one in the comments. Let's put three of them under the microscope.

Rick Porcello, Boston Red Sox

Here's Porcello's current and two-year stat profile:

| Season | K% | BB% | HR/FB (%) | GB/FB |
|---|---|---|---|---|
| 2015 | 20.2 | 5.2 | 14.5 | 1.40 |
| 2016 | 21.2 | 3.6 | 9.3 | 1.13 |
| 2017 | 21.2 | 4.4 | 12.1 | 0.92 |

Walks are up a little over last season, but still quite low (1.77 BB/9 for those not in tune with using percent). He's giving up more fly balls, along with an elevated HR/FB, at least compared to last season. The combination of more flies and a greater percentage leaving the yard renders an unsightly 1.5 HR/9.

I'm curious about the rising fly ball rate. While his pitch mix is a bit different, my focus is on location. Are the increased fly balls a result of getting the ball up in the zone or are hitters elevating his pitches low in the zone?

Among the wonderful tools at Brooks Baseball, each pitcher has a Zone Profile showing the location of their pitches. There are 25 quadrants. Picture the strike zone as a tic-tac-toe board with nine squares. Add in high, low, then missing to either side of the plate to create a 5x5 grid and you can see where the 25 comes from. I didn't worry about the horizontal location. I want to know the vertical location.

| Location | 2017 | 2016 | 2015 |
|---|---|---|---|
| High | 16.6% | 16.0% | 14.4% |
| Upper third | 19.0% | 18.9% | 16.6% |
| Middle third | 24.1% | 24.8% | 21.4% |
| Lower third | 19.1% | 20.1% | 22.4% |
| Low | 21.2% | 20.3% | 25.2% |

Granted, this should be looked at on a more granular level, considering balls versus strikes and pitch selection, but on the surface Porcello's location is the same as last season, when his homers were much lower. That is, he's not leaving the ball up in the zone. By extension, that means hitters are better elevating his low tosses since the first table demonstrates he's surrendering more fly balls.
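If you want to check the "same location as last season" claim yourself, the comparison boils down to adding up the bottom two and top two rows of the location table. Here is a quick sketch using Porcello's numbers from above. Which rows count as "down" and "up" is my own grouping for illustration, not anything Brooks Baseball defines.

```python
# Porcello's vertical pitch location by season, in percent (from the table above).
porcello_location = {
    2017: {"High": 16.6, "Upper third": 19.0, "Middle third": 24.1,
           "Lower third": 19.1, "Low": 21.2},
    2016: {"High": 16.0, "Upper third": 18.9, "Middle third": 24.8,
           "Lower third": 20.1, "Low": 20.3},
    2015: {"High": 14.4, "Upper third": 16.6, "Middle third": 21.4,
           "Lower third": 22.4, "Low": 25.2},
}

# Assumed groupings: the bottom two bands are "down", the top two are "up".
DOWN_BANDS = ("Lower third", "Low")
UP_BANDS = ("High", "Upper third")


def band_share(season_rows, bands):
    """Sum the percentage of pitches thrown in the given vertical bands."""
    return sum(season_rows[band] for band in bands)


for season, rows in porcello_location.items():
    down = band_share(rows, DOWN_BANDS)
    up = band_share(rows, UP_BANDS)
    print(f"{season}: down {down:.1f}%, up {up:.1f}%")
# 2017: down 40.3%, up 35.6%
# 2016: down 40.4%, up 34.9%
# 2015: down 47.6%, up 31.0%
```

The down share barely moves between 2016 and 2017, which is the point: the location didn't change much, but the fly balls and homers did.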

If my thinking is sound, not only are there more fly balls, they're being struck harder via an uppercut swing on a similar plane as the ball's flight. The bottom line is I'm not so sure I should be regressing Porcello's HR/FB to initial expectations. Chances are, he's not as sharp, but he also may be a victim of the growing penchant to add loft to batted balls.

Masahiro Tanaka, New York Yankees

Here's Tanaka's current and two-year stat profile:

| Season | K% | BB% | HR/FB (%) | GB/FB |
|---|---|---|---|---|
| 2015 | 22.8 | 4.4 | 16.9 | 1.4 |
| 2016 | 20.5 | 4.5 | 12.0 | 1.6 |
| 2017 | 20.6 | 6.2 | 22.8 | 1.4 |

And now his pitch locations over those three years:

| Location | 2017 | 2016 | 2015 |
|---|---|---|---|
| High | 3.5% | 4.9% | 5.0% |
| Upper third | 10.8% | 11.2% | 10.0% |
| Middle third | 17.6% | 21.4% | 18.7% |
| Lower third | 25.4% | 26.0% | 23.8% |
| Low | 42.7% | 36.6% | 42.4% |

The increase in walk rate suggests Tanaka may not be locating as well as last season, which could also result in more meatballs, hence the big spike in homers. He's also serving up more fly balls, yet is throwing more pitches in the lower quadrants than last season, again hinting batters may be elevating his tosses with more authority than in past campaigns. If so, can he adjust? Perhaps, but the point is I probably shouldn't be regressing his 2.2 HR/9 as much as I am.

Before we move on to the last specimen, if you're wondering about Tanaka's splits, his home HR/9 is 2.7 as compared to a road 1.9 mark. So yeah, Yankee Stadium predictably hurts, but his away numbers are lousy as well.

John Lackey, Chicago Cubs

| Season | K% | BB% | HR/FB (%) | GB/FB |
|---|---|---|---|---|
| 2015 | 19.5 | 5.9 | 9.8 | 1.38 |
| 2016 | 24.1 | 7.1 | 12.9 | 1.13 |
| 2017 | 22.1 | 7.0 | 21.8 | 1.11 |

| Location | 2017 | 2016 | 2015 |
|---|---|---|---|
| High | 8.7% | 8.3% | 8.1% |
| Upper third | 14.2% | 13.9% | 18.8% |
| Middle third | 24.0% | 22.7% | 27.0% |
| Lower third | 24.1% | 23.3% | 24.8% |
| Low | 28.9% | 31.8% | 21.3% |

Lackey's strikeouts are down in terms of K%, but his K/9 is better. I'll save that discussion for another time. More importantly, he follows the same trend as Porcello and Tanaka with a consistent number of balls thrown in the lower quadrants, but a rising fly ball pace and a bloated home run clip. Like the others, until he shows he can do a better job of keeping the ball in the yard, I can't rely on regression.

To be clear, previous to this season, accounting for regression in HR/FB has served me very well, capturing pitchers others miss just by looking at surface stats. In addition, I am referring to regression in the classical statistical sense and not as a synonym for "play worse", as is the case with many of my brethren. For me, regression refers to something out of the pitcher's control, so the answer to "regress to what?" is the league mean, adjusted for park factor.
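Here is a rough sketch of what "regress to the league mean, adjusted for park factor" looks like in code, plus a knob for not regressing all the way. The function names, the blending weight, the league rate and the park factor are all hypothetical placeholders for illustration, not the actual inputs to my model. Only the 22.8 percent observed HR/FB comes from the Tanaka table above.

```python
def regression_target(league_hr_per_fb, park_factor):
    """The 'regress to what?' answer: league mean HR/FB scaled by a park factor.

    park_factor is a multiplier where 1.0 means a neutral park (assumed convention).
    """
    return league_hr_per_fb * park_factor


def blended_hr_per_fb(observed, target, weight_on_target=1.0):
    """Blend the observed HR/FB with the regression target.

    weight_on_target=1.0 is full regression (the observed rate is ignored).
    Lower weights keep some of the observed rate; that knob is hypothetical.
    """
    return weight_on_target * target + (1.0 - weight_on_target) * observed


# Tanaka's 2017 HR/FB from the table; league rate and park factor are assumptions.
observed = 0.228
target = regression_target(league_hr_per_fb=0.13, park_factor=1.10)

full = blended_hr_per_fb(observed, target, weight_on_target=1.0)
half = blended_hr_per_fb(observed, target, weight_on_target=0.5)
print(f"target {target:.3f}, full regression {full:.3f}, half regression {half:.3f}")
```

Full regression treats the homers as pure noise. Anything less than full weight is a way of saying some of that HR/FB belongs to the pitcher, which is exactly the question I'm wrestling with.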

As hinted throughout this discussion, I can poke a plethora of holes in the process. The presentation is admittedly an oversimplification. Just looking at location without other considerations isn't enough. Perhaps an increase in fly balls is due to better contact on elevated pitches. Keeping the ball down is one thing. What was the velocity and pitch type? I could go on.

The purpose of this, beyond putting a thought out there, is to give you a glimpse of what goes on in my head. I think it's important to be transparent when doing what I do, not to mention it hopefully shows I take the comments in the various postings seriously.

Maybe this is a wacky season where selected pitchers have been extremely unlucky and are due for a correction of monumental proportions. Or perhaps we, as analysts, need to consider if the current landscape has rendered some of the principles we employ outdated. My money is on the latter.