Keith Goldner, chief analyst, numberFire.com
In elementary school, we were the kids you didn't want to play against in Connect Four. It just wasn't fun for you, unless your father was the GM of the Pirates and you masochistically came to enjoy losing. Yeah, that was us -- the ones doing 60-problem Mad Minute Math exercises while you were struggling with 30. We are nerds, a proud people. We're even prouder today, because instead of applying our brains to Parker Brothers-related dominance, we're applying them to sports. We're using math, science, copious amounts of caffeine and the sweet sounds of vocal trance to project future player, team and game performance.
Of course, dealing with math as much as we do, we know it's all about performance. It's about numbers. So, how did we do against the benchmark of benchmarks, the spread?
There are tons of different strategies for beating the spread. Most include at least some form of quantitative analysis -- even things as simple as analyzing the movement of the lines. We take quantitative analysis to the extreme, using data and data only to project outcomes. Sure, there are pros and cons to this strategy; we are devoid of any human emotion or bias, but we're also susceptible to the non-quantifiable realities of the game, such as injuries. To counteract this, we blend our data-centric approach with other, more editorial methods of analysis to beat the spread consistently.
A little about our record:
Picking every game since 2007, we are 54.0 percent against the spread and 53.4 percent on over/unders. In 2012, we went 57.2 percent against the spread and 53.4 percent on over/unders. We made two picks a week on RotoWire.com and finished the year 21-12-1 (61.7%). Obviously two games a week is a small sample, but those were the ones about which our model felt the strongest.
How do we make our picks? numberFire's core algorithm involves a massive union of similarity scores. We look for similar players, playing on similar teams, against similar opponents and weight these similarities based on the strength of the relationship. Once we have the union of these three sets and each similarity's relative weight, we then use actual historical data to project future performance.
For example, if we want to know how Aaron Rodgers will perform against the Detroit Lions, we look at all similar QBs since 2000 in terms of raw efficiency and a slew of proprietary statistics that we calculate internally. These internal statistics are far more informative than traditional ones like yards or TDs.
We then look at which of those QBs played for teams that are similar to the current Packers team and played against teams similar to the Detroit Lions. We end up with a huge array of similarities, each weighted by the strength of that similarity. Then, we look at the actual performances of those similar QBs in those similar games to project Rodgers' production. To project games, we take the individual player out of it and just look at similar teams playing against similar opponents.
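To make the idea concrete, here is a minimal sketch of a similarity-weighted projection. It is not numberFire's actual algorithm, which is proprietary: the `Comparable` class, the 0-to-1 similarity scale, and the choice to multiply the three similarities into a single weight are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Comparable:
    """One historical game by a similar player (hypothetical structure)."""
    player_sim: float    # similarity of the historical QB to the target QB, 0..1
    team_sim: float      # similarity of his team to the target's team, 0..1
    opponent_sim: float  # similarity of his opponent to the target's opponent, 0..1
    outcome: float       # what actually happened, e.g. passing yards


def weighted_projection(comparables):
    """Average the historical outcomes, weighted by combined similarity.

    Assumption: the combined weight is the product of the three
    similarities, so a comparable must be close on all three to count much.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for c in comparables:
        weight = c.player_sim * c.team_sim * c.opponent_sim
        total_weight += weight
        weighted_sum += weight * c.outcome
    if total_weight == 0:
        raise ValueError("no comparable games with non-zero similarity")
    return weighted_sum / total_weight


# Made-up numbers: a near-perfect match counts twice as much as a partial one.
history = [
    Comparable(1.0, 1.0, 1.0, 300.0),
    Comparable(0.5, 1.0, 1.0, 200.0),
]
projection = weighted_projection(history)
```

In this toy case the projection lands closer to 300 than to 200, because the better match carries more weight. Dropping the `player_sim` term would give the team-versus-opponent version used for projecting games.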
More important, however, is how we analyze these teams and players. In football, normal statistics, while widely used, can be extremely misleading. Take two separate third-down situations. On 3rd-and-15, a running back runs for 10 yards, and the team is forced to punt on 4th-and-5. On 3rd-and-1, a running back runs for two yards, converts a first down, and the drive continues. All we see in the statistics is 10 yards versus two yards, so the first running back looks better, when, in reality, the second running back made the bigger play.
Using our own internal metrics, we calculate efficiency on a situational and play-by-play basis. Over the course of a drive, game and season, we can more accurately gauge the true efficiency of players and teams. There is an old saying in computer science as well as in the test kitchens at Taco Bell: "Garbage in, garbage out." If we do not accurately evaluate players and teams, that means we are putting garbage into the algorithm. Thus, it is extremely important to use the most meaningful statistics that actually convey a team's strength or a player's ability.
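The two third-down plays described earlier show the gap between raw yardage and situational value. The sketch below uses a simple first-down conversion check as a stand-in; it is not one of numberFire's internal metrics, and the `Play` class and `converted` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Play:
    down: int
    yards_to_go: int
    yards_gained: int


def converted(play):
    """True if the play gained enough yards for a first down."""
    return play.yards_gained >= play.yards_to_go


long_run = Play(down=3, yards_to_go=15, yards_gained=10)   # leads to a punt
short_run = Play(down=3, yards_to_go=1, yards_gained=2)    # drive continues

# Raw yardage favors the first back; the situational view favors the second.
more_yards = long_run.yards_gained > short_run.yards_gained
kept_drive_alive = (converted(long_run), converted(short_run))
```

A real situational metric would grade every down and distance on a continuous scale rather than pass/fail, but even this binary check reverses the ranking that raw yards suggests.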
Using these methods, in Week 10's edition of ESPN Insider's Playoff Predictor, we accurately projected the AFC North would send three teams to the playoffs. This was met with much hostility as it meant teams like the New York Jets, Tennessee Titans and Buffalo Bills would miss out come January. There was a lot of public emotional investment in the Jets and disapproval of the young Cincinnati Bengals, both of which were filtered out by quantitative analysis.
We are not perfect, however. Being in the business of trying to get things right a little over 50 percent of the time still means we are wrong more than 40 percent of the time. Not everything in football can be captured in the data. There are 22 players on the field at once, all with separate goals, creating a huge interaction effect in the numbers. Injuries are exceedingly difficult to predict; a change in team dynamic is hard to pick out, as we want to use as much data as possible to ensure big enough sample sizes. The New York Giants were a completely different team entering the playoffs last season than they were in the regular season, but from a data perspective, we could only go off the information from earlier in the season. Except for the first round, we picked against the Giants throughout the playoffs because the data said they were the same team that lost twice to the Washington Redskins and once to a dismal Seattle Seahawks team at home.
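To put "a little over 50 percent" in perspective, here is a rough arithmetic sketch. It assumes the common pricing where a bettor risks 110 units to win 100, ignores pushes, and uses 1,000 picks as an arbitrary sample size; none of those figures come from our own records.

```python
def breakeven_rate(risk=110.0, win=100.0):
    """Win rate at which expected profit is zero under the assumed pricing."""
    return risk / (risk + win)


def expected_profit(win_rate, n_picks, risk=110.0, win=100.0):
    """Expected profit in units over n_picks, ignoring pushes."""
    wins = win_rate * n_picks
    losses = (1.0 - win_rate) * n_picks
    return wins * win - losses * risk


threshold = breakeven_rate()                 # roughly 0.524
profit_at_53 = expected_profit(0.53, 1000)   # 53 percent picker
profit_at_54 = expected_profit(0.54, 1000)   # 54 percent picker
gain_from_one_point = profit_at_54 - profit_at_53
```

Under these assumptions, each extra percentage point of accuracy is worth 2,100 units over 1,000 picks, and a picker below about 52.4 percent loses money despite being right more often than wrong.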
When it comes to projections, we look for even the smallest, most marginal edge. A one-percent edge means comparatively little on a single pick, but over time, that one percent adds up significantly. By using added insight from analytics, we gain that edge. Check us out at www.numberFire.com for more information and go to www.numberFire.com/RotoWire/ to be alerted when our picks come out for 2012!
numberFire is a sports analytics platform that uses algorithmic modeling to better understand sports. Follow Nik Bonaddio at @numberfire, and Keith Goldner at @drivebyfootball. Check out numberFire on Facebook at www.facebook.com/numberfire.