Why College Football Rankings Are Terrible

The only constant with the college football rankings is that someone always believes they are terrible. Every Sunday (and now Tuesday), there is some argument about the rankings. The rankings are supposedly useless in today’s game, with the selection committee being the final source of truth, but are they? The initial rankings demonstrated that public perception (as reflected in the polls) plays a significant role in the rankings that the committee puts out.

Currently there are two major types of polls: human and computer. Human polls involve a set of people who are asked to fill in their top 25 teams each week; these ballots are then compiled into a composite top 25. Computer polls are mathematical formulas that rank all teams based on their on-field performance using statistical weightings and algorithms.

Both sets of polls have the same problem, though: whatever gets weighted limits the ability of the polls to be unbiased. The biggest problem, however, is that the sample size is terrible. There is no way to truly rank teams on such limited stats without being willing to try something outside of the norm. Yet all of these polls try to rank these teams the same way we rank NFL teams (which have a much better sample) or college basketball teams. So here’s a breakdown of the major problems with the polls and why they are naturally going to have some sort of bias (the statistical sort).

Arbitrary Tiers

So we have established some sort of arbitrary tiers for the programs. In the FBS we have drawn a line of demarcation between the P5 schools and the schools in the Group of Five. This aligns teams to conferences as if the conference were really about athletic accomplishment and not about generating revenue. Then you have the FCS schools, which have their own set of sub-tiers. And this is all just Division I. On top of that, public perception has its own ranking of the conferences within the P5 and the Group of Five. All of this creates an insurmountable bias before the first ball is passed. These tiers are the first step in skewing everything about football, because suddenly there is a gap between any two teams on day one.

Bad Sample Size

There are currently 128 FBS teams, ten conferences, and four teams that have no conference affiliation. Of these four, Notre Dame has a minor affiliation with the ACC that involves playing five ACC schools each year (the ACC being one of the P5 conferences). That is a lot of teams, and so a lot of games to compare. Here’s the problem with that.

Each of the 128 teams plays 12 regular season games (except teams that travel to Hawaii, which can play 13, but we will leave that out as it’s small compared to the total sample). This gives us a total of 1536 regular season games, counting each game once for each team that plays it. Each team (save the independents) plays at least eight conference games (124 * 8 = 992 conference games). Notre Dame plays five “conference games” (992 + 5 = 997), and each of the 12 teams in the Pac-12 and 10 teams in the Big 12 plays nine conference games, one more than the eight already counted (997 + 12 + 10 = 1019).

So what this means is that 1019 of the 1536 games are involved in conference cycles. That’s just over 66%. For a comparative analysis of all teams, you have to remove the cycles. That leaves 517 games, or about 4 per team. It’s impossible to draw any sort of valid conclusion from this, especially when you consider that some of these games are against FCS or Division II schools. That sample is so small that even a dream situation of completely random distribution of those games would not establish much separation amongst the top 60+ teams.
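
If you want to check that arithmetic yourself, here is a minimal Python sketch. The constant and function names are made up for illustration; the counts are the ones quoted above, and a game is counted once per team, the same way the paragraphs above count it.

```python
# Counts quoted in the text; all names here are illustrative only.
TEAMS = 128
GAMES_PER_TEAM = 12
INDEPENDENTS = 4
BASE_CONFERENCE_GAMES = 8      # minimum conference games per conference team
NOTRE_DAME_ACC_GAMES = 5       # Notre Dame's yearly ACC opponents
NINE_GAME_TEAMS = 12 + 10      # Pac-12 and Big 12 teams play one extra


def conference_share():
    """Tally conference vs. non-conference games, counted once per team."""
    total = TEAMS * GAMES_PER_TEAM
    conference = (TEAMS - INDEPENDENTS) * BASE_CONFERENCE_GAMES
    conference += NOTRE_DAME_ACC_GAMES
    conference += NINE_GAME_TEAMS  # one additional game for each of these teams
    remaining = total - conference
    return {
        "total": total,
        "conference": conference,
        "share": conference / total,
        "non_conference": remaining,
        "non_conference_per_team": remaining / TEAMS,
    }


result = conference_share()
print(result)
```

Change any of the constants (say, a conference moving to nine league games) and you can see how quickly the already small pool of comparable games shrinks.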

Confirmation Bias

Human polls are obviously going to have confirmation bias. If someone believes that Team A is the best team in the nation, and then Team B wins head to head, obviously Team B is the best team in the nation and Team A certainly wasn’t overrated. This is just an obvious problem with human judgment when comparing teams. It’s also apparent when people tweak their computer ranking or algorithm: there was something they didn’t feel was correct, and they are “fixing” it.

The other problem is that many of these systems can be gamed. Most commonly, it’s better to beat a team you should beat than to compete against a team that you shouldn’t. For instance, many rankings will give Florida State much more credit for beating The Citadel than they will give Arkansas for losing to Mississippi State by only 7. Winning is what matters most; after that, it’s important that the teams you play win. It becomes a necessity for neither you nor your opponents to schedule possible losses, because everyone benefits that way.

Also, when the computers consistently turn out the results that people want, they get used and their standing is at its highest, but once they fall out of the graces of perception, people move on to the next hot ranking. I’ve seen people go from F+ to Massey to Sagarin to FPI and more over the course of a single season. You can always find an algorithm that will support your argument if it has any basis.

“Eye” Test

This is the most frustrating thing about the polls and the computers. First off, computers can’t watch the game, so suggesting that they can dole out the “eye” test is silly. Because a computer can’t watch the game and interpret it for itself, it must depend on the tweaking of an algorithm and on data entry. This often comes down to someone deciding which stats matter and which matter most. As far as I know, none of these computers use an artificial neural network, so they are incapable of learning on their own from past experiences.

The pollsters cannot watch all of the games, but they are expected to know enough about them. Someone asked how I know they don’t watch all the games, and it’s actually pretty easy. There are 15 weeks of football to cover the 1536 games mentioned above. Each of those games involves two teams, so we must cut that in half to 768 contests over the course of the season (assuming no one played outside the FBS). That leaves about 51 games for any given week. If you assume a generous five weekday games, you have 46 games on a Saturday. The first game kicks off at 11 a.m. on Saturday (CST) and the polls are usually out by noon the next day, a window of 25 hours. Even if you could speed the games up as much as you needed, a person would have to watch each game in about 33 minutes, back to back, to finish in time to turn in their poll.
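
The same back-of-the-envelope math can be written as a short Python sketch. The names are hypothetical and the inputs are just the figures from the paragraph above, including the assumption of five weekday games and a 25-hour window.

```python
# Figures quoted in the text; all names here are illustrative only.
TEAM_GAMES = 1536      # regular season games, counted once per team
WEEKS = 15
WEEKDAY_GAMES = 5      # the "generous" weekday assumption from the text
HOURS_AVAILABLE = 25   # 11 a.m. Saturday to noon Sunday


def saturday_viewing_budget():
    """Return (Saturday games, minutes available per game)."""
    contests = TEAM_GAMES // 2                  # two teams per game -> 768
    per_week = contests // WEEKS                # 768 / 15 = 51.2, rounded down
    saturday_games = per_week - WEEKDAY_GAMES   # 46
    minutes_per_game = HOURS_AVAILABLE * 60 / saturday_games
    return saturday_games, minutes_per_game


saturday_games, minutes = saturday_viewing_budget()
print(saturday_games, round(minutes))
```

And that budget leaves no time for sleeping, eating, or actually filling out a ballot.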

This means that many pollsters watch a subset (chosen by region, by traditional powerhouse, or by the current polls) and then try to judge all of the teams based on the highlights and the few games they do watch. The “eye” test is the ESPN/FS1 test. It’s impossible to bring a good set of information to the polls when you expect people to judge teams they haven’t even watched. Several times pollsters have commented on not watching teams in the top 10. Somehow they know what those teams are worth?

Preseason Polls

It’s crazy to even pretend to think we know what the season will look like before it unfolds. At the beginning of this year South Carolina was a top ten team and Mississippi State wasn’t even ranked. However, these polls serve as the basis of argument later in the season. When it comes to the end of the season, and the resumes are being matched up, it will be said that Mississippi State beat #6 Texas A&M, who beat #9 South Carolina. It won’t matter that South Carolina isn’t playing in a bowl or that Texas A&M struggled with UL Monroe, because both were inflated early in the season and those inflated rankings are the ones that get quoted. The preseason poll serves as a vehicle for confirmation bias and weak arguments, because its uselessness does nothing to stop “experts” from treating it as valid.

This is why we are moving away from the polls. It has nothing to do with the format of the postseason and everything to do with the absolute lack of precision in these methods. These things make for a really bad day in football and ensure that these polls cannot be accurate. Is it even possible for 5 or 6 of the top 15 teams in the nation to be from the same division of a single conference, or is it simply a product of the problems with the small dataset?

Really, college football has more natural tiers than rankings. Rarely is the #4 team far and away better than the #12 team, nor is there usually a large gap between #1 and #5 (there are exceptions to this rule, like 2013 FSU and 1995 Nebraska). But that’s why we love college football. It would really make more sense to group teams into tiers by some sort of criteria and then rank within each tier by a second set of criteria (like beating teams in certain tiers). However, scheduling and conferences are not about quality football or opponents; they are all about money. Network coverage and chest thumping are all about money as well. The polls are another way to drive revenue into the NCAA’s pocketbooks.

Sadly, we continue to use useless statistics as the basis of arguments as if they were facts.