Column: Analytical Failures in the 2012 Election Results with Implications for Other Topics of Importance

Thoughts from a – hopefully – Enlightened Skeptic
By Ralph E. Chapman

To listen to the pundits before the election you would think that anything could have happened last Tuesday when we all went to the polls.  

The mainstream media was emphasizing just how close the election was going to be and that Mr. Obama was potentially in grave danger of becoming a one-term president.

If you listened to the conservative pundits you were assured that good, as they define it, would win out over evil; some suggested it would be a rout with Gov. Romney getting over 300 electoral votes.

As an aside, I find it interesting that their implication was that this outcome would provide a clear mandate for their policies but I now hear from them that the opposite is not true because it was a “close” election.

A quick note about my views, as all comments are meaningless without the context of the person who makes them. I am a paleontologist and technologist who specializes in the analysis of data, especially that encountered while doing research in the sciences within the umbrella of natural history.

The data encountered within natural history is generally very difficult to deal with, especially the historical data found in paleontology, anthropology, archaeology and geology. It is often very noisy and variable and you have to deal with the complexities introduced by time and a constantly evolving environment.

I did this for the Smithsonian for almost 20 years and, subsequently, taught graduate classes at Idaho State University in the use of these methods for just under five years. I’m comfortable with historical data and find the challenge of dealing with it to be something that gives me great satisfaction.

Politically, I am a true centrist. I grew up a New England Republican, something that is apparently extinct now (ironic, given that I am a paleontologist). When I vote, I generally split my ticket and go for the people I think will do the best job. I did this on Tuesday.

Now on to the election. The data you get from polls are, in many ways, very reminiscent of paleontological data. When you start an analysis you have to understand that the data are, inevitably, biased and imperfectly collected from a constantly changing time series.

Every political poll is imperfect in its own way. Most polls are heavily biased towards calls to telephone landlines, for example, rather than cell phones. I think anyone with a cell phone is happy not to be bothered but it does bias the sampling a good deal – often towards older people, those who stay at home during the day and early evening, and those less interested in technology.

If one candidate is more heavily supported by these people, he or she will get an overly high representation in the results. All polls are biased towards those who are willing to answer the questions in the first place and, I suspect, we all can come up with ideas on what kinds of bias can come in from that type of filter.

Busy people tend not to have time to stay on the line. Old New Englanders typically don’t want to tell their business to strangers over the phone.

So, the relevant poll data are biased. How do we minimize the effect of that? Well, the first way to do it is to use data from lots of polls with the hope that the biases will, at least partly, cancel out.

This is where the media tend to fail miserably. How many times did you hear from a news outlet this year something like this (I just made this one up; you can change the position of the names): “Well, Romney is now 2 points ahead as a new poll shows he has 51 percent to Obama’s 49 percent.”

They often do this even if 12 other polls taken at the same time showed the opposite numbers. The point is not that you shouldn’t show the results of this one poll, but that you should show all the available data; that is what data-savvy people look for and what competent journalists do.

You have to look at poll data from multiple sources as a distribution and draw conclusions, in part, from the whole picture and not by cherry-picking one poll that gives the answer you are looking for. To do otherwise is to over-emphasize that one poll and, if done consciously, shows evidence of ineptitude and/or dishonesty.
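To make the "distribution, not one poll" point concrete, here is a minimal sketch in Python. The numbers are invented purely for illustration (they mirror the made-up 51-to-49 example above, with 12 polls showing the opposite result), and the function name `summarize` is hypothetical, not anything used by a pollster or by any real aggregator.

```python
from statistics import mean, median

# Hypothetical margins in points (Obama minus Romney). These are made-up
# numbers: 12 polls show Obama +2, one outlier shows Romney +2.
margins = [2.0] * 12 + [-2.0]


def summarize(margins):
    """Describe a set of poll margins as a distribution, not a single number."""
    n = len(margins)
    return {
        "n": n,
        "mean": mean(margins),
        "median": median(margins),
        # Fraction of polls in which the first candidate leads.
        "share_leading": sum(m > 0 for m in margins) / n,
    }


summary = summarize(margins)
print(summary)
```

Reporting only the single outlier would suggest a 2-point Romney lead, while the pooled summary shows a lead for Obama in 12 of the 13 polls. A real aggregate would also weight polls by sample size, recency and known house effects, none of which this sketch attempts.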

The next component to understanding the polls is to realize they provide historical data – data taken through time. Each poll, with its inherent biases, is a time-series of data taken within a consistent methodology.

Time series data have a trajectory and in that trajectory there is understanding that can be gleaned. To give a paleontological example, let’s say a group of animals went extinct at time Y. The interpretation of the cause of the extinction will vary greatly with whether the group was expanding up until time Y, or was in decline before it.

The former would suggest something more catastrophic may have happened, the latter perhaps that a longer-term cause was potentially involved. As an aside, the scientist in me hates simplistic examples like this and I can come up with lots of arguments for many other options in addition to what I just said.

But you get the point. So polling data have this time trajectory, and it gives those trying to interpret the numbers an additional dimension for understanding what has happened and what is going to happen. I have always tried to use this dimension through the years and it has seldom failed me; I’ve generally known what was going to happen before the voting, even in close races.
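One simple way to put a number on a trajectory is a least-squares trend line through a single poll's successive readings. The sketch below is only an illustration under stated assumptions: the two series are invented, the helper `trend_slope` is a hypothetical name, and a straight line is the crudest possible model of how a race moves.

```python
def trend_slope(days, margins):
    """Ordinary least-squares slope: change in margin (points) per day."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(margins) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, margins))
    return sxy / sxx


# Hypothetical weekly readings from one poll with a consistent methodology.
days = [0, 7, 14, 21, 28]
rising = [0.0, 0.5, 1.0, 1.5, 2.0]   # lead has been growing
falling = [4.0, 3.5, 3.0, 2.5, 2.0]  # lead has been shrinking

print(trend_slope(days, rising))   # positive slope
print(trend_slope(days, falling))  # negative slope
```

Both invented series end at the same 2-point lead, yet they point in opposite directions, which is the same distinction as a group that was expanding versus one that was already in decline at time Y.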

When I was trolling the web to find an easy way to get these data for this year’s election, I happened to run across Nate Silver’s column from the New York Times entitled FiveThirtyEight: Nate Silver’s Political Calculus. Here, he published almost daily all the data I was looking for and I’ve been able to evaluate it to my heart’s content.

Just as wonderful, he provided his interpretation of the data using multiple polls and their time trajectories and that gave me a dialectic for maturing my interpretation even more. I didn’t always agree with what he said, but the data were there.

Reading successive columns even gave me a longer time fetch to interpret the data. There were many comments made by people disagreeing with his interpretations and even, at times, vilifying him for his biases (biases that we all have).

Nate Silver really didn’t play the political game, however, because he presented all his data and let others point out where they thought he was wrong. They never seemed to be able to make much of a strong argument against what he said, and for good reason. As another aside, I think Nate Silver would have made a good paleontologist.

So late afternoon Tuesday, I already knew what was going to happen because the data were incredibly strong and only massive election fraud would derail things. I told my lovely bride that Mr. Obama was going to get a minimum of between 305 and 310 electoral votes but personally figured there was about a 70 percent chance he would be in the 330s depending on what the enigma of Florida did.

When the news shows started, I bounced around the channels and Internet to see what was being said and was amazed at what I was hearing. Conservative pundits were saying things that made absolutely no sense given the data. I would have expected them to at least prepare their listeners by hinting that, at best, it would be a real tight race.

Instead, they set their people up for a big fall. I have no problem with the mainstream media being very conservative before Tuesday night so as not to affect the election by making it easy for people to justify staying home and not making the effort to vote (anyone who has watched Dancing With The Stars knows the cost of complacency, as undeserving dancers have been sent home).

However, their comments were always overly sensationalistic about the closeness and it was clear they were mostly trying to keep viewership up. I guess that is the nature of the beast.

The problem is the failure of the conservative pundits. Some of the political implications of this have already been discussed in post-election analyses (see, for example, websites such as http://www.theatlantic.com/politics/archive/2012/11/how-conservative-media-lost-to-the-msm-and-failed-the-rank-and-file/264855/).

The conservative outlets and their pundits failed their viewers, plain and simple. It is important to understand, by the way, that had the data been the same except with the trends going for Gov. Romney, there is no doubt that Nate Silver would have interpreted it that way. I would have as well, as I respect data; I’ve analyzed too much of it not to.

However, there is another implication that comes from this that is very important. As I said, these pundits were incredibly off the mark, including George Will and some others who should have known better.

To be that far off has two potential causes. First, they and their staffs may just be awful at looking at such data and subconsciously they just saw what they wanted to see. Or, second, they purposely misrepresented the data for whatever reason they might have. Neither says anything good about them, and both argue for their listeners to find better options, starting yesterday.

The big problem is that these are the very same people who often weigh in on other important topics such as climate change, evolution, pollution, medical matters, the interpretation of ecological data, etc. All of these topics are incredibly important to our society and have large, long-term effects not only on policy decisions but, at times, on the very well-being of our citizens.

All these topics also require the use of rigorous and informed means of analyzing and interpreting the available data – and very often it is historical data such as polling data. There is no reason to believe these pundits will treat these data any better, or be better at understanding what they really mean.

In fact, if you look into their record on these matters it is as bad as or worse than their abysmal record on this year’s election. The cherry-picking and distortion of data are rampant. Scientists are constantly misquoted or quoted out of context in a way that is the opposite of their obvious intent.

People deserve better than this and it is time for many to find new and better sources of information. It is, especially, time for the media to stop giving these charlatans a free platform to spew their misinformation and/or misunderstanding.

There is always room in science for an informed debate based on facts; we just need better people providing the important counterpoint to the general consensus, and many are out there. They just tend to be more introspective and often are willing to work towards a better consensus about what the data show.

Of course, this does not make for “good TV” because there is no rancor. We don’t need “good TV,” however, but need good information and analysis from many directions to be available so that we can make better informed decisions as a society. It is time that media outlets and their journalists stop being lazy and better serve the people they report to.
