As American as apple pie, pie chart, that is.
On September 14, I published the poll chart below on the so-called national ballot for the House of Representatives from Pollster.com. It appears obvious to any observer that the Democrats (indicated by the blue trend line) were losing ground as fast as the Republicans were gaining it. But after looking inside the data, getting out the ol’ Excel spreadsheet, and doing some analysis of my own, I realized that the national poll was missing some key factors.
For one thing, the national poll aggregate is made up of individual state race polls and then computed using specific criteria applied by Pollster.com (the old adage that all races are local races is true). I also knew that the aggregate contained data compiled from a wide range of methodologies, as well as polls that were directly tied to political parties, which, even with the best intentions, often introduce biases into their questions that favor positive responses for their candidates.
Here is the national aggregate House poll chart from September 14:
For the purposes of our discussion, ignore the earlier results. Just look at the roller-coaster for both parties from January 2010 through the present. May appears to be the moment of truth for the Republicans: they continued to increase in a nearly linear fashion from then on, while the Democrats declined on a similar downward slope.
Not so Fast!
In my previous post on this topic, titled Where the Wild Thing Are, I made this comment:
If this chart was the only one you looked at you would conclude that the Republicans have made huge gains beginning about May 2010 and now hold a 47.1% to 40.6% lead over the Democrats. And you would be wrong. Something is missing. First of all what about the undecided voters? Where are they? How many of them are there? What is their trend? For that answer, click here.
In that post, I then led the reader through the process of using the Pollster.com User Tools to come up with a much different looking trend line because it eliminated all the polls that either were of questionable reliability or directly tied to a political party.
On September 26, I spent an evening working on generating some of my own statistics using the polling results from the Pollster.com website. Here is my Excel chart of the aggregate data, all polling groups included. My results came out at 47% for the Republicans and 44% for the Democrats:
By carefully watching the movement of the poll results and tracking the changes in the gaps, I became more convinced that the trends were changing and that the Democratic candidates might be gaining, although I could not estimate by how much. One likely factor appeared to be the end of the primaries and the results from those races. Ignoring the pundits and the party-motivated spokespeople, I wanted to see what trend was emerging. It was time to fire up Excel, do a bunch of number crunching, and run the results through the Chart Wizard. Except the new Excel doesn’t really have a Chart Wizard; fortunately, I know how to build the charts, having done it several thousand times in one form or another of Excel since 1992.
At this point, if you want to read the wonkish discussion and statistical analysis, you can go to that page by clicking here.
The trend in the chart above confirmed my gut. There had been an upturn for the Democrats but also for the Republicans. One limitation of every chart is that the reader must still decide what it means. A trend line, in this case a “moving average,” does give one a picture of change, but does not communicate what is pushing the change. The meaning, in one sense, is secondary. I was interested in the trend because the dynamics pushing the trend begin with individuals. And as I pointed out in my post, The Black Poll Wars, Part II, the concept of one person, one vote no longer accurately describes the inner process of the American voter. Rather, a theory I dubbed “isovoting” is based on the assumption that,
The transformation of the vote into a compilation of isovotes [that is, the subpersonal meaning the person assigns to different issues that must be reasoned into a single vote on the ballot] is the key to understanding the American Electorate…The Uncertainty Principle [as defined by Heisenberg] shows that the isovotes cannot fit the Classical Statistical models for voter behavior. Like quarks in atoms, isovotes behave in dynamic ways that cannot be predicted with certainty either before or after they are observed, and that the very behavior of the survey taker will have a direct affect on the nature of the isovotes, especially with regard to the person assigning meaning to them, creating a new future for that person’s set of isovotes that did not exist prior to being polled on his or her preferences.
In short, the uncertainty naturally built into the isovote process each person goes through when voting is too complex to discern, and components within the isovotes can change, sometimes affecting the others and sometimes not. Therefore, following the trend becomes the only reliable way to ascertain what may take place on November 2nd.
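For readers who want to see what a “moving average” trend line does with raw poll readings, here is a minimal Python sketch. It is not the Excel workbook behind my charts; the function name, the window size, and the poll numbers are all made up for illustration.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Start with the sum of the first `window` readings, then slide forward
    # by adding the newest reading and dropping the oldest one.
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical poll readings (percent), ordered by date. NOT real Pollster.com data.
hypothetical_dem = [42.0, 44.0, 40.0, 43.0, 45.0, 44.0]

# A 3-poll window is an arbitrary choice for this example.
trend = moving_average(hypothetical_dem, window=3)
print(trend)
```

A wider window gives a smoother line that reacts more slowly to recent polls; a narrower one is more sensitive but noisier. That trade-off is why two charts built from the same polls can appear to tell different stories.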
That trend is beginning to emerge, but with caveats discussed below:
This scatter plot with the trend line covers the same length of time as the first chart in the post, so neither of them is as sensitive in representing the change over the past two months as the second chart, which I built using Excel. Unfortunately, the Flash version of the Pollster.com chart cannot be copied onto this post. However, you can look at the same time frame, with all polling organizations represented, by clicking here. The gap between the two parties has shrunk, with the Republicans at 44.2% and the Democrats at 42.8%.
The results get even more interesting, though, when you eliminate the less statistically reliable polls (by which I mean the internet polls and the robocall polls: with the first it is difficult to ensure true randomization, and with the second it is easier to lie to a computer voice asking the questions than to a real interviewer).
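The filtering step can be sketched in a few lines of Python. This is only an illustration of the idea: the records, field names, and numbers are hypothetical, and the plain average here is a stand-in, not the method Pollster.com uses to draw its trend lines.

```python
from statistics import mean

# Hypothetical poll records; every name and value here is illustrative only.
polls = [
    {"pollster": "A", "mode": "live_phone", "partisan": False, "dem": 46.0, "rep": 44.0},
    {"pollster": "B", "mode": "internet", "partisan": False, "dem": 41.0, "rep": 47.0},
    {"pollster": "C", "mode": "robocall", "partisan": False, "dem": 42.0, "rep": 48.0},
    {"pollster": "D", "mode": "live_phone", "partisan": True, "dem": 49.0, "rep": 41.0},
    {"pollster": "E", "mode": "live_phone", "partisan": False, "dem": 45.0, "rep": 45.0},
]

# Survey modes treated as less reliable in this post.
EXCLUDED_MODES = {"internet", "robocall"}


def filter_polls(records, excluded_modes=EXCLUDED_MODES, drop_partisan=True):
    """Keep polls whose mode is not excluded and, optionally, that are not party-tied."""
    return [
        r for r in records
        if r["mode"] not in excluded_modes and not (drop_partisan and r["partisan"])
    ]


def average_shares(records):
    """Return the (Democratic, Republican) mean percentages across the records."""
    return mean(r["dem"] for r in records), mean(r["rep"] for r in records)


all_dem, all_rep = average_shares(polls)
kept = filter_polls(polls)
kept_dem, kept_rep = average_shares(kept)
print(f"All polls:      D {all_dem:.1f}  R {all_rep:.1f}")
print(f"Filtered polls: D {kept_dem:.1f}  R {kept_rep:.1f}")
```

With these invented numbers the lead flips once the excluded polls are dropped, which is the kind of shift the filtered chart shows. Whether real data behaves that way depends entirely on which polls are in the pool.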
Bungee Jumping With the Polls
The trend using this second set of criteria can be viewed by clicking here. The trend lines have now crossed, with the Democrats taking the lead by 45.8% to 44.5%. But whether this set of percentages is really good news for the Democrats depends on three factors. First, how many people are registered as Democrats and will vote as faithful members of the party. Second, how many of those individuals will actually vote in the election. And third, the most difficult question to answer: how many people who are not Democrats, whether they identify themselves as Independents or are Republicans who plan to cross party lines, will vote Democratic. These caveats are not difficult to identify, but the subtleties of the trend, which is always in flux, are much harder to read. Therefore, it is possible that the party with the upper hand in the poll percentages may still end up losing more races than it wins.
Using the considerable resources of Pollster.com and the Gallup polling organization, we can come up with some interesting speculation about the coming election. For example, we know roughly how many people are going to vote: 46.8 million Democrats and 46.4 million Republicans, a total of 93.2 million voters. The percentage difference is 50.2% (D) to 49.7% (R). That’s a tiny difference of only 466,000 voters on a national scale. But that analysis is not actually correct, because these numbers represent the categories of voters (D, R, and I) that will vote either Republican or Democratic. Tucked inside the parties’ totals are 14.7 million Independents who will vote Democratic and 18.9 million who will likely vote Republican in this election. That is a much larger gap of 4.2 million voters in favor of the GOP. Another factor we can look at is registered voters, who are more likely to vote; numerous polls distinguish between registered and likely voters. I analyzed the polls that interviewed registered voters and came up with 31 surveys. Plotting those surveys, I came up with the following chart:
Note: for an explanation of the R² number, please click here
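Returning for a moment to the turnout figures quoted before the chart, the arithmetic is easy to check with a few lines of Python. The inputs are the rounded figures from the text (in millions), so the results land close to, but not exactly on, the numbers quoted there; I am assuming the difference comes from rounding in the underlying totals. The variable names are my own.

```python
# Rounded turnout estimates from the text, in millions of voters.
dem_voters = 46.8
rep_voters = 46.4

total = dem_voters + rep_voters            # about 93.2 million
dem_share = 100 * dem_voters / total       # percent of the two-party total
rep_share = 100 * rep_voters / total
party_gap = dem_voters - rep_voters        # in millions

# Independents counted inside each party's total, in millions.
ind_dem = 14.7
ind_rep = 18.9
ind_gap = ind_rep - ind_dem                # in millions, in the GOP's favor
ind_rep_share = 100 * ind_rep / (ind_dem + ind_rep)

print(f"Two-party shares: D {dem_share:.1f}%  R {rep_share:.1f}%")
print(f"Overall gap: {party_gap:.1f} million voters")
print(f"Independent gap: {ind_gap:.1f} million voters")
print(f"Republican share of these Independents: {ind_rep_share:.2f}%")
```

The point of the exercise is the contrast: the overall gap is a fraction of a million voters, while the gap among Independents alone is several million.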
UPDATE: Since I wrote the post I came across this very illuminating article on the issue of choosing “likely voters” in contrast to “registered voters” as the survey sample on the Huffpost Pollster (The Huffington Post has just acquired Pollster.com and integrated its sites into Huffington’s), by Mark Blumenthal (who originally founded Pollster.com). I recommend you read through his article. He gives a nicely framed explanation of how pollsters choose who to survey and it is written for the general reader: “Likely Voters: How Pollsters Define and Choose Them.”
After reading Blumenthal’s article, I recalibrated the filters on the National Congressional Ballot on the Pollster site to include only those polls that surveyed registered voters. To see the result, click here. The results contradict the unfiltered chart that shows the Republicans up by over 7 points. Instead, looking at registered voters (who, I hold, still make up the highest percentage of all voters), the Republicans hold the thinnest of margins at 45.3% over the Democrats’ 45.0%. Statistically speaking, this is a virtual tie.
Is the trend line good news for the Democrats? Yes and no. Any time one party gains ground and passes the other in the number of people who say they will vote for them, that is cause for at least cautious optimism. But the visual slope of the lines in the chart above does not indicate that the Republicans have begun to slump dramatically. A week from now, they could just as easily have stopped the small downward slope and recovered to move above the Democrats again. The positive factor for the Democrats is that the R² of their trend is significantly stronger than the Republicans’. In other words, it may be evidence of more “oomph” behind the upward change in direction.
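To make the R² comparison concrete, here is a small Python sketch that fits a straight line to a series of poll readings and reports how much of the variation the line explains. I am assuming a simple linear trend line, which may not match the trend type used in the chart, and the two data series are invented to show a strong fit next to a weak one.

```python
def linear_fit_r_squared(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept, r_squared)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("R squared is undefined when either series is constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation the fitted line fails to explain.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical series (poll index, percent). NOT the 31 surveys in the chart.
days = [0, 1, 2, 3, 4]
steady_rise = [40.0, 41.0, 42.0, 43.0, 44.0]   # every point falls on a line
noisy_rise = [44.0, 46.0, 43.0, 47.0, 45.0]    # slight rise buried in scatter

slope_steady, _, r2_steady = linear_fit_r_squared(days, steady_rise)
slope_noisy, _, r2_noisy = linear_fit_r_squared(days, noisy_rise)
print(f"steady: slope {slope_steady:.2f}, R squared {r2_steady:.2f}")
print(f"noisy:  slope {slope_noisy:.2f}, R squared {r2_noisy:.2f}")
```

Both invented series slope upward, but the line explains nearly all of the first and very little of the second. A higher R² says the polls hug the trend line more tightly; it says nothing by itself about how steep the trend is or whether it will continue.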
We are down to three weeks and counting. The fun continues unabated.