Instead of offering definitive answers to the country’s biggest questions, the 2016 election results provoke even larger ones. How could the forecasters, and the campaigns themselves, have been so wrong? What, and perhaps whom, did all the pollsters miss? Was there a late-breaking voter phenomenon that was hard to measure?
These are critically important questions, and investigations are already underway. We may find answers to many of them; we may find others where the evidence will not be conclusive. But we need to let the chips fall where they may, including on my plate as a pollster with a newer platform.
I’m confident in one fact: we canvassed enough people. At SurveyStud we interviewed thousands over the course of the campaign, more than any other public source and more than the campaigns themselves. Our data consistently got big things right, like Donald Trump’s outsized support in the Midwest and Rust Belt.
But there’s another thing that’s absolutely clear to me, and it’s a categorical failing of data wonks trying to grasp certainty through research: too often we are guilty of failing to embrace uncertainty — specifically, the uncertainty embedded in the data itself.
Even our electoral map contained clues that the prevailing narrative was wrong. In our final 50-state map, we had Clinton with only 257 solid Electoral College votes, shy of the 270 needed to win and down from the 307 we showed when we launched our daily tracking two weeks before Election Day. The rest were toss-ups. Our own data showed an open path for Trump. But the surface narrative lined up with the Clinton sweep that our own national numbers, and everyone else’s, were pointing to, from the New York Times to HuffPollster to the neuroscientist tapped as this year’s “Nate Silver.” Even a Trump senior adviser told CNN on Election Day that it would “take a miracle for us to win.” It all made the countervailing data points seem smaller than they should have.
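The arithmetic behind that map is worth spelling out: with 257 solid votes, Clinton needed only 13 more from the toss-ups, but Trump had many routes through them. A quick sketch makes the point; the toss-up state list and vote counts below are hypothetical, chosen for illustration (the only figure from our map is the 257):

```python
from itertools import combinations

# Hypothetical toss-up states and their electoral votes, for illustration
# only; the article gives just the 257 solid-Clinton figure.
tossups = {"FL": 29, "PA": 20, "OH": 18, "MI": 16, "NC": 15, "WI": 10}
CLINTON_SOLID = 257
NEEDED = 270

# Count the toss-up outcomes that put Clinton at or over 270.
wins = sum(
    1
    for r in range(len(tossups) + 1)
    for combo in combinations(tossups, r)
    if CLINTON_SOLID + sum(tossups[s] for s in combo) >= NEEDED
)
print(wins, "of", 2 ** len(tossups), "toss-up outcomes reach 270")
```

Under these hypothetical numbers almost every toss-up combination clears 270 for Clinton, which is exactly why a solid-vote total below 270 deserved more attention than it got: the margin of safety lived entirely in states that could go either way.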
Now, people are asking if they can ever trust data again. In fact, we need data more than ever.
We will understand our numerical misses only by getting more information, doing deeper data analysis, and committing to even more rigorous efforts to constantly challenge our assumptions. Since pride tripped us up, humility may prove a better guide.
Some of the big areas for further exploration throughout the polling industry are already clear:
Likely voter models. Whatever the true magnitude of the errors this year, estimating who will actually cast a ballot — a future, yet-to-exist population — has long been a weak part of survey research. In the end, this year’s surveys may have collectively done a really good job at registering voter preferences, but where we all seemed to have slipped is in adequately gauging intent. It’s one thing to support a candidate in one’s head, with a yard sign, or on a survey. It’s another thing entirely to cast an actual ballot for that candidate. We need better, more reliable ways to bridge the gap between attitudes and actions.
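To make the attitude-to-action gap concrete, here is a minimal sketch of the kind of turnout weighting a likely-voter model might apply. The 0–10 intent scale, the past-vote bump, and the field names are all hypothetical illustrations, not our actual model:

```python
def turnout_weight(stated_intent, voted_last_election):
    """Crude likely-voter weight: self-reported intent on a 0-10 scale,
    bumped up for past turnout. Cutoffs are hypothetical, for illustration."""
    w = stated_intent / 10.0          # self-reported likelihood of voting
    if voted_last_election:
        w = min(1.0, w + 0.2)         # past turnout predicts future turnout
    return w

def weighted_totals(respondents):
    """Candidate totals after turnout weighting.
    respondents: list of (choice, stated_intent, voted_last_election)."""
    totals = {}
    for choice, intent, voted in respondents:
        totals[choice] = totals.get(choice, 0.0) + turnout_weight(intent, voted)
    return totals

# Unweighted, this sample is tied 2-2; weighted by turnout, B leads.
sample = [("A", 9, True), ("A", 4, False), ("B", 10, True), ("B", 8, False)]
print(weighted_totals(sample))
```

The point of the toy example is that two samples with identical preferences can imply different outcomes once you model who actually shows up, which is where this year’s surveys seem to have slipped.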
Uncertainty estimates. We need better, more user-friendly ways to express the likely variability around polling estimates. More sophisticated poll consumers expect a “plus or minus” around our numbers, but even they latch onto the specific numbers we present. In a sports-dominated culture, numbers automatically become scores. A false sense of precision sets in when media narratives get created out of those numbers. For the most numerate out there, forecasters’ probabilities are a welcome and sufficient way to build in some doubt. Everyone else needs something more.
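For the simplest case, a simple random sample, the “plus or minus” is just a function of sample size. This sketch computes the textbook 95% margin of error; real polls carry design effects and non-sampling error that widen it, so treat this as a floor, not a promise:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion from a simple random sample
    of size n, at the worst case p = 0.5."""
    return z * math.sqrt(p * (1 - p) / n)

# A 1,000-person poll carries roughly a +/- 3.1 point margin of error,
# so a 48-45 race is statistically indistinguishable from a tie.
moe = margin_of_error(1000)
print(f"+/- {100 * moe:.1f} points")
```

Notice how slowly the margin shrinks: quadrupling the sample only halves it, which is one reason a 2-point “lead” reported as a score is a false sense of precision.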
We are working on all of these challenges — and have the tools to tackle them. At SurveyStud, we’ve built a methodologically rigorous, smartphone-based program for political polls that rivals the best surveys around. But surveys need to be better, and we have the tools to improve our data even more. We’re already moving to develop advanced adjustments to our samples to handle a range of non-response issues; we will leverage external data to make more robust likely-voter models; and we will make everything, including possible errors, more comprehensible.
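One standard family of non-response adjustments is post-stratification: weighting each demographic cell so it matches its known share of the electorate. A minimal sketch, with hypothetical cells and shares (our production adjustments involve many more variables):

```python
def poststratify(sample_counts, population_shares):
    """Per-respondent weights so each demographic cell matches its known
    population share.
    sample_counts: cell -> number of respondents in the sample
    population_shares: cell -> share of the electorate (sums to 1)."""
    n = sum(sample_counts.values())
    return {cell: population_shares[cell] * n / count
            for cell, count in sample_counts.items()}

# Hypothetical: non-college voters are half the electorate but only 30%
# of the sample, so each one must count for more than a college respondent.
weights = poststratify({"college": 700, "non_college": 300},
                       {"college": 0.5, "non_college": 0.5})
print(weights)
```

The weakness this addresses is exactly the non-response problem: if the people who refuse to answer differ systematically from those who do, an unweighted sample is biased no matter how large it is.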