Monday, November 7, 2016

The US Presidential election is tomorrow. Here's the final run of my Monte Carlo Presidential Election Simulator (code here):

Election 2016 Monte Carlo Simulation
Run date: Monday Nov 7 2016 14:59:07
There are 1 days until the election. 
Collecting survey data for the great states of NE, ND, MS, ID, SD, IN, MA, KY, LA, AR, RI, MT, OR, GA, AL, OH, MI, IA, HI, DC, PA, WI, MD, CT, FL, NM, WV, UT, KS, CO, TX, NJ, DE, OK, MO, SC, NV, AZ, NY, VT, ME, IL, TN, WY, VA, WA, AK, MN, NH, CA, NC. 
Swing States:
Probability of Clinton winning OH: 10.44%
Probability of Clinton winning IA: 0.02%
Probability of Clinton winning PA: 99.72%
Probability of Clinton winning WI: 97.13%
Probability of Clinton winning FL: 88.54%
Probability of Clinton winning CO: 99.84%
Probability of Clinton winning NV: 69.71%
Probability of Clinton winning VA: 100.00%
Probability of Clinton winning NH: 100.00%
Probability of Clinton winning NC: 99.94%
12 states have no polls, so were assigned 2012 outcomes 
Clinton election probability: 99.97%
Trump election probability: 0.03%
Average electoral votes for Clinton: 323
Average electoral votes for Trump: 215
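
For readers who want to see the mechanics, here is a minimal Python sketch of the general technique, not the simulator's actual code. It treats each swing state as an independent coin flip using the probabilities printed above and adds up electoral votes over many trials. Two things in it are assumptions made only for illustration: the names (SWING_PROBS, SAFE_CLINTON_EV, simulate) are hypothetical, and every state outside the swing list is assumed to repeat its 2012 result, which gives Clinton a fixed base of 217 electoral votes (Obama's 332 minus the nine listed swing states he carried). Because of that simplification, the numbers it prints should be close to, but not identical to, the run above.

```python
import random

# Clinton win probabilities copied from the swing-state output above.
SWING_PROBS = {
    "OH": 0.1044, "IA": 0.0002, "PA": 0.9972, "WI": 0.9713, "FL": 0.8854,
    "CO": 0.9984, "NV": 0.6971, "VA": 1.0000, "NH": 1.0000, "NC": 0.9994,
}

# Electoral votes for each of those states in the 2016 election.
ELECTORAL_VOTES = {
    "OH": 18, "IA": 6, "PA": 20, "WI": 10, "FL": 29,
    "CO": 9, "NV": 6, "VA": 13, "NH": 4, "NC": 15,
}

# ASSUMPTION (illustration only): all non-swing states repeat their 2012
# outcome, which leaves Clinton with a fixed base of 217 electoral votes.
SAFE_CLINTON_EV = 217
EV_TO_WIN = 270


def simulate(n_trials=100_000, seed=0):
    """Return (Clinton win rate, mean Clinton electoral votes) over n_trials."""
    rng = random.Random(seed)
    wins = 0
    ev_sum = 0
    for _ in range(n_trials):
        ev = SAFE_CLINTON_EV
        # Each swing state is drawn independently of the others.
        for state, p in SWING_PROBS.items():
            if rng.random() < p:
                ev += ELECTORAL_VOTES[state]
        ev_sum += ev
        wins += ev >= EV_TO_WIN
    return wins / n_trials, ev_sum / n_trials


win_rate, mean_ev = simulate()
print(f"Clinton election probability: {win_rate:.2%}")
print(f"Average electoral votes for Clinton: {mean_ev:.0f}")
```

The fixed seed only makes the run repeatable; changing it or the trial count should move the results slightly, which is the usual Monte Carlo noise.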

The results have been consistent throughout the election season. At no point did the simulation give Trump a meaningful probability of winning. Florida has moved back and forth; as of today, it is strongly predicted to go to Clinton.

One note: this simulation has consistently predicted greater certainty of a Clinton win (99.97%) than other Electoral College simulators, such as FiveThirtyEight's Election Forecast (69.2%) and Daily Kos Election 2016 (88%). It turns out there's a reason for that: some of these sites put their finger on the scale!

From this article, "How bad is it for Donald Trump? Let's do the math":

These histograms—and the chances of Clinton winning—are different from what each model is actually reporting as their national-level forecast because, like us, none of the other forecasters assume that state election outcomes are independent. If the polls are wrong, or if there’s a national swing in voter preferences toward Trump, then his odds should increase in many states at once: Nevada, Ohio, Florida, and so forth.

In other words, the data and the modeling give a clear result, but the forecasters adjust it because, they say, the separate state models shouldn't be treated as independent. So they let each state's estimate be influenced by the results in other states.

That's a nonsensical thing to do, for a couple of reasons. First, all it does is regress the results toward the mean. And there's no precise way to estimate how much one state's vote should influence another's, so this isn't modeling; it's guessing, straying from the data and the science to influence the result in an unprincipled way.

Worse, though, such adjustments overlook that the polling data already reflect whatever dependence exists between states. When voters answer pollsters' questions, they already know about national polls, national news, candidates' news, and as much statewide and nearby-state news as they care to follow. Of course they factor all of this available information into their answers to pollsters about their candidate choice. There's no need to try to add it in later.
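
To make the disagreement concrete, here is a small Python sketch of what treating states as independent versus not independent can look like in a simulation. It is not any forecaster's actual method. The technique swapped in here is a simple shared-error model (a Gaussian copula): each state's outcome depends partly on one national error term and partly on its own local error, with a hypothetical parameter rho controlling the mix. Each state's individual win probability stays exactly as printed in the output above; only the linkage between states changes. The names, the rho values, and the fixed base of 217 electoral votes for the non-swing states (their 2012 outcomes) are all assumptions for illustration.

```python
import random
from statistics import NormalDist

# state: (Clinton win probability from the output above, electoral votes)
SWING = {
    "OH": (0.1044, 18), "IA": (0.0002, 6), "PA": (0.9972, 20),
    "WI": (0.9713, 10), "FL": (0.8854, 29), "CO": (0.9984, 9),
    "NV": (0.6971, 6), "VA": (1.0000, 13), "NH": (1.0000, 4),
    "NC": (0.9994, 15),
}
SAFE_CLINTON_EV = 217  # ASSUMPTION: non-swing states repeat 2012 outcomes
EV_TO_WIN = 270
STD_NORMAL = NormalDist()


def clinton_win_rate(rho, n_trials=20_000, seed=0):
    """Clinton win rate when state errors share correlation rho (0 = independent)."""
    rng = random.Random(seed)
    # Turn each probability into a threshold on a standard normal error.
    # Clamping keeps values printed as 100.00% from mapping to infinity.
    thresholds = {
        state: STD_NORMAL.inv_cdf(min(max(p, 1e-6), 1 - 1e-6))
        for state, (p, _) in SWING.items()
    }
    wins = 0
    for _ in range(n_trials):
        national = rng.gauss(0, 1)  # one error shared by every state
        ev = SAFE_CLINTON_EV
        for state, (_, votes) in SWING.items():
            local = rng.gauss(0, 1)  # error specific to this state
            error = rho ** 0.5 * national + (1 - rho) ** 0.5 * local
            # error is still standard normal, so P(win) equals the state's p.
            if error < thresholds[state]:
                ev += votes
        wins += ev >= EV_TO_WIN
    return wins / n_trials


for rho in (0.0, 0.5, 0.9):  # hypothetical correlation levels
    print(f"rho={rho:.1f}: Clinton election probability {clinton_win_rate(rho):.2%}")
```

With rho set to 0 this reduces to independent state draws. Raising rho makes bad outcomes for Clinton arrive together across states, which is the effect the quoted passage describes; how large rho should be is exactly the quantity that has no precise estimate.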

The Huffington Post's 2016 President Forecast, which provides the polling data for my simulation, is closer to mine, at 98.2%. That could be because we're using the same data, but it also likely means they aren't including national polls or other non-state data in their state probability estimates. Both Huffington Post and my simulation predict 323 Electoral College votes for Clinton.