My old ScienceBlogs friend Mike Dunford has been tweeting his way through the latest lawsuit that’s attempting to overturn the results of our presidential election. The lawsuit is an amazingly shoddy piece of work. But one bit of it stuck out to me, because it falls into my area. Part of their argument tries to make the case that, based on "mathematical analysis", the reported vote counts couldn’t possibly make any sense.

> The attached affidavit of Eric Quinell, Ph.D. ("Dr. Quinell Report) analyzes the extraordinary increase in turnout from 2016 to 2020 in a relatively small subset of townships and precincts outside of Detroit in Wayne County and Oakland county, and more importantly how nearly 100% or more of all "new" voters from 2016 to 2020 voted for Biden. See Exh. 102. Using publicly available information from Wayne County and Oakland County, Dr. Quinell found that for the votes received up to the 2016 turnout levels, the 2020 vote Democrat vs Republican two-ways distributions (i.e. excluding third parties) tracked the 2016 Democrat vs. Republican distribution very closely…

This is very bad statistical analysis – it’s doing something which is absolutely never correct, which is guaranteed to produce a result that looks odd, and then pretending that the fact that you deliberately did something that will produce a certain result means that there’s something weird going on.

Let’s just make up a scenario with some numbers to demonstrate. Let’s imagine a voting district in Cosine city. Cosine city has 1 million residents who are registered to vote.

In the 2016 election, let’s say that the election was dominated by two parties: the Radians, and the Degrees. The Radians won 52% of the vote, and the Degrees won 48%. The voter turnout was low – just 45%.

Now, 2020 comes, and it’s a rematch of the Radians and the Degrees. But this time, the turnout was 50% of registered voters. The Degrees won, with 51% of the vote.

So let’s break that down into numbers for the two elections:

- In 2016:
  - A total of 450,000 voters actually cast ballots.
  - The Radians got 234,000 votes.
  - The Degrees got 216,000 votes.

- In 2020:
  - A total of 500,000 voters actually cast ballots.
  - The Radians got 245,000 votes.
  - The Degrees got 255,000 votes.

Let’s do what Dr. Quinell did. Let’s look at the 2020 election numbers, and take out 450,000 votes which match the distribution from 2016. What we’re left with is:

- 11,000 new votes for the Radians, and
- 39,000 new votes for the Degrees.
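Here’s that subtraction as a quick Python sketch. This is my reading of the procedure described in the complaint, not code from the affidavit; the function name `residual_new_votes` and the dictionaries are made up for this example, and the numbers are the Cosine city figures from above.

```python
def residual_new_votes(prev, curr):
    """Subtract the previous election's totals from the current ones, party by party.

    This is the invalid step: it treats the first chunk of current votes
    as if it were a copy of the previous election.
    """
    return {party: curr[party] - prev[party] for party in curr}


# Made-up Cosine city totals from the scenario above.
votes_2016 = {"Radians": 234_000, "Degrees": 216_000}
votes_2020 = {"Radians": 245_000, "Degrees": 255_000}

new_votes = residual_new_votes(votes_2016, votes_2020)
ratio = new_votes["Degrees"] / new_votes["Radians"]

print(new_votes)
print(f"Degrees-to-Radians ratio among 'new' votes: {ratio:.2f}")
```

Nothing in there knows anything about individual voters. It’s just arithmetic on two pairs of totals.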

There was a 3 percentage point shift in the vote, combined with an increase in voter turnout. Neither of those is unusual or radically surprising. But when you extract things in a statistically invalid way, you end up with a result where, in a voting district in which the vote for the two parties usually varies by no more than 4%, the "new votes" in this election went roughly 3.5:1 for one party.

If we reduce the increase in voter turnout, that ratio becomes significantly worse.
If the election turnout was 46%, then the numbers would be 460,000 total votes: 225,400
for the Radians and 234,600 for the Degrees. With Dr. Quinell’s analysis, that would
give us -8,600 votes for the Radians, and +18,600 votes for the Degrees. Or since
negative votes don’t make sense, we can just stop at 225,400 for the Radians, and say that *all* of the new votes,
every single one of the 10,000 votes beyond the 2016 turnout, was taken by the Degrees. Clearly
impossible, it must be fraud!
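To see how the same 51/49 result looks more and more "suspicious" as the turnout increase shrinks, here’s a small sketch that repeats the subtraction at a few turnout levels. The names `REGISTERED`, `BASE_2016`, and `residual_for_turnout` are invented for this example, and it assumes the Degrees win exactly 51% at every turnout level.

```python
REGISTERED = 1_000_000
BASE_2016 = {"Radians": 234_000, "Degrees": 216_000}


def residual_for_turnout(turnout_pct, degrees_pct=51):
    """Apply the invalid 'subtract 2016' step for a given 2020 turnout.

    Integer percentages keep the arithmetic exact for these round numbers.
    """
    total = REGISTERED * turnout_pct // 100
    degrees = total * degrees_pct // 100
    radians = total - degrees
    return {
        "Radians": radians - BASE_2016["Radians"],
        "Degrees": degrees - BASE_2016["Degrees"],
    }


for turnout_pct in (50, 48, 46):
    print(turnout_pct, residual_for_turnout(turnout_pct))
```

The election result is identical in every row. The only thing that changes is how many votes are left over after the subtraction, and that alone drives the "new voter" split from lopsided to nonsensical.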

So what’s the problem here? What caused this reasonable result to suddenly look incredibly unlikely?

The votes are one big pool of numbers. You don’t know which data points came from which
voters. You don’t know which voters are new versus old. What happened here is that the
bozo doing the analysis baked in an invalid assumption. He assumed that all of the voters
who voted in 2016 *voted the same way in 2020*.

"For the votes received up to the turnout level" isn’t something that’s actually measurable
in the data. It’s an *assertion* of something without evidence. You can’t break out
subgroups within a population, unless the subgroups were actually deliberately and carefully
measured when the data was gathered. And in the case of an election, the data that he’s
purportedly analyzing doesn’t actually contain the information needed to separate out that
group.
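To make that concrete, here’s a sketch with two completely different voter-level stories that collapse into exactly the same published totals. Both sets of flows are invented by me purely for illustration; nothing like them is in the real data, which is the whole point. The `totals` helper and the scenario names are hypothetical.

```python
from collections import Counter


def totals(flows):
    """Collapse (choice_2016, choice_2020) flows into per-election totals.

    A choice of None means the person did not vote in that election.
    """
    t2016, t2020 = Counter(), Counter()
    for (before, after), count in flows.items():
        if before is not None:
            t2016[before] += count
        if after is not None:
            t2020[after] += count
    return dict(t2016), dict(t2020)


# Story A (the baked-in assumption): every 2016 voter returns and votes
# the same way, so the new voters must split 11,000 to 39,000.
same_way = {
    ("Radians", "Radians"): 234_000,
    ("Degrees", "Degrees"): 216_000,
    (None, "Radians"): 11_000,
    (None, "Degrees"): 39_000,
}

# Story B (equally invented): some 2016 voters switch, some stay home,
# and 90,000 new voters split a far less dramatic 39,000 to 51,000.
churn = {
    ("Radians", "Radians"): 200_000,
    ("Radians", "Degrees"): 14_000,
    ("Radians", None): 20_000,
    ("Degrees", "Degrees"): 190_000,
    ("Degrees", "Radians"): 6_000,
    ("Degrees", None): 20_000,
    (None, "Radians"): 39_000,
    (None, "Degrees"): 51_000,
}

print(totals(same_way))
print(totals(churn))
print(totals(same_way) == totals(churn))
```

Since the vote totals are all you get, no analysis of them can tell you which of these stories (or any of the countless others) actually happened.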

You can’t do that. Or rather you can, but the results are, at best, meaningless.