Wednesday, July 15, 2009
A bunch of stat-heads have been tossing the ball around regarding Diana Taurasi's drop from front-runner to third place within one week of All-Star voting. (I believe the first round of voting and the final round were one week apart.)
My first impulse was to conclude that the shift in voting was clearly a conspiracy by the WNBA to make sure that Taurasi wasn't voted a starter. Taurasi, as probably everyone knows by now, was arrested for drunk driving and was found to have a 0.17 blood alcohol level. The conspiracy theory concludes that the W didn't want Taurasi as its headliner on one of the few occasions when the media would turn its brief attention to the WNBA.
But is this really true? Can numbers tell us anything about whether such a change is possible?
In the first round of voting, Taurasi was leading all of the Western guards with 21.6 percent of the vote. Suppose we wanted to find, say, the 95 percent confidence interval of Taurasi's true percentage of votes. According to simple stats, we should be 95 percent confident that Taurasi's true percentage is between 20.3 percent and 22.9 percent.
By that time, there had been around 150,000 votes cast. That is a good sample size: if it were a random sample, it would be enough to pin down the percentages to within 1 or 2 percent or so. Taurasi was clearly in the lead, with Sue Bird far behind at 14.4 percent.
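For anyone who wants to play with the "simple stats," here is a minimal sketch of the usual normal-approximation interval for a proportion. The function name `proportion_ci` is my own, and plugging in n = 150,000 assumes that every one of those votes was an independent, random vote on the Western guard slot. That is an assumption, and the post doesn't record which sample size went into the interval quoted above, so the output need not match it.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    p_hat: observed share (0.216 for 21.6 percent)
    n:     number of votes, assumed to be a simple random sample
    z:     critical value; 1.96 corresponds to roughly 95 percent
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # standard error of the share
    margin = z * se
    return p_hat - margin, p_hat + margin

# Assumption: all 150,000 first-round votes count toward the sample size.
low, high = proportion_ci(0.216, 150_000)
print(f"95% interval: {low:.1%} to {high:.1%}")
```

The interval gets narrower as n grows, so the bigger the (random) sample, the less room there is for the true percentage to wander.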
But at the end of the voting, 600,000 votes were cast and Taurasi's 21.6 percent among guards had plummeted to 12.3 percent. Even if we were outside of the 95 percent confidence interval, Taurasi's true percentage shouldn't have been that far off. What the hell happened?
Conspiracy! Clearly, the WNBA simply fudged the vote count.
Or did it? The problem with the first sample - the sample with 150,000 votes - is that it might not have been a random sample. If the sample was biased, we have to rethink the results. Concluding that Taurasi's true percentage was 21.6 percent might be akin to concluding that John McCain's true percentage in the 2008 presidential election was 55.6 percent...simply from counting Texas's votes.
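One way to see how different the later voters must have been is to back out Taurasi's share among only the votes that came in after the first round. This is a rough sketch: the function name `implied_later_share` is mine, and it assumes the 21.6 and 12.3 percent figures are shares of the same kind of vote total, with roughly 150,000 and 600,000 votes behind them.

```python
def implied_later_share(first_share, first_n, final_share, final_n):
    """Share of the vote among ballots cast after the first round.

    Assumes both shares are fractions of the same kind of vote total.
    """
    later_n = final_n - first_n
    later_votes = final_share * final_n - first_share * first_n
    return later_votes / later_n

# Figures from the post: 21.6% of ~150,000 early votes, 12.3% of ~600,000 total.
share = implied_later_share(0.216, 150_000, 0.123, 600_000)
print(f"Implied share among later votes: {share:.1%}")
```

Under those assumptions, Taurasi would have drawn only about 9.2 percent of the later 450,000 votes, less than half her first-round share. The early and late voters look like two very different crowds.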
The first returns of the ballots were on July 2nd. Let's look at WNBA schedules since then:
Home Games Played Before and After July 2, 2009
San Antonio: 4/2
We notice that Phoenix's game schedule before July 2nd is top-heavy with home games. Let's make an assumption for which we have no proof - namely, that voting is heavier at the arena than it is online. If this is true, Phoenix players had an advantage in the first round that couldn't be seen in the totals. Phoenix fans inadvertently stuffed the ballot box for their favorite players.
However, Phoenix would have only one home game between the first round results and the final results. Seattle would have three home games and San Antonio would have two. In short, Seattle and San Antonio had opportunities to catch up with arena-based fan voting that Phoenix just couldn't match.
If this hypothesis is true, it should affect Phoenix players all across the board and not just Taurasi.
Phoenix Player Percentages Before and After First Round
T. Johnson: 11.6/10.2
T. Smith: 21.9/12.6
The numbers hint at a drop of about 6 percentage points (or more) at every single position except for Temeka Johnson. This doesn't prove the hypothesis, but it grants it some support.
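Here is a quick sketch of that comparison, using only the Phoenix figures quoted in this post (Taurasi's guard percentages from earlier, plus the Johnson and Smith rows above). The dictionary layout and names are just for illustration.

```python
# (first-round share, final share) in percent, as quoted in the post
phoenix = {
    "Taurasi": (21.6, 12.3),
    "T. Johnson": (11.6, 10.2),
    "T. Smith": (21.9, 12.6),
}

# Drop in percentage points between the first round and the final count
drops = {
    name: round(before - after, 1)
    for name, (before, after) in phoenix.items()
}

for name, drop in drops.items():
    print(f"{name}: down {drop} points")
```

Measured this way, Taurasi and Smith each fall by about 9.3 points, while Johnson slips by only 1.4.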
Now look at Seattle:
Seattle Player Percentages Before and After First Round:
In every case for Seattle (except for Janell Burse), the percentage of votes remained pretty much the same between rounds. The Storm had three home games before the end of the first round, and three home games after it. This lends further credence to the hypothesis that a player's first-round voting percentage and the number of home games played by that player's team are strongly correlated.
I think I'm convinced. The only conspiracy was in the schedule that gave Phoenix only one home game between the end of the first round and the final All-Star vote. In short, there was no conspiracy.
Of course, you can still ask, "Why did Taurasi's numbers drop the most?" It could be for one of two reasons: either the kind of voters who vote from arenas (as opposed to online) don't like Taurasi as much as Phoenix fans do, or there was simply an additional drop-off caused by the notoriety of Taurasi's DUI arrest. Combine those two untested hypotheses with the illustrations above, and the "mystery" solves itself.