Saturday, August 16, 2008
(Note: the following post might be stat-heavy and dwell obsessively on number crunching. If you have no interest in such things, you might just want to skip past this one.)
Q of Rethinking Basketball asked me in a comment about a post I had written which used the first version of the Senior Prospects Metric (SPM) to go back in time to the 2008 WNBA Draft and attempt to predict that draft. Here are the results:
So how did the SPM match up with how the players performed? And how well did the actual draft match up? The actual draft matched up very well, just creeping into the -0.5 area of large correlation, whereas the SPM did not even reach -0.3, the level generally cited as medium correlation.
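For readers who want to see the arithmetic, here is a minimal sketch of how a correlation like this could be computed and labeled. It assumes a plain Pearson correlation between pick number and a production number, and it uses the common 0.1 / 0.3 / 0.5 rule of thumb for small / medium / large. The numbers below are made up for illustration and are not the 2008 draft data; the function names are my own. The sign comes out negative because a smaller pick number (an earlier pick) goes with higher production.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def effect_size_label(r):
    """Label a correlation using the usual 0.1 / 0.3 / 0.5 rule of thumb."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"

# Hypothetical numbers only: pick order and some production score.
pick_numbers = [1, 2, 3, 4, 5, 6]
production = [9.0, 4.0, 7.5, 2.0, 5.0, 1.0]

r = pearson(pick_numbers, production)
print(round(r, 2), effect_size_label(r))
```

The same two functions could be pointed at the SPM's predicted order instead of the actual pick order to get the comparison described above.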
Q suggested that instead of using Wins Score, I should use a metric that was more of a rate than a sum. Wins Score isn't high for players who don't get a lot of minutes, whereas a rate would measure "production per minute" and would not penalize players who had potential but didn't get a lot of minutes.
Therefore, instead of using Wins Score, I would use a metric I was already using for something else, called WS333 - "Wins Score for 33 1/3 minutes". WS333 asks the question, "How much Wins Score does the player produce per 33 1/3 minutes of play?" Compared to the raw total, it shrinks the scores of players who had a lot of minutes and expands the scores of players who didn't.
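As a rough sketch of what that rate looks like in practice, the function below scales a Wins Score total to 33 1/3 minutes. It assumes Wins Score is a running total over the same stretch of play as the minutes figure; the function name and the two example players are hypothetical.

```python
def ws333(total_wins_score, minutes_played):
    """Wins Score produced per 33 1/3 minutes of play.

    Assumes total_wins_score and minutes_played cover the same games.
    """
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total_wins_score / minutes_played * (100.0 / 3.0)

# Hypothetical players: a starter with a big total and a reserve with a small one.
starter = ws333(total_wins_score=150.0, minutes_played=1000.0)
reserve = ws333(total_wins_score=30.0, minutes_played=150.0)

print(round(starter, 2), round(reserve, 2))
```

In this made-up case the reserve has one fifth of the starter's total but the higher per-minute rate, which is exactly the kind of player a sum would penalize.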
The results were...well....
AAAGGGHHH! When using a per-minute metric, the SPM did worse - it got closer to randomness - whereas the actual draft order got closer to a direct relationship.
I realized that I was going to have to add the height caveats that Hollinger wrote about in his initial ESPN article if I was to get any closer to an actual draft projection. I therefore added the height qualifications to the SPM and ran the whole thing all over again.
Here are the final results. The "New Predicted Order" is the order you'd get if you use the updated version of the SPM to predict the draft outcome.
Kimberly Beck falls from 3rd to 10th. The new SPM punishes her for being a short point guard who didn't rebound well in college. Quianna Chaney also takes a hit for rebounding. Poor players get pushed down and the better players move up to take their places. Candice Wiggins moves to 8th from 10th, and Alexis Hornbuckle moves all the way from 15th to 9th.
We'll look at the new correlations now.
The newest version of the SPM still does poorly against WS333 - but it does better than the older version. However, when using Wins Score, the correlation between SPM and actual player results actually moves from small correlation to medium correlation.
Even more interesting (to me anyway) is that the new SPM predicted order moves closer in correlation to the order from the 2008 WNBA Draft. It's still a small correlation, but it moves more in the direction of what the GMs of the WNBA actually do.
(* * *)
When thinking about this project, there were two matters on my mind.
The first matter is whether or not correlation is the correct statistical measure to use. As it turns out, correlation can be very misleading whenever non-normal variables are used.
A "normal variable" is one whose values follow a bell-curve type distribution. This implies a lot of numbers clumped up in the middle and trailing off on both ends. If WNBA talent were normally distributed, it would indicate that most WNBA players have roughly the same amount of talent, with a few players being rather crappy and another few players being excellent.
This assumption is not true in baseball. In baseball, it's more of an exponential curve, with a whole lot of really crappy players and a small number of good players. (Go to this link for the image of an exponential curve.) I suspect that talent curves look the same way in the WNBA, and this causes problems when using correlation.
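To illustrate the worry, here is a small sketch comparing ordinary (Pearson) correlation with a rank-based correlation (Spearman, which is just Pearson computed on the ranks). The rank-based version is not what I used for the numbers in this post; it is shown only as one way to see how a lopsided distribution changes the picture. The data is invented: ten picks whose production drops off steeply, in perfect order.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical, steeply skewed production: one star, then a long tail.
picks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
production = [40.0, 12.0, 6.0, 4.0, 3.0, 2.5, 2.0, 1.5, 1.0, 0.5]

print(round(pearson(picks, production), 2))
print(round(spearman(picks, production), 2))
```

In this invented example every earlier pick out-produces every later pick, so the rank correlation is a perfect -1, yet the ordinary correlation comes out noticeably weaker because one star dominates the spread.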
The other matter on my mind is that we might not be explaining the relationship in the right terms - it might not be that draft position predicts player performance, but that player performance depends on draft position! It might be the case that higher draft choices get more coaching attention to improve the fine points of their game than lower, non-playing draft choices do. In short, there might not be some sort of "inherent ability" that never changes; rather, the earlier one is picked in the draft, the more likely that ability is to be coaxed out at the professional level by coaches and staff members.
After all, if you're a first round draft pick, there's more pressure on GMs and coaches to get a better result. More attention is lavished on one's major investment than one's minor ones.
That's pretty much it for the conclusions. I hope to have an updated SPM at the end of the 2008-09 college season, so we'll see which players did better than in the most recently concluded year, and which did worse. If your favorite college player does worse in the coming year, don't worry - Hollinger says that this happens in men's college basketball, and it will probably happen in the women's game as well.