Wednesday, March 25, 2009

Age vs. Ability in the WNBA, Part II

Janie Fincher of the WPBL has probably passed peak age.

As you may have noticed in the comment section of yesterday's post, Rebkellian and all-around good guy pilight takes issue with the results. To quote:

"The problem is that you're only considering players who are still in the league at each given age. The only players who make rosters at 35 are those who are above average. Most players are long since out of the league [by age 35]."

In short, pilight's statement was that I had introduced a selection bias, which according to the fine folks at Wikipedia is "the distortion of a statistical analysis, due to the method of collecting samples". According to pilight, my particular form of error was a participant bias.

One can't look at all 35-year-olds in the WNBA and claim that they're representative of every player who, by hook or by crook, could theoretically have reached age 35 while still playing in the league. All of the players who are age 35 in the WNBA are good players - they'd have to be to have lasted 12+ years in the league. When you compare 22-year-old players to a group of 35-year-old players, the 22-year-olds will include some good, a lot of average, and many bad players, but the 35-year-olds generally won't have any bad players left among them. The groups being compared were unequal to begin with, and the conclusions drawn will be biased.
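To see how this bias can manufacture an "improves with age" result, here is a minimal Python sketch. Everything in it is made up for illustration: the talent levels, the yearly decline, and the rising cut line are assumptions, not WNBA data. Every hypothetical player gets worse each year, yet the average of the players still on a roster goes up.

```python
from statistics import mean

# Hypothetical talent levels for one rookie class (made-up numbers, not WNBA data).
talents = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

DECLINE_PER_YEAR = 0.1  # assumption: every player gets slightly worse each year
CUT_LINE_GROWTH = 0.5   # assumption: the bar to stay on a roster rises each year


def league_average_by_year(talents, years):
    """Average rating of the players still on a roster in each year."""
    averages = []
    for year in range(years):
        # Each player's rating after `year` seasons of steady decline.
        ratings = [t - DECLINE_PER_YEAR * year for t in talents]
        # Only players at or above the cut line are still in the league.
        cut_line = CUT_LINE_GROWTH * year
        survivors = [r for r in ratings if r >= cut_line]
        averages.append(mean(survivors))
    return averages


averages = league_average_by_year(talents, years=10)
print(averages[0], averages[-1])
```

The last average is higher than the first even though no individual player ever improved; the only thing that changed is who was left to be counted.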

As someone once said, "Statistics is the most non-intuitive branch of mathematics." I found myself forced to agree with pilight - I would have to abandon my beloved hypothesis that WNBA players get better with age and seek some other approach. Maybe the new approach would yield the same conclusion...but maybe it wouldn't.

So what will be the new approach? What we'll do is compare selected groups of players across brief intervals of time. We'll compare a group of players who are age n (let n be whatever year of age you want) and then look at those same players at age (n+1).

We'll toss out any player from this group who didn't play at least 500 minutes in both years. Let's assume n = 23. We are looking at all of the players in the WNBA who played 500 minutes both at age 23 and at age 24. We compare their performances at age 23 with their performances at age 24.

If the group turned in better performances at age 24 than at 23, we have reason to conclude that your typical player will play better at age 24 than at age 23. If they turned in worse performances at age 24 than at 23, then we can conclude the opposite is true.

We do this for every pair of years for which we have data.
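The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the tuple layout, and the sample rows are hypothetical, the threshold is read as "at least 500 minutes", and each player is assumed to have one row per age.

```python
from collections import defaultdict
from statistics import mean

MIN_MINUTES = 500  # minutes threshold described in the post


def average_change_by_age(seasons):
    """seasons: iterable of (player, age, minutes, wins_score) tuples.

    Returns {start_age: (candidates, average change in Wins Score)}.
    """
    # Keep only qualifying seasons, indexed by (player, age).
    qualified = {
        (player, age): wins_score
        for player, age, minutes, wins_score in seasons
        if minutes >= MIN_MINUTES
    }
    # Pair each qualifying season with the same player's next-age season.
    changes = defaultdict(list)
    for (player, age), score in qualified.items():
        next_score = qualified.get((player, age + 1))
        if next_score is not None:
            changes[age].append(next_score - score)
    return {age: (len(d), mean(d)) for age, d in sorted(changes.items())}


# Made-up sample rows, purely to show the shape of the input.
sample = [
    ("A", 23, 800, 100.0), ("A", 24, 900, 120.0), ("A", 25, 850, 115.0),
    ("B", 23, 600, 50.0), ("B", 24, 700, 60.0),
    ("C", 23, 300, 10.0), ("C", 24, 900, 90.0),  # C misses the cut at age 23
]
result = average_change_by_age(sample)
print(result)
```

Because each player is compared only with herself one year later, the group at age n and the group at age n+1 are the same people, which is what removes the bias pilight pointed out.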


Start Age | End Age | Candidates | Average Change in Wins Score

This begins to look like what we expect: a bell curve which rises to a certain point and then progressively declines.

Column F is the number of candidates. Note that there is only one WNBA player who played more than 500 minutes both at age 19 and age 20 - Ann Wauters. Likewise, Teresa Edwards is the only WNBA player who played more than 500 minutes both at age 38 and age 39. (This information does not look good for Sheryl Swoopes's employment prospects.)

Column G is the average change in Wins Score from one year to the next. Aside from some little glitches, Wins Score rises year over year, sort of stays the same between age 27 and age 28, and then begins to decline year over year. This implies that the peak age for a WNBA player is...oh, I don't know, somewhere between age 27 and age 28. Afterwards, performance declines.

The only skewing of the results might be due to some players playing more minutes than others, even in the 500+ minute category. Wins Score is a linear metric - it rewards players for everything good they do and punishes them for every bad thing they do. Players who play a lot of minutes will tend to have higher Wins Scores simply because they have more of an opportunity to accumulate points. The most minutes ever played in a WNBA season is around 1,200.

In general, the correlation between age progression and Wins Score changes is about 0.94 for ages 21 to 27 and -0.81 for ages 28 to 35. Those are strong correlations in both directions.
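For anyone who wants to run the same kind of check, here is a minimal sketch of a Pearson correlation in Python. It assumes the figures above are ordinary Pearson coefficients, which the post doesn't actually state, and the numbers fed in are made up; which two series were correlated is also an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Root of the summed squared deviations for each series.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical values only: a steady rise and a steady fall.
ages = [21, 22, 23, 24]
rising = [1.0, 2.0, 3.0, 4.0]
falling = [4.0, 3.0, 2.0, 1.0]

r_up = pearson_r(ages, rising)
r_down = pearson_r(ages, falling)
print(r_up, r_down)
```

A value near +1 means the two series rise together, and a value near -1 means one falls as the other rises, which is how 0.94 and -0.81 should be read.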

So sadly (for me anyway), I was wrong and pilight is right: player performance begins to decline after age 28, and by age 38 players are at the far right end of the bell curve. All I managed to prove in my initial analysis is that the kinds of players that are still around after age 28 tend to be the better type of players. Looks like I learned three things:

a) that I was right if I redefine the question of my first analysis as "are the players still around at older ages the ones that have been historically the best?"
b) that pilight was right if I try to answer the question that I wanted to answer in the first place, which isn't the one in item a), and
c) a big heaping dose of humility.

1 comment:

Anonymous said...

That's a little better. I think the sample sizes are too small to say anything definitive, especially at the ends of the spectrum.