Thursday, August 7, 2008

2008-09 - The Best Senior Prospects



I've now concluded my analysis of the current Division I women's basketball prospects. All of these women will be seniors during the 2008-09 season. Math-phobic readers might want to skip down to the very end, where the players are actually named.

Why do an analysis?

Good question. The point is basically to try to determine which players deserve a closer look.

Whenever someone who likes statistics creates a ranked list, contention is the outcome. "How dare you claim that Player X is #16 when it's plain that X is better than Player Y who is #6?"

The goal is to look both with the eyes and with the stats. Your eyes can actually fool you as much as statistics can. In order to really know which players are the best, you would have to watch each of the named players for the entire college season. This is impossible for all but maybe a handful of reporters, and even they can only watch so many players in the day. Statistics do two things:

a) they isolate players that have distinguished themselves in some way in the boxscores and,
b) they point to a player's flaws as well as a player's strengths.

The Starting Point

I began with Hollinger’s idea that blocks, steals, rebounds, and three point shooting were important in a prospect. I determined that we should look at NCAA Division I players who were juniors in the 2007-08 season. Only players who were in the Top 100 players in any of the four categories indicated above should be considered. This left me with 108 players to look at, and I was able to locate statistics for 105 of those players. (Florida A & M and St. Mary’s (California), I’m pointing the finger at you.)

Ashley Paris, sister of Courtney Paris, was mentioned in a previous comment, so I added her to my list. As my attention is drawn to other players, I'll add them to my analysis.

Blocks, Steals, Rebounds

One expects forwards and centers to be able to get rebounds, with guards at a disadvantage. All players should be able to get steals, with guards (being more agile and quick) having the best shot. Three point shooting would be primarily a guard skill.

I decided that the scale for blocks, steals, and rebounds would be linear. It would be based on number of blocks per game, steals per game, and rebounds per game. A good player should be good in all of these defensive skills at the college level. Therefore, these three factors were multiplied together. Ed Weiland has a stat called RSB40, which he uses as a multiplicative factor so I feel that I'm on the right track.

This decision hurts guards – because we don’t expect them to rebound - but the guards get a chance to be rewarded later.

Furthermore, the factor for steals was doubled. Hollinger weighted steals more heavily than any other factor in his analysis, and the numbers that come out at the end match up with “common sense” when I make the same decision.

The problem is that at some point in the analysis, you have to decide what number of blocks, steals, or rebounds per game corresponds to a factor = 1.0. Looking over the top 100 finishers in these categories, I made the following decisions:

1.5 blocks per game = factor of 1.0
2.2 steals per game = factor of 2.0 (remember, the factor for steals is doubled)
8.1 rebounds per game = factor of 1.0.

If you want different results, set different factors. I'm going to stand on these values, which would be pretty impressive if they were per-game.
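
As a rough sketch, the defensive part of the score could be computed like this. It assumes the "linear" scale runs through zero (factor = per-game value divided by the baseline), which the text above does not spell out; the function and constant names are made up for illustration.

```python
# Per-game values that correspond to a factor of 1.0 (from the list above).
BLOCKS_BASELINE = 1.5
STEALS_BASELINE = 2.2
REBOUNDS_BASELINE = 8.1


def defensive_factor(blocks_pg, steals_pg, rebounds_pg):
    """Multiply the three defensive factors together, with steals doubled.

    Assumption: each factor is simply value / baseline (a line through zero).
    """
    blocks = blocks_pg / BLOCKS_BASELINE
    steals = 2.0 * steals_pg / STEALS_BASELINE  # steals count double
    rebounds = rebounds_pg / REBOUNDS_BASELINE
    return blocks * steals * rebounds
```

A player sitting exactly on all three baselines comes out at about 2.0 (1.0 x 2.0 x 1.0). Because the factors are multiplied, a player with zero blocks gets a defensive factor of zero no matter how well she rebounds.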

Three-point Shooting

We ranked players by percentage, not by number of shots made. 0.343 percentage = 1.00 factor. Players were required to make at least an average of two three-pointers a game to sort out those people with high three-point shooting but low numbers of baskets – if they didn’t meet the average qualification, they got a zero factor.
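
A minimal sketch of that rule, again assuming the factor is the shooting percentage divided by 0.343 (a straight line through zero). The function name and arguments are hypothetical.

```python
THREE_PCT_BASELINE = 0.343   # percentage that maps to a factor of 1.00
MIN_THREES_PER_GAME = 2.0    # qualification: average made threes per game


def three_point_factor(threes_made, games_played, three_pct):
    """Return the three-point factor, or 0.0 if the player does not qualify."""
    if games_played <= 0:
        return 0.0
    if threes_made / games_played < MIN_THREES_PER_GAME:
        return 0.0  # high percentage on too few makes earns nothing
    return three_pct / THREE_PCT_BASELINE
```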

Age

The age of players is definitely important, because older players are closer to their peaks. The youngest player among the candidates had an 11/18/1987 birthday. I began penalizing players who had birthdates before 11/18/1986. A person born a quarter of a year before that date would have 0.25 removed from the final score; a person born on 11/18/1985 would have 1.0 removed from the final score. A player closer to their eventual peak should be devalued.

For some players, I was not able to find their ages. Those players were not penalized in the system. There was one player who was born in “January 1987”, so I assigned her a birthdate of 1/15/87. She didn't suffer any age penalty according to my current rules.
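
The age rule can be sketched as follows. The only assumption beyond the text is that a "year" is 365.25 days, so a player born exactly one year before the cutoff loses very slightly less than 1.0; the function name is made up.

```python
from datetime import date

AGE_CUTOFF = date(1986, 11, 18)  # births before this date are penalized


def age_factor(birthdate):
    """Return the (negative or zero) age factor for a player.

    Unknown birthdates (None) and births on or after the cutoff get 0.0.
    Assumption: one year is treated as 365.25 days.
    """
    if birthdate is None or birthdate >= AGE_CUTOFF:
        return 0.0
    years_before_cutoff = (AGE_CUTOFF - birthdate).days / 365.25
    return -years_before_cutoff
```

For example, `age_factor(date(1987, 1, 15))` is 0.0, which matches the "January 1987" player who took no penalty.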

Wins Efficiency

Certainly, we should reward players who can score, but by how much? There are so many metrics like PER and Efficiency. I don’t agree with “Efficiency” because I believe it overvalues crappy shooting. I therefore used the “Wins Score” metric, but divided it by Games Played – Wins Score is not divided by Games Played – to create a more Efficiency-like metric called “Wins Efficiency”. I divided the final Wins Efficiency results by 10 to award points. A player with a score of 10 in “Wins Efficiency” gets 1.00 point.
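
In code, the step from Wins Score to points is just two divisions. This sketch takes the season Wins Score total as an input rather than computing it from the box score; the names are illustrative.

```python
def wins_efficiency_points(wins_score_total, games_played):
    """Convert a season Wins Score total into points for the final score."""
    wins_efficiency = wins_score_total / games_played  # per-game Wins Score
    return wins_efficiency / 10.0  # a Wins Efficiency of 10 earns 1.00 point
```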

WPPR

This is a variation on Hollinger’s Pure Point Rating. The problem with Pure Point Rating is that if you apply the formula to WNBA players, they have terrible values. The reason is that in the NBA, the average team’s turnovers amount to 2/3 of its assists. In the WNBA, players turn the ball over more, and for an average team, turnovers and assists are equal. I simply removed the 2/3 multiplicative factor from Hollinger’s formula. In the NBA, assist/turnover ratio is misleading as to a player’s true value as a guard; in the WNBA, assist/turnover ratio is more accurate.

My variation – WPPR – rewards players who have more assists than turnovers. A good guard will have a high WPPR and all players are rewarded 1/5 of their WPPR if the value is positive. In general, this decision rewards guards and punishes forwards and centers (who have more turnovers than assists), but the best forwards and centers at the college level can avoid turnovers. Still, they’ll have a negative number. If a forward or center has a negative WPPR, 1/10th of the WPPR is removed from the final score.

However, a guard who has a negative WPPR in the women’s game is someone whom one should be wary of. If a guard has a negative WPPR, 1/5th of the WPPR is removed from the final score.
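
Here is one way to read the WPPR scoring rules, as a sketch. It takes an already-computed WPPR as input and interprets "removed from the final score" as lowering the score by the stated fraction of the negative value; the function name and the `is_guard` flag are hypothetical.

```python
def wppr_points(wppr, is_guard):
    """Return the contribution of WPPR to the final score.

    Positive WPPR: every player gains 1/5 of it.
    Negative WPPR: guards lose 1/5 of it, forwards and centers lose 1/10.
    """
    if wppr >= 0:
        return wppr / 5.0
    if is_guard:
        return wppr / 5.0   # negative value, so the score goes down
    return wppr / 10.0      # softer penalty for forwards and centers
```

So a guard and a center who both post a WPPR of -5.0 would lose 1.0 and 0.5 points respectively.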

50-50 players

Fifty-fifty players are players who have 50 blocks and 50 steals in a season. According to Hollinger, such players in the men’s college game turn out to be excellent players. Players are given a special 1.00 factor bonus for meeting this benchmark. Only two players qualify for this benchmark in the 2007-08 season: Demetress Adams of South Carolina and Jessica Bobbitt of Belmont. This is an additive factor, so no one is penalized for not reaching the mark. (Except for maybe Chante Black of Duke, who missed the 50-50 mark by one steal. I decided not to give her the point; your mileage may vary.)
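
The benchmark is a hard cutoff, which a short sketch makes plain (function name is illustrative; the totals in the usage note are hypothetical, not any player's real numbers):

```python
def fifty_fifty_bonus(season_blocks, season_steals):
    """Return 1.0 if the player has at least 50 blocks and 50 steals, else 0.0."""
    if season_blocks >= 50 and season_steals >= 50:
        return 1.0
    return 0.0
```

With a strict cutoff, `fifty_fifty_bonus(60, 49)` is 0.0, which is the kind of near miss described above.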

Conference Strength

Players were rewarded for playing against good teams and punished for playing against bad teams. The final score is multiplied by a factor equivalent to the strength of their conference. Conference strengths are determined by Jeff Sagarin’s ratings for women’s NCAA basketball. The central mean method was used, with the Big Twelve’s 86.55 rating converted to a 1.00 factor, and partial ratings converted linearly.
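
As a sketch, and assuming "converted linearly" means a simple ratio against the Big Twelve rating (the text does not say whether the line passes through zero), the conference factor would look like this:

```python
BIG_TWELVE_RATING = 86.55  # Sagarin central mean rating that maps to 1.00


def conference_factor(conference_rating):
    """Scale a conference's Sagarin rating against the Big Twelve's.

    Assumption: factor = rating / 86.55.
    """
    return conference_rating / BIG_TWELVE_RATING
```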

Final Factor

Equals:

blocks factor * steals factor (which is x2) * rebounds factor
plus three point factor
plus age factor (negative for older players)
plus Wins Efficiency factor
plus WPPR factor
plus 50-50 factor….

all multiplied by conference strength factor.
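
Put together, the final factor is a sum of the individual pieces scaled by conference strength. This sketch takes each piece as an already-computed number; the argument names are made up, and the values in the usage note are hypothetical rather than any real player's.

```python
def final_score(defensive, three_point, age, wins_efficiency, wppr,
                fifty_fifty, conference):
    """Combine the individual factors into the final score.

    defensive   -- blocks factor * doubled steals factor * rebounds factor
    age         -- zero, or negative for older players
    conference  -- multiplicative conference strength factor
    """
    subtotal = (defensive + three_point + age + wins_efficiency
                + wppr + fifty_fifty)
    return subtotal * conference
```

For instance, `final_score(2.0, 1.0, -0.25, 1.0, 0.5, 0.0, 1.0)` gives 4.25. Note that the conference factor scales everything, including the age penalty.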

Known Flaws

Hollinger points to several "red flags" that might disqualify a player. One of them is a player not getting a certain number of rebounds for her height. Hollinger states that "a player in the X range of height should receive at least Y rebounds per game". I wasn't able to create these height categories because I don't have enough information about the range of heights in the NBA (and WNBA) to determine the appropriate cutoff points.

Furthermore, schools play at many different "paces". The word "pace" has a specific meaning in the APBRmetric community. I was not able to obtain this information and was not willing to calculate it for, oh, about 300 schools. Maybe someday, there will be a master women's college database. Until then, the speed at which schools play, the number of possessions per team, etc. is not taken into account.

(* * *)

So who are the leaders after all this number crunching? Let's look at the top twenty overall players:

Top Twenty Overall Players



We have Courtney Paris of Oklahoma as number one, and that's a good start. I'd expect Angel McCoughtry to be number two, but she ends up as #8 on the metric. However, Kristi Toliver at #2 should be no surprise.

I wonder if my formula gives too much credit to guards. Then again, I have approximately 20 centers, 30 forwards, and 50 guards on my list, so I shouldn't be surprised that guards are overrepresented. I would say the only surprises are Kristi Cirone of Illinois State and Jenna Schone of Miami of Ohio, both from small schools.

Top Ten Centers



I'm very surprised that the quality of centers drops off quickly after Courtney Paris. I don't know if centers get short shrift for having low WPPR or for the fact that most of the centers in women's college basketball don't seem to produce a lot of Wins Efficiency.

Top Twenty Forwards



Forwards also drop off quickly, although not as quickly as centers do. There are four good forwards to be had early in the draft. Most of the good centers come from the big schools.

Top Twenty Guards



Guards are the quality position in the metric. Even down at the twentieth position, there are decent guards. Furthermore, the small schools (Tiera DeLaHoussay at Western Michigan, Shantia Grace at South Florida) provide a large number of great guards. If you want to sneak up on someone in the draft, you can probably get a quality guard from an overlooked school.

I also wondered if guards are the prime position for players to make their presence known in women's basketball. The talent level at the women's college game is thinner than in the men's game, and it's more likely that tall (but not truly athletic) women have been pushed into the post positions in high school.

(* * *)

I'll try to keep this list updated, and to have the senior stats analyzed at the end of the 2009 season. Oddly enough, Hollinger says that good NBA players have a decrease in their statistics between their junior and senior years. So if one of your favorite juniors has a bad year...well, don't lose hope. And if one of your favorite juniors isn't ranked high on my list...well, 2008-09 has the power to change everything.

5 comments:

Q said...

I know very little about women's college bb players (don't pay attention till tourney time), but this seems like a great analysis...and a great tool for fans to watch the season with.

A few comments:

I wouldn't think your formula gives too much credit to guards because winscore favors rebounders, which obviously favors bigger players...so I think there's balance here.

Would be interesting to compare this data to last year's class and see what you find -- if Candace Parker, Sylvia Fowles, or Candice Wiggins weren't on top for some reason, that might tell you what you're over/under weighting...

Anonymous said...

That was my first thought - let's take the same formula to last year's class...

The most interesting thing that you learn on the inside of a program that can't be factored in stats is the intangible team chemistry/personality/off-court hassle factor, which is impossible to know unless you are close to a program and its players...a great example is a player you would know being in the Atlanta area which was Tasha Humphrey, who has a lot of talent but also a lot of issues and thus dropped farther in the draft than many expected (and despite playing well at times is now being shopped without much other team interest)...I think that may be the toughest thing for fans to understand when they see a team trade or cut someone who on the face of it looks promising....

Anonymous said...

I have to say that I would take age out of the math...most of the players are not at all close to their peak whether "young" or "old" in their senior year, and while it may not have really changed your math it just does not make a lot of sense to me as a significant factor (factoring in past history of injuries would be far more interesting although not mathematical).

That issue about height is very important that you mentioned at the end...and courtney paris is an interesting case study to watch as she gets into the league next year even though if someone can overcome the height issue she might be the one to do it. Crystal Langhorne would be interesting to see in your formula from last year - she was impressive in college but her game looked very suspect looking forward to the W given that she would have to be able to create against taller players often. Jury is still out on whether she will prove able to do it or not to justify how high she went in the draft. I think we will slowly see same evolution we saw in mens game where average heights at all positions except maybe PG start trending higher.

pt said...

anon, you happened to read my mind. Guess it's time to release the next post, which applies to personality issues....

pt said...

anon II,

I really regret not being able to put the height cutoffs in the analysis. There's so much I wanted to do but was unable to. If I can get height-related data I'll definitely put it in the next analysis.

I just copycatted Hollinger's theories. Angel McCoughtry, for example, took a hit because of age -- and maybe she didn't deserve to take that hit. She'd probably be in the top 5 or top 3 without the age penalty. I still really don't know enough about how the age issue reflects player comparison, but I didn't have any good reason to remove it at the time.