Friday, May 29, 2009
I've been looking at the Hall of Fame Probability indicator from Basketball-Reference.com with a bit of envy and I wanted to do the same thing for WNBA players. Since WNBA players have not yet appeared at the Women's Basketball Hall of Fame in great enough numbers to give an indication of the kinds of stats that make a Hall of Famer, I decided to look at the NBA model.
The creator of the predictor used a pool of 668 players. Of the 668 players, 78 had been elected to the (Men's Basketball) Hall of Fame and 590 had not. He then ran a statistical model - probably a multivariate logistic regression, judging by the shape of the formula below - using something which is a lot more powerful than an Excel spreadsheet. He found a formula that takes a player's height along with the following values
NBA points per game
NBA rebounds per game
NBA assists per game
NBA All-Star Game selections
NBA MVP "award shares"
NBA championships won
for any player whose final season was after 1959-60. The creator comes up with this formula.
Probability of Hall of Fame selection
Step 1: Find X, where X = a*height + b*PPG + c*RPG + d*APG + e*(all star selections) + f*(MVP award shares) + g*(championships). The values of a through g are in the link.
Step 2: Calculate e^X/(e^X + 1). The value, expressed as a number between 0 and 1, is the Hall of Fame probability.
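Step 2 is the standard logistic function, which squeezes any value of X into the range between 0 and 1. Here is a minimal Python sketch of just that step. The function name `logistic` is my own label for illustration and does not come from the Basketball-Reference page.

```python
import math

def logistic(x):
    """Map any real number x to a value between 0 and 1 via e^x / (e^x + 1)."""
    return math.exp(x) / (math.exp(x) + 1.0)

# x = 0 sits exactly on the fence; large positive x approaches 1,
# large negative x approaches 0.
for x in (-5.0, 0.0, 5.0):
    print(x, round(logistic(x), 4))
```

The practical upshot is that every extra point of X matters most when X is near zero, and matters very little once a player is already far above or below it.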
My question: until we have enough WNBA Hall of Famers to determine a formula of our own, could we use the NBA formula in the meantime?
There were two problems. The first is the fact that an NBA game is 48 minutes long, and a WNBA game is 40 minutes long. This means that points, assists, and rebounds per game would have to be expanded to a 48-minute game for WNBA players. One could either multiply the per-game values by 48/40 - or, much more easily, multiply the game-based coefficients by 48/40 or 1.20.
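To see why the two options give the same answer, here is a small Python sketch. The coefficient and the points-per-game figure are made-up placeholders chosen only for illustration; they are not the real values from the NBA model.

```python
import math

MINUTES_SCALE = 48 / 40  # NBA game length divided by WNBA game length

# Hypothetical values for illustration only; not the real model coefficient.
example_coefficient = 0.5
wnba_points_per_game = 15.0

# Option 1: stretch the stat to a 48-minute game, keep the coefficient.
scaled_stat_term = example_coefficient * (wnba_points_per_game * MINUTES_SCALE)

# Option 2: keep the stat as-is, stretch the coefficient instead.
scaled_coefficient_term = (example_coefficient * MINUTES_SCALE) * wnba_points_per_game

print(scaled_stat_term, scaled_coefficient_term)
print(math.isclose(scaled_stat_term, scaled_coefficient_term))
```

Scaling the coefficients is the more convenient choice because it is done once, after which raw WNBA per-game numbers can be plugged in directly.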
The other problem is the height problem. We can't generalize NBA height to WNBA height. We would have to run our own complex height-based multivariate regression based on data we don't have. Therefore, we either have to throw height out of the equation, or equalize it.
Throwing height out of the equation in Step 1 might mess up Step 2 completely. Therefore, we will equalize instead and assume all players are "average height". The average NBA height is 6 feet 7 inches, or 79 inches. We multiply the height coefficient by that height (-0.20518 * 79) to get -16.209. The "height based part" of our equation is therefore always "equalized" to a number around -16.209.
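The height arithmetic is easy to check with a couple of lines of Python. The constant names here are my own; the coefficient and the average height are the ones quoted above.

```python
HEIGHT_COEFFICIENT = -0.20518            # height coefficient from the NBA model
AVERAGE_NBA_HEIGHT_INCHES = 6 * 12 + 7   # 6 feet 7 inches expressed in inches

# With every player assumed to be average height, this term is a constant.
height_term = HEIGHT_COEFFICIENT * AVERAGE_NBA_HEIGHT_INCHES
print(AVERAGE_NBA_HEIGHT_INCHES, round(height_term, 3))
```

Because the term no longer varies from player to player, it simply acts as a fixed starting point for X.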
We come up with a new formula:
WNBA Hall of Fame Probability Calculator
Step 1: Calculate X = -16.2 +
0.54 * points per game +
0.45 * rebounds per game +
0.47 * assists per game +
0.49 * number of All-Star/Olympic selections since 1997 +
3.18 * MVP shares +
1.03 * WNBA championships.
Step 2 : Calculate
Prob (WNBA Hall of Fame) = e ^ X / (1 + e ^ X)
e is "Euler's number" on the calculator (about 2.718). ^ means "to the power of". The above equation would be read "e to the power of X, divided by the quantity one plus e to the power of X".
(* * *)
Okay. We have a formula. But does it mean anything? We'll try to calculate Chamique Holdsclaw's Hall of Fame Probability.
Chamique's career averages are 17.66 ppg, 8.28 rpg and 2.6 apg. We get those from her career statistics.
For Number of All-Star selections, Holdsclaw's number is five (5). This number would include any appearances on your nation's Olympic team during a year when the All-Star Game wasn't played. This number only counts appearances from 1997 and beyond.
WNBA championships is simple. Holdsclaw has never appeared on a WNBA championship team. The number is zero.
Now we have this strange number called "MVP shares". It's a way to look at how popular a player was in MVP voting.
Example: Jane Doe was named WNBA MVP in 2010. She received 500 votes. Rhonda Roe received 100 votes for MVP that year. How many MVP shares did each player earn in 2010?
We consider the number of votes earned by the WNBA MVP winner for any given year to be the basis of a share. For the 2010 example above, 500 votes equals one WNBA MVP share. Therefore, Jane Doe receives 500/500 MVP Shares in 2010, or 1.0. (The MVP for the year always gets 1.0.) Rhonda Roe gets 100/500 = 0.2 of a share for that year - Rhonda's performance is "twenty percent" of the MVP's for the purposes of MVP voting.
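The share calculation is just a division, repeated for each season and then added up. Here is a minimal Python sketch using the made-up Jane Doe and Rhonda Roe numbers from the example. The function names `mvp_share` and `career_mvp_shares` are hypothetical helpers of my own.

```python
def mvp_share(player_votes, winner_votes):
    """Share of an MVP award for one season: player's votes / winner's votes."""
    if winner_votes <= 0:
        raise ValueError("winner_votes must be positive")
    return player_votes / winner_votes

def career_mvp_shares(seasons):
    """Sum season shares; `seasons` is a list of (player_votes, winner_votes)."""
    return sum(mvp_share(player, winner) for player, winner in seasons)

# The made-up 2010 example from the text.
jane_doe = mvp_share(500, 500)
rhonda_roe = mvp_share(100, 500)
print(jane_doe, rhonda_roe)
```

A career total would come from calling `career_mvp_shares` with one pair of vote counts for each season in which the player received MVP votes.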
For each season, we take the number of votes Holdsclaw received divided by the number of votes earned by the WNBA MVP winner that year.
The sum of all those fractions is 0.69. This is how many MVP Shares Chamique Holdsclaw has earned in her career.
Let's do the calculation:
X = -16.2 + (0.54 * 17.66) + (0.45 * 8.28) + (0.47 * 2.6) + (0.49 * 5) + (3.18 * 0.69) + (1.03 * 0) = 2.93
e ^ 2.93 / (1 + e ^ 2.93) = 18.73/19.73 = 0.9493
According to the formula above, Holdsclaw has a 94.9 percent chance of being named to the WNBA Hall of Fame (if it existed) based on her current statistics.
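For anyone who would rather not punch this into a calculator, here is a short Python sketch of the whole calculator, run on the Holdsclaw numbers above. The function name `wnba_hof_probability` and the dictionary keys are my own inventions; the coefficients and stats are the ones listed in this post.

```python
import math

# Coefficients as listed in the WNBA calculator above.
INTERCEPT = -16.2  # the "equalized" height term
COEFFICIENTS = {
    "ppg": 0.54,
    "rpg": 0.45,
    "apg": 0.47,
    "all_star": 0.49,
    "mvp_shares": 3.18,
    "championships": 1.03,
}

def wnba_hof_probability(stats):
    """Return (X, probability) for a dict of stats keyed like COEFFICIENTS."""
    x = INTERCEPT + sum(COEFFICIENTS[key] * stats[key] for key in COEFFICIENTS)
    return x, math.exp(x) / (1.0 + math.exp(x))

holdsclaw = {
    "ppg": 17.66,
    "rpg": 8.28,
    "apg": 2.6,
    "all_star": 5,
    "mvp_shares": 0.69,
    "championships": 0,
}

x, probability = wnba_hof_probability(holdsclaw)
print(f"X = {x:.2f}, probability = {probability:.4f}")
```

The script does not round X to 2.93 before Step 2, so the last digit of the probability may differ slightly from the hand calculation.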
I'd love to have a value of this metric for every WNBA player. I'll run some numbers for the Atlanta Dream roster to see what comes up.
1) If you follow the original link, you'll notice that there's a negative coefficient next to height. This has the effect of punishing players for above-average height and rewarding players with below-average height.
2) Also note the large coefficient associated with MVP shares. This also makes sense, since the number of MVP votes a player has received over a career should be one of the best measures of how good that player is. By definition, a Hall of Fame player should be someone who is thought to be "MVP worthy".
3) I would set a minimum of 175 games played to be even considered for an accurate calculation. The people at basketball-reference use 400 games, a somewhat equivalent number. Chamique's 225 games fit the bill.