Friday, May 29, 2009

New Project: The Hall of Fame Projector



I've been looking at the Hall of Fame Probability indicator from Basketball-Reference.com with a bit of envy, and I wanted to do the same thing for WNBA players. Since not enough WNBA players have yet been inducted into the Women's Basketball Hall of Fame to show what kinds of stats make a Hall of Famer, I decided to look at the NBA model.

The creator of the predictor used a pool of 668 players. Of the 668 players, 78 had been elected to the (Men's Basketball) Hall of Fame and 590 had not. He then ran a statistical model - probably a multivariate regression (the shape of the formula suggests a logistic one), using something a lot more powerful than an Excel spreadsheet. He found a formula that takes the following values

player height
NBA points per game
NBA rebounds per game
NBA assists per game
NBA All-Star Game selections
NBA MVP "award shares"
NBA championships won

for any player whose final season was after 1959-60. The formula the creator came up with works in two steps.

Probability of Hall of Fame selection

Step 1: Find X, where X = a*height + b*PPG + c*RPG + d*APG + e*(All-Star selections) + f*(MVP award shares) + g*(championships). The values of a through g are in the link. (The coefficient e here is not the same thing as the Euler number e in Step 2.)

Step 2: Calculate e^X/(e^X + 1). The value, a number between 0 and 1, is the Hall of Fame probability.

My question: until we have enough WNBA Hall of Famers to determine a formula of our own, could we use the NBA formula in the meantime?

There are two problems. The first is the fact that an NBA game is 48 minutes long, and a WNBA game is 40 minutes long. This means that points, assists, and rebounds per game would have to be scaled up to a 48-minute game for WNBA players. One could either multiply the per-game values by 48/40 - or, much more easily, multiply the game-based coefficients by 48/40, which is 1.20.
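Here is a quick sketch of why the two options come out the same. The coefficients and the stat line below are made-up placeholders of mine, not the real values from the linked formula; the only thing being shown is that scaling the stats and scaling the coefficients give the same contribution to X.

```python
import math

MINUTES_RATIO = 48 / 40  # NBA game length / WNBA game length = 1.2

# Hypothetical per-game coefficients (NOT the real ones from the link).
coeffs = {"ppg": 0.4, "rpg": 0.3, "apg": 0.5}
# Hypothetical WNBA stat line, also just for illustration.
stats = {"ppg": 15.0, "rpg": 6.0, "apg": 3.0}

# Option 1: stretch each per-game stat to a 48-minute game, keep the coefficients.
total_scaled_stats = sum(coeffs[k] * (stats[k] * MINUTES_RATIO) for k in stats)

# Option 2: keep the raw WNBA stats, multiply each coefficient by 1.2 once.
scaled_coeffs = {k: v * MINUTES_RATIO for k, v in coeffs.items()}
total_scaled_coeffs = sum(scaled_coeffs[k] * stats[k] for k in stats)

# Both routes contribute the same amount to X (up to floating-point rounding).
print(math.isclose(total_scaled_stats, total_scaled_coeffs))
```

Option 2 is less work because the multiplication happens once, on three coefficients, instead of once per player.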

The other problem is the height problem. We can't generalize NBA height to WNBA height. We would have to run our own complex height-based multivariate regression based on data we don't have. Therefore, we either have to throw height out of the equation, or equalize it.

Throwing height out of the equation in Step 1 might mess up Step 2 completely. Therefore, we will equalize instead and assume all players are "average height". The average NBA height is 6 feet 7 inches, or 79 inches. We multiply the height coefficient by that height (-0.20518 * 79) to get -16.209. The "height based part" of our equation is therefore always "equalized" to a number around -16.209.
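The height arithmetic is easy to check. This little sketch uses only the two numbers quoted above; the variable names are mine.

```python
HEIGHT_COEFF = -0.20518          # height coefficient quoted above
AVG_NBA_HEIGHT_IN = 6 * 12 + 7   # 6 feet 7 inches, in inches

# With every player assumed to be average height, the height term is a constant.
height_term = HEIGHT_COEFF * AVG_NBA_HEIGHT_IN

print(AVG_NBA_HEIGHT_IN)         # 79
print(round(height_term, 3))     # -16.209
```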

We come up with a new formula:

WNBA Hall of Fame Probability Calculator

Step 1: Calculate X = -16.2 +
0.54 * points per game +
0.45 * rebounds per game +
0.47 * assists per game +
0.49 * number of All-Star/Olympic selections since 1997 +
3.18 * MVP shares +
1.03 * WNBA championships.

Step 2: Calculate

Prob (WNBA Hall of Fame) = e ^ X / (1 + e ^ X)

e is Euler's number (about 2.718), which you'll find on a scientific calculator. ^ means "to the power of". The above equation would be read "e to the power of X, divided by the quantity one plus e to the power of X".
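If you'd rather not punch Step 2 into a calculator, here is a minimal Python sketch of it. The function name is my own; the formula is exactly the one above.

```python
import math

def step_two(x: float) -> float:
    """Turn the Step 1 value X into a number strictly between 0 and 1."""
    return math.exp(x) / (1.0 + math.exp(x))

# A few sample values of X to show how the output behaves.
for x in (-5, 0, 5):
    print(x, round(step_two(x), 4))
# -5 0.0067
# 0 0.5
# 5 0.9933
```

An X of zero is a coin flip, a large negative X is close to no chance, and a large positive X is close to a lock.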

(* * *)

Okay. We have a formula. But does it mean anything? We'll try to calculate Chamique Holdsclaw's Hall of Fame Probability.

Chamique's career averages are 17.66 ppg, 8.28 rpg and 2.6 apg. We get those from her career statistics.

For number of All-Star selections, Holdsclaw's number is five (5). This number would include any appearances on a player's national Olympic team during a year when the All-Star Game wasn't played. This number only counts appearances from 1997 and beyond.

WNBA championships is simple. Holdsclaw has never appeared on a WNBA championship team. The number is zero.

Now we have this strange number called "MVP shares". It's a way to look at how popular a player was in MVP voting.

Example: Jane Doe was named WNBA MVP in 2010. She received 500 votes. Rhonda Roe received 100 votes for MVP that year. How many MVP shares did each player earn in 2010?

We consider the number of votes earned by the WNBA MVP winner for any given year to be the basis of a share. For the 2010 example above, 500 votes equals one WNBA MVP share. Therefore, Jane Doe receives 500/500 MVP Shares in 2010, or 1.0. (The MVP for the year always gets 1.0.) Rhonda Roe gets 100/500 = 0.2 of a share for that year - Rhonda's performance is "twenty percent" of the MVP's for the purposes of MVP voting.
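The Jane Doe example can be sketched in a few lines. This follows the definition used in this post (a player's votes divided by the winner's votes); the names and vote counts are the made-up ones from the example.

```python
# Made-up 2010 vote totals from the example above.
votes_2010 = {"Jane Doe": 500, "Rhonda Roe": 100}

# The MVP winner is whoever got the most votes; her total defines one share.
winner_votes = max(votes_2010.values())

# Each player's share is her votes divided by the winner's votes.
shares_2010 = {name: v / winner_votes for name, v in votes_2010.items()}

print(shares_2010)  # {'Jane Doe': 1.0, 'Rhonda Roe': 0.2}
```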

The following quotients are the number of votes Holdsclaw received divided by the number of votes earned by the WNBA MVP winner that year.

1999 24/397
2000 18/527
2001 8/563
2002 169/482
2003 71/406
2004 0/425
2005 17/327
2006 0/508
2007 0/473

The sum of all those fractions is 0.69. This is how many MVP Shares Chamique Holdsclaw has earned in her career.
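To check that sum, here is a short sketch that adds up the fractions from the table above. The vote counts are copied straight from the list; only the variable names are mine.

```python
# year: (votes for Holdsclaw, votes for that year's MVP winner)
holdsclaw_votes = {
    1999: (24, 397),
    2000: (18, 527),
    2001: (8, 563),
    2002: (169, 482),
    2003: (71, 406),
    2004: (0, 425),
    2005: (17, 327),
    2006: (0, 508),
    2007: (0, 473),
}

# Career MVP shares = sum of each season's (her votes / winner's votes).
career_shares = sum(hers / winners for hers, winners in holdsclaw_votes.values())

print(round(career_shares, 2))  # 0.69
```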

Let's do the calculation:

X = -16.2 + (0.54 * 17.66) + (0.45 * 8.28) + (0.47 * 2.6) + (0.49 * 5) + (3.18 * 0.69) + (1.03 * 0) = 2.93

e ^ 2.93 / (1 + e ^ 2.93) = 18.73/19.73 = 0.9493

According to the formula above, Holdsclaw has a 94.9 percent chance of being named to a WNBA Hall of Fame (if one existed) based on her current statistics.
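For anyone who wants to rerun the numbers, here is a sketch of the whole Holdsclaw calculation. The coefficients and her stat line are the ones given in this post; the dictionary keys and variable names are just my own labels.

```python
import math

INTERCEPT = -16.2  # the "equalized" height term
COEFFS = {
    "ppg": 0.54,
    "rpg": 0.45,
    "apg": 0.47,
    "all_star": 0.49,
    "mvp_shares": 3.18,
    "titles": 1.03,
}

holdsclaw = {
    "ppg": 17.66,
    "rpg": 8.28,
    "apg": 2.6,
    "all_star": 5,
    "mvp_shares": 0.69,
    "titles": 0,
}

# Step 1: the weighted sum X.
x = INTERCEPT + sum(COEFFS[k] * holdsclaw[k] for k in COEFFS)

# Step 2: convert X into a probability.
prob = math.exp(x) / (1.0 + math.exp(x))

print(round(x, 2), round(prob, 3))  # 2.93 0.949
```

That works out to roughly 94.9 percent. Swapping in another player's stat line is just a matter of changing the second dictionary.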

I'd love to have a value of this metric for every WNBA player. I'll run some numbers for the Atlanta Dream roster to see what comes up.




Notes:

1) If you follow the original link, you'll notice that there's a negative coefficient next to height. This has the effect of punishing players for above-average height and rewarding players with below-average height: given identical stats, the shorter player gets the higher probability.

2) Also note the large coefficient associated with MVP shares. This also makes sense, since how good a player is should be reflected in the number of MVP votes they've received over their career. By definition, a Hall of Fame player should be someone who is thought to be "MVP worthy".

3) I would set a minimum of 175 games played before a player is even considered for an accurate calculation. The people at Basketball-Reference use 400 games, a somewhat equivalent number given the NBA's longer seasons. Chamique's 225 games fit the bill.

2 comments:

pilight said...

I played around with that once. I eventually realized that both the Naismith HOF and the WBHOF have very different standards for women than the Naismith has for men. You're basically figuring a percentage chance to get into a Hall that doesn't exist.

pt said...

Yup, pretty much. I figure that the "projection" is to a hypothetical Hall of Fame. And who knows, maybe we'll have one someday?