Does winning matter for Hall of Fame Induction (part 4)?
(see part 1, part 2 and part 3 in previous posts)
Below are the current and recently retired players with predicted probabilities over .5, as well as a few other players of interest. Players who have been elected are denoted “*” and players who are eligible are denoted “+”.
The real surprises here are the players at the end of the list, several of whom might be first-ballot Hall of Famers were there no steroids controversy. There are also several players who are generally considered "borderline" candidates who score particularly well. I’m not sure exactly what to make of the models' seemingly poor performance on recent players.
The other interesting result is the difference between the two specifications for players who spent most of their careers with good teams (e.g. Jeter, Manny, Chipper) or poor teams (e.g. Dawson, Sosa, Sandberg; why are they all Cubs?). I think this definitely suggests that winning makes a difference, though I think Model 3 might overstate its impact.
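To put numbers on that gap, here is a minimal Python sketch that computes the percentage-point change from Model 1 to Model 3 for the six players named above. The probabilities are copied from the table in this post; the `probs` dictionary and the `model_shift` helper are illustrative names only, and the sketch assumes Model 3 is the specification that includes the team-success variables.

```python
# Model 1 and Model 3 induction probabilities (percent), copied from the
# table in this post. Names below are illustrative, not part of the model.
probs = {
    "Derek Jeter": (21.1, 58.8),
    "Manny Ramirez": (19.3, 54.9),
    "Chipper Jones": (24.1, 58.6),
    "Andre Dawson": (74.4, 58.0),
    "Sammy Sosa": (46.1, 25.1),
    "Ryne Sandberg": (68.4, 34.9),
}


def model_shift(model1, model3):
    """Percentage-point change when moving from Model 1 to Model 3."""
    return round(model3 - model1, 1)


shifts = {name: model_shift(m1, m3) for name, (m1, m3) in probs.items()}

# Print players from the largest gain to the largest loss.
for name, shift in sorted(shifts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {shift:+6.1f}")
```

The three players from good teams all gain more than 30 points under Model 3, while the three Cubs all lose at least 16, with Sandberg dropping by more than 30.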
Next time I’ll look at how changes in productivity or team performance might have affected players’ Hall of Fame chances. Due to other projects, this may not happen until sometime next week.
| Name | endyr | position | Prob. of Induction (Model 1) | Prob. of Induction (Model 3) |
| Barry Bonds | 2006 | OF | 100% | 100% |
| Rickey Henderson | 2003 | OF | 100% | 100% |
| *Cal Ripken | 2001 | SS | 98.9% | 98.3% |
| *Dave Winfield | 1995 | OF | 96.2% | 95.0% |
| *Eddie Murray | 1997 | 1B | 96.0% | 97.9% |
| Craig Biggio | 2006 | 2B | 94.1% | 94.3% |
| Frank Thomas | 2006 | 1B | 92.8% | 96.4% |
| Rafael Palmeiro | 2005 | 1B | 91.1% | 90.0% |
| Ken Griffey | 2006 | OF | 89.2% | 73.8% |
| *Paul Molitor | 1998 | 3B | 86.2% | 93.6% |
| Roberto Alomar | 2004 | 2B | 86.1% | 82.9% |
| Jeff Bagwell | 2005 | 1B | 85.0% | 85.4% |
| *Tony Gwynn | 2001 | OF | 83.2% | 76.3% |
| Tim Raines | 2002 | OF | 80.4% | 84.9% |
| Alex Rodriguez | 2006 | SS | 78.3% | 57.7% |
| *Wade Boggs | 1999 | 3B | 77.8% | 89.3% |
| +Andre Dawson | 1996 | OF | 74.4% | 58.0% |
| Barry Larkin | 2004 | SS | 71.6% | 70.9% |
| Gary Sheffield | 2006 | OF | 71.0% | 74.4% |
| *Ozzie Smith | 1996 | SS | 68.7% | 68.0% |
| *Ryne Sandberg | 1997 | 2B | 68.4% | 34.9% |
| Ivan Rodriguez | 2006 | C | 66.7% | 69.5% |
| Fred McGriff | 2004 | 1B | 54.9% | 53.8% |
| Larry Walker | 2005 | OF | 54.6% | 38.7% |
| Sammy Sosa | 2005 | OF | 46.1% | 25.1% |
| Mike Piazza | 2006 | C | 45.6% | 47.2% |
| Chipper Jones | 2006 | 3B | 24.1% | 58.6% |
| Derek Jeter | 2006 | SS | 21.1% | 58.8% |
| +Mark McGwire | 2001 | 1B | 21.1% | 22.9% |
| Manny Ramirez | 2006 | OF | 19.3% | 54.9% |

2 Comments:
Interesting analysis. A few suggestions for your model:
1) Most important: only count HOFers chosen by the writers. That is the standard we care about, and including Vet Comm choices really lowers the standards. That's the main reason so many of your contemporary players are getting unrealistically high probability scores.
2) You need some measure of how brightly the candle burned, not just how long. A given linear weights total accumulated over 14 years describes a very different player than the same total spread over 19 years. The easiest fix is a rate measure such as OPS or OPS+.
3) It would also help to capture peak performance better. If you have top-5 or top-10 MVP finishes (not just winners), adding them would probably strengthen the model. And add some variables that measure dominance in key categories, such as "Black Ink" and/or Silver Slugger awards.
4) Piazza's absurdly low rating makes me think that the position dummies aren't doing as much as you want them to. Better would be position-adjusting each hitter's OPS+. (But that would be a lot of work.)
To follow up, you won't be sure if your team win% and World Series variables really matter until your model better captures players' rate value (as opposed to cumulative value). Otherwise, it's possible those 2 variables are capturing these other elements.
Thinking some more about this, you might want to include OBP+ and SLG+ separately, rather than OPS+, in that HOF voters may value SLG more highly.
Small point: It might be better to use AB/PA or G, rather than seasons. It will help value players who missed a lot of playing time, like Larkin and Larry Walker, more accurately.