January 27, 2008

A description of Win Score, valuable for future posts

This upcoming series of posts stems from UCLA’s loss to USC at Pauley last week. I had told nearly everybody I knew that Davon Jefferson was actually a far more valuable player to the Trojans than O.J. Mayo, and then Jefferson went out and dropped 25 on the Bruins and was pretty clearly the player of the game that night. I felt vindicated in a sense, though obviously extremely hollow inside. But it did get me thinking: just how much more valuable IS Davon Jefferson than O.J. Mayo?

Which brings me to these posts. Over the next couple of days, I’ll be putting up a few entries in which I analyze the local collegiate basketball teams using some modern-ish statistical analysis. This particular post will cover the concept of ‘Win Score,’ which will be the main analytical tool used. I know, “Warning: Numbers!” and all, but if you want to understand the posts that follow you probably need at least this primer’s worth of info.

To begin, this requires a bit of background on David Berri, the economist I referenced briefly in my previous post, who co-authored the book The Wages of Wins. He also writes a stats-oriented blog, The Wages of Wins Journal. Much of his work is built around ‘Wins Produced,’ a complicated formula that he describes in The Wages of Wins. Its goal is to quantify player performance by combining a player’s statistics into one value that gives an approximate accounting of that player’s worth to his team. I have my concerns about applying statistical analysis to a sport like basketball (as opposed to its natural sporting environment of baseball), and additional worries about Berri’s system in particular, but the idea in itself is elegant, and surprisingly accurate. Plug a full list of a team’s players, stats, and minutes played into his formulas, and they generally spit out a team win/loss total that is very close to the actual record. So he’s on the right track, at the very least.

Now, that’s all well and good, but Wins Produced is very complex. To help with that, Berri devised a simpler system, which he called Win Score. As he describes it, “Win Score is designed to be a simple metric that allows one to see quickly if a player had a good or bad game. And for research where you only wish to compare a player’s current performance to his past performance, Win Score is perfectly suited for such a task.” Win Score is built around concepts similar to those behind the more complex Wins Produced, only the math is far less involved. It’s not a perfect substitute when comparing one player against another, but it’ll do. The basic gist is that the most valuable players are the ones who contribute in all facets of the game and do so efficiently (e.g. high field-goal and free-throw percentages, a good assist-to-turnover ratio, few fouls). Seems pretty simple, and it makes a fair deal of sense. In effect:

Tenet 1: More things matter than just scoring.
Tenet 2: Anybody can score at least 25 points in a game if he shoots the ball 60 times, but his team isn’t going to do very well.
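
For anyone who wants to see the arithmetic, here is a rough sketch of Win Score in Python. The weights are the ones I recall from Berri’s published description (points, rebounds, and steals count in full; blocks and assists count half; shot attempts and turnovers cost a full point; free-throw attempts and fouls cost half), so treat them as an assumption to check against the book. The BoxLine class, the win_score function, and both stat lines are made up for illustration; neither line belongs to a real player.

```python
from dataclasses import dataclass


@dataclass
class BoxLine:
    """One player's box-score line for a single game (hypothetical container)."""
    pts: int = 0   # points
    reb: int = 0   # total rebounds
    stl: int = 0   # steals
    blk: int = 0   # blocks
    ast: int = 0   # assists
    fga: int = 0   # field-goal attempts
    fta: int = 0   # free-throw attempts
    tov: int = 0   # turnovers
    pf: int = 0    # personal fouls


def win_score(line: BoxLine) -> float:
    """Win Score with the weights as I understand them (an assumption)."""
    return (
        line.pts + line.reb + line.stl
        + 0.5 * line.blk + 0.5 * line.ast
        - line.fga - 0.5 * line.fta
        - line.tov - 0.5 * line.pf
    )


# Tenet 2 in action: 25 points on 60 shots, and nothing else.
gunner = BoxLine(pts=25, fga=60)

# The same 25 points on 15 shots, with some rebounds and assists (Tenet 1).
efficient = BoxLine(pts=25, reb=8, ast=4, fga=15, fta=6, tov=2, pf=2)

print(win_score(gunner))     # -35.0
print(win_score(efficient))  # 14.0
```

Same point total, wildly different scores: the 60 shots wipe out the 25 points and then some, while the efficient line gets credit for everything it did besides scoring.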

That’s a bit simple, but it’s the basics. A model this simple is easy to apply as long as the right data is available. Fortunately, with the advent of the internet and fantasy sports, statistical data is only a mouse click away. Armed with the statistics I needed and a simple, fairly handy model, I decided to take a look at just how much Mayo really adds to his team, at least by this particular metric’s reckoning. That post will be coming tomorrow.
