Which NBA teams are the worst without their best player?
The MVP discussion in the NBA has been pretty contentious lately, with a fair few names in the pot: Jokic, Embiid, Curry, even Julius Randle are all being considered (though Jokic is the outstanding frontrunner). The arguments that can be made for each player are far and wide, aided by the fact that there are no set criteria for what exactly constitutes a player being the “most valuable” in the NBA. Debating what exactly the basis for the MVP award should be is a pain, but I thought of an interesting hypothetical: what if you removed the best player from each team entirely from the NBA? What if, after the season is over, the player with the largest literal impact on his team’s games could be dubbed the “most valuable player” in the league?
It sounds pretty simple, but this gets pretty complex in a game like basketball, where there are a lot of ways that a player can contribute to the game. The easy one is points. Take a game, remove the points from the best player on both teams, and the new score determines who had a larger impact: if the game’s result stayed the same, neither player had a decisive impact (the winning team could’ve won the game even without its best player), but if the game flipped, then the player on the now-losing side would be the more valuable player (because his contributions were removed, his team now loses the game). This is obviously a silly way of measuring “impact”, but let’s roll with it because I think it’s an interesting hypothetical. But there are a couple things to decide before we can start.
First, who is the best player on each team? This is a straightforward question for some teams, less so for others. Let’s use the 2019/20 season first. Here’s my take on every team’s best player that season:
Some teams had two players that could both claim to be the best player on the team (like the Utah Jazz, where both Gobert and Mitchell could make a solid case), so I used games played as a tiebreaker. After all, availability is the best ability. If games played were tied (Terry Rozier and Devonte Graham somehow managed to both play exactly 63 games in the 19/20 season), I used points per game because that’s the primary metric I’m using for this analysis. Using games played as the first tiebreaker also means that I would be able to make direct comparisons for more games, as there are inevitably going to be games where one team’s best player is out and the other is playing, giving the active player an advantage.
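That tiebreak order (games played first, points per game second) maps neatly onto a sort key. Here’s a minimal sketch in Python, not necessarily how the project code does it: the Candidate class, the pick_best_player function, and every number in the shortlist are placeholders invented for illustration, not real stat lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One contender for a team's 'best player' slot (hypothetical structure)."""
    name: str
    games_played: int
    points_per_game: float


def pick_best_player(candidates):
    """Pick the contender with the most games played; break ties on points per game."""
    # Tuples compare element by element, so points_per_game only matters
    # when games_played is equal.
    return max(candidates, key=lambda c: (c.games_played, c.points_per_game))


# Placeholder numbers, NOT real stats.
shortlist = [
    Candidate("Player A", 63, 18.0),
    Candidate("Player B", 63, 18.2),
    Candidate("Player C", 58, 21.5),
]

best = pick_best_player(shortlist)
print(best.name)  # Player B: tied with Player A on games, ahead on points per game
```

Player C has the highest scoring average in this made-up shortlist but loses out because availability comes first in the key.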
Now that the players are sorted, it’s time to discuss methodology. I took every game from the 2019/20 season and subtracted each best player’s point total from his team’s final score. If the adjusted score was a tie (which can’t happen in the NBA), I added up each best player’s assists and rebounds and subtracted those totals from the adjusted scores as well. If that was still a tie, I took the best players’ blocks + steals and subtracted those too to determine a winner. Essentially, if removing the best players’ contributions didn’t change the outcome of the game, no points would be awarded to anyone. If it did change the outcome, the now-losing team’s best player would get the point, as that meant that the best player on the formerly-winning, now-losing team was valuable enough that if he had not been playing, his team would have lost the match. If that sounds a little confusing, here are a few examples that might help you visualize what this analysis does:
Scenario 1: The original final score of a match was Raptors 109, Nuggets 117. In that game, the Raptors’ best player, Pascal Siakam, scored 3 points, and the Nuggets’ best player, Nikola Jokic, scored 2 points. We subtract 3 from the Raptors’ score and 2 from the Nuggets’ score to get a new final score of Raptors 106, Nuggets 115. Because neither player’s points changed the outcome of the game, neither of them was truly “valuable” to his team: even if they hadn’t played, the Nuggets would’ve still won. Thus, no one gets a point, and the code moves on to the next match.
Scenario 2: The original final score of a match was Clippers 112, Lakers 113. In that game, the Clippers’ best player, Kawhi Leonard, scored 21 points, and the Lakers’ best player, LeBron James, scored 27 points. We subtract 21 from the Clippers’ score and 27 from the Lakers’ score to get a new final score of Clippers 91, Lakers 86. Because subtracting the best players’ points changed the outcome of the match (from a Lakers win to a Clippers win), we give a point to LeBron James.
Scenario 3: The original final score of a match was Hawks 99, Pistons 101. In that game, the Hawks’ best player, Trae Young, scored 30 points, and the Pistons’ best player, Derrick Rose, scored 32 points. We subtract 30 from the Hawks’ score and 32 from the Pistons’ score to get a new final score of Hawks 69, Pistons 69. This results in a tie, which means that we move on to assists+rebounds. Trae Young had an assists+rebounds total of 11, while Derrick Rose had an assists+rebounds total of 11 as well. We subtract 11 from both teams’ scores, and we end up with a new final score of Hawks 58, Pistons 58. Because it is still a tie, we go to blocks+steals. Trae Young had a blocks+steals total of 0, while Derrick Rose had a blocks+steals total of 2. We subtract 0 from the Hawks’ score and 2 from the Pistons’ score, giving us a new final score of Hawks 58, Pistons 56. Because the best players’ contributions changed the outcome of the match (from a Pistons win to a Hawks win), we award a point to Derrick Rose.
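The three scenarios boil down to one small function. What follows is a minimal sketch of the rule as described, not the project’s actual code. StarLine, mvp_point, and the "a"/"b" labels are names invented here. The assists+rebounds and blocks+steals fields hold the combined totals quoted in the scenarios, and they default to zero for Scenarios 1 and 2 because those tiers never come into play there. The sketch also assumes that a game still tied after all three tiers awards nobody a point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StarLine:
    """A best player's box-score line, grouped by tiebreak tier (hypothetical structure)."""
    points: int
    ast_reb: int = 0  # assists + rebounds, used only if points leave a tie
    blk_stl: int = 0  # blocks + steals, used only if the game is still tied


def mvp_point(score_a, score_b, star_a, star_b):
    """Return 'a' or 'b' for the team whose star earns an MVP Point, or None."""
    if score_a == score_b:
        raise ValueError("NBA games cannot end in a tie")
    original_winner = "a" if score_a > score_b else "b"
    tiers = (
        (star_a.points, star_b.points),
        (star_a.ast_reb, star_b.ast_reb),
        (star_a.blk_stl, star_b.blk_stl),
    )
    for cut_a, cut_b in tiers:
        # Subtractions accumulate, as in Scenario 3 (69-69, then 58-58, then 58-56).
        score_a -= cut_a
        score_b -= cut_b
        if score_a != score_b:
            new_winner = "a" if score_a > score_b else "b"
            # The point goes to the star of the team that originally won but now loses.
            return original_winner if new_winner != original_winner else None
    return None  # Assumption: still tied after every tier means no point


# Team "a" is listed first in each scenario, team "b" second.
print(mvp_point(109, 117, StarLine(3), StarLine(2)))                # None (Scenario 1)
print(mvp_point(112, 113, StarLine(21), StarLine(27)))              # b, LeBron (Scenario 2)
print(mvp_point(99, 101, StarLine(30, 11, 0), StarLine(32, 11, 2)))  # b, Rose (Scenario 3)
```

Note that a point can only ever go to the original winner’s star, which is why a team’s win total caps how many MVP Points its best player can collect.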
We could obviously do this for as many seasons as we want to, but I only went through the 2019-20 season for now (perhaps I’ll go back and simulate the other seasons in a later post). If you want to have a look at the messy, messy, no good, terrible, “one computer science data analysis class”-looking ass code that I used, you can find it below, as well as the NBA database that I utilized for this project. Now that we’re on the same page, here are the results, sorted by points:
Who’da thunk it, James Harden was very important for the Houston Rockets. Harden’s regular season contributions are regularly overshadowed by his playstyle, so it’s not surprising that he is the MVP based on my parameters. Take into account that the Rockets’ playstyle revolves heavily around Harden isos and him generally scoring lots of points, and it makes sense that my evaluation, which values points over anything else, likes Harden a lot. Most surprising to me was Pascal Siakam at 3rd, above 30ppg scorer Damian Lillard and 26ppg scorer Devin Booker (Siakam averaged 22ppg). Siakam probably benefited from the fact that the Raptors had 53 total wins. Having a lot of wins in the first place gives you a huge edge in this analysis, as every win has the potential to be converted into a point, while a lost game can never give the losing team a point. Looking at the bottom of the scoreboard, this is evident: the players with the fewest MVP Points are also on the teams with the fewest wins.
Somewhat surprising is Ja Morant, who only has five MVP Points despite being the 2019-20 Rookie of the Year. Sure, the Grizzlies only won 34 games, but that puts them level with the Suns, whose best player, Devin Booker, managed to register 18 MVP Points, good for fifth overall. The Grizzlies are probably a more evenly scoring team, which would devalue Morant’s contributions in comparison to a Suns team that relies heavily on Booker’s points to win games. Booker is the only player in the top 10 whose team had a record under .500, which puts into context how insane he was individually despite a sub-par team. In contrast, Jayson Tatum belongs to a 48-win Celtics team, but only registered 11 MVP Points. That’s surprising, given that the Celtics offense is pretty dependent on Tatum’s iso (though perhaps they are dependent on isolation plays in general).
Let’s compare my results to the actual 2019-20 MVP voting. Here’s the voting, per Basketball Reference:
We’re mostly in agreement, except they have both LeBron James and Anthony Davis in the running, which is something that my model can’t do, as it only allows one “best player” per team. If I replace LeBron with AD, AD ranks tied for fifth with Devin Booker at 18 MVP Points, with no substantial changes to any other player (Harden still first, Giannis still second). The primary problem with my methodology is that I choose each team’s “best player” ahead of time, meaning that there are only 30 MVP candidates, one for each team. When a team has multiple MVP candidates, as the Lakers do, my code completely ignores that possibility. Maybe in the future, I should allow the code to evaluate every player’s value and return only those with five or more points, though I probably would have to optimize it so that it doesn’t iterate over the benchwarmers. Running my code takes less than 3 seconds, but that could easily balloon if I’m not careful.
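One cheap way to try different candidates is to make the team-to-best-player mapping an argument, so that swapping one star for a teammate is a one-line change. The sketch below is an illustration under loud assumptions, not the project’s code: the game dictionaries, team codes, player names, and numbers are all made up, tally_mvp_points is a hypothetical name, a player missing from a box score counts as zero points, and the assists+rebounds and blocks+steals tiebreaks are left out, so a game that ties after subtraction simply awards nothing.

```python
from collections import Counter


def tally_mvp_points(games, best_by_team, min_points=0):
    """Count games flipped by removing each team's chosen star's points (points tier only)."""
    totals = Counter()
    for game in games:
        (team_a, score_a), (team_b, score_b) = game["scores"].items()
        star_a, star_b = best_by_team[team_a], best_by_team[team_b]
        # A star who did not play is treated as contributing 0 points.
        new_a = score_a - game["points"].get(star_a, 0)
        new_b = score_b - game["points"].get(star_b, 0)
        # A negative product means the margin changed sign, so the result flipped.
        if (score_a - score_b) * (new_a - new_b) < 0:
            totals[star_a if score_a > score_b else star_b] += 1
    return {player: n for player, n in totals.items() if n >= min_points}


# Entirely made-up games between hypothetical teams XXX and YYY.
games = [
    {"scores": {"XXX": 113, "YYY": 112},
     "points": {"X Star": 27, "X Co-Star": 20, "Y Star": 21}},
    {"scores": {"XXX": 105, "YYY": 100},
     "points": {"X Star": 10, "X Co-Star": 30, "Y Star": 20}},
]

print(tally_mvp_points(games, {"XXX": "X Star", "YYY": "Y Star"}))
# {'X Star': 1}
print(tally_mvp_points(games, {"XXX": "X Co-Star", "YYY": "Y Star"}))
# {'X Co-Star': 1}
```

The min_points argument plays the role of the “five or more points” cutoff. Running the tally once per plausible candidate, rather than once per roster spot, is one way to keep the benchwarmers out of the loop.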
Giannis got the lion’s share of the votes in the actual MVP voting, while Harden ranked third behind LeBron. In my model, Harden edges Giannis for first, while LeBron lags behind at 7th. There’s a subtle discrepancy between the MVP voting and my results, but nothing egregious enough for me to think that I fucked up my code somewhere or gave some aspect of the game too much weight. It is eye-opening, however, to see how important points are. All my homies (seemingly) hate defensive metrics. I guess I do integrate some defensive statistics (rebounds, steals, blocks) into my methodology, but they only count in tiebreakers, and I actually have no clue how many matches went to a tiebreak.
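Finding out how many matches reach a tiebreak would only take a counter. Here’s a minimal sketch of the idea, assuming each game can be reduced to two final scores plus each best player’s (points, assists+rebounds, blocks+steals) totals. The deciding_tier function and the tuple layout are inventions for this example, and the sample is just the three scenarios from earlier, with zeros as placeholders in the tiers that were never quoted.

```python
from collections import Counter

TIERS = ("points", "assists+rebounds", "blocks+steals")


def deciding_tier(score_a, score_b, cuts_a, cuts_b):
    """Return the name of the tier at which the adjusted scores stop being tied."""
    for tier, cut_a, cut_b in zip(TIERS, cuts_a, cuts_b):
        score_a -= cut_a
        score_b -= cut_b
        if score_a != score_b:
            return tier
    return "unresolved"  # still level after every tier


# (score_a, score_b, star_a totals, star_b totals); zeros are placeholders.
sample = [
    (109, 117, (3, 0, 0), (2, 0, 0)),       # Scenario 1
    (112, 113, (21, 0, 0), (27, 0, 0)),     # Scenario 2
    (99, 101, (30, 11, 0), (32, 11, 2)),    # Scenario 3
]

usage = Counter(deciding_tier(*game) for game in sample)
print(usage["points"], usage["assists+rebounds"], usage["blocks+steals"])  # 2 0 1
```

Run over a full season instead of three games, the same counter would show how often the defensive stats actually get a say.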
Tl;dr: when you’re looking at which teams would lose the most games they actually won if their best player’s points were absent, James Harden ranks first, just ahead of Giannis Antetokounmpo, with Pascal Siakam a distant third. It aligns pretty well with the actual MVP voting, but my current code doesn’t allow for more than one player per team to be considered in the running. Maybe someday I’ll come back and do this with some other NBA seasons.
Here’s the code I used for the project (includes all the files I used), and here’s a link to the Kaggle where I sourced my data.