Welcome to the last pre-All-Star break Sabermetric Tuesday of 2008. Today's topic was sparked by a friendly baseball debate during a 4th of July BBQ this past weekend. Living in Chicago, I watched Kerry Wood's blown save against the Cardinals with about a dozen Cubs fans. The situation prompted one guy to declare the seemingly obvious theory that closers for good teams get more save opportunities. He said something like, "well, Kerry Wood has gotten lots of save opportunities this year because he plays for a winning team." I started a quick rebuttal by citing popular fantasy closers such as Joakim Soria, Brian Wilson, and George Sherrill, but wasn't confident that there were enough names to counteract the closers who play for teams with high winning percentages, like Mariano Rivera, Jonathan Papelbon, and Brad Lidge.
Plenty of analysis and numerous articles have been written on the value of fantasy closers. I decided to take on the task myself of proving, once and for all, that there is no significant correlation between the number of save opportunities a closer is handed each year and his team's winning percentage. This would help "put to bed" the draft-day strategy (or mid-season trade theory) that high-value closers need to play for winning teams to earn a large volume of save opportunities.
Setting Up the Analysis
Let's take a brief look at the correlation statistic and how it works. Open up any statistics textbook and you will find a simple definition that says correlation is the strength of the relationship between two variables. In this case, variable #1 is save opportunities and variable #2 is the corresponding team's winning percentage. The actual formula to calculate correlation is much more complex than the definition:
$$r = \frac{N\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left(N\sum x_i^2 - \left(\sum x_i\right)^2\right)\left(N\sum y_i^2 - \left(\sum y_i\right)^2\right)}}$$
where N = number of entries, x is variable 1, and y is variable 2.
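For readers who would rather see the formula above as code than as a spreadsheet function, here is a minimal Python sketch of the same calculation. The function name `pearson_r` and the five sample rows are made up for illustration; they are not taken from the data set used in this article.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation r computed term by term from the formula above."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 entries")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator

# Made-up illustration rows (NOT the article's data):
save_opps = [30, 35, 40, 45, 50]
win_pct = [0.450, 0.520, 0.480, 0.560, 0.540]
r = pearson_r(save_opps, win_pct)
print(round(r, 3))
```

Swapping in two real columns of equal length should give the same number a spreadsheet correlation function returns, up to rounding.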
Looks complicated, right? It isn't nearly as difficult as it looks, and luckily we have the =CORREL function in Excel to calculate it for us. After calculating the correlation, how do we interpret "r"? The following chart is a general rule of thumb for judging the strength of a correlation from r:
Correlation | Negative | Positive
Small | -0.3 to -0.1 | 0.1 to 0.3
Medium | -0.5 to -0.3 | 0.3 to 0.5
Large | -1.0 to -0.5 | 0.5 to 1.0
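The rule of thumb in the chart can be written as a small lookup. This is only a sketch: the function name `correlation_strength` is hypothetical, the "negligible" label for values inside plus or minus 0.1 is my own addition (the chart does not name that range), and a value sitting exactly on a shared boundary such as 0.3 is assigned to the stronger band by assumption.

```python
def correlation_strength(r):
    """Label r using the rule-of-thumb bands from the chart above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)
    if size < 0.1:
        return "negligible"  # assumed label; the chart does not cover this range
    if size < 0.3:
        strength = "small"
    elif size < 0.5:
        strength = "medium"  # boundary value 0.3 lands here by assumption
    else:
        strength = "large"   # boundary value 0.5 lands here by assumption
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

for value in (0.2, -0.4, 0.75, 0.05):
    print(value, "->", correlation_strength(value))
```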
So, if my hypothesis above holds, we should see no more than a small positive correlation between the two defined variables. Let's run the numbers and see what we get.
The Sample
There are 30 MLB teams and, in theory, one pitcher assumes the closer duties for each club. Using publicly available information, I built the sample by taking the top 30 pitchers in save opportunities for each season from 2002 through 2007 (omitting two because of mid-season trades between teams), which resulted in a sample size of N = 178. Each player entry was matched with his team's winning percentage for that season. I also included some performance statistics for reference purposes and provided the master data.
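The selection rule described above can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the function name `build_sample`, the field names (`season`, `pitcher`, `teams`, `save_opps`), and the placeholder rows are not the layout of the actual master data.

```python
from collections import defaultdict

def build_sample(rows, per_season=30):
    """Keep the top `per_season` pitchers by save opportunities in each season,
    then drop anyone who pitched for more than one team that season."""
    by_season = defaultdict(list)
    for row in rows:
        by_season[row["season"]].append(row)

    sample = []
    for season in sorted(by_season):
        ranked = sorted(by_season[season], key=lambda r: r["save_opps"], reverse=True)
        for row in ranked[:per_season]:
            if len(row["teams"]) == 1:  # skip pitchers traded mid-season
                sample.append(row)
    return sample

# Placeholder rows (hypothetical pitchers and teams, not real data):
demo_rows = [
    {"season": 2002, "pitcher": "Pitcher A", "teams": ["AAA"], "save_opps": 45},
    {"season": 2002, "pitcher": "Pitcher B", "teams": ["BBB", "CCC"], "save_opps": 40},
    {"season": 2002, "pitcher": "Pitcher C", "teams": ["DDD"], "save_opps": 20},
    {"season": 2003, "pitcher": "Pitcher A", "teams": ["AAA"], "save_opps": 38},
    {"season": 2003, "pitcher": "Pitcher C", "teams": ["DDD"], "save_opps": 41},
]
demo_sample = build_sample(demo_rows, per_season=2)
print([row["pitcher"] for row in demo_sample])

# Size check for the article's sample: 30 pitchers x 6 seasons, minus 2 omissions.
expected_n = 30 * 6 - 2
print(expected_n)
```

The size check at the end matches the N = 178 quoted above.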
And The Number Is....
0.327
In other words, there is a small-to-medium positive correlation between save opportunities and team winning percentage. The result is small enough that the hypothesis can't be completely rejected, and further analysis would be needed to truly rule out any sort of significant correlation between the two variables. So, I think it's safe to say that a team's overall ability to win games is not necessarily an accurate predictor of save opportunities. Below is a graphical representation of the data, also showing a slight positive trend rather than a strong one.
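To put the 0.327 figure in perspective, two quick follow-up numbers can be computed from r and N alone. This is a rough sketch rather than a full analysis: r squared estimates the share of variation in save opportunities that lines up with winning percentage, and the t-statistic is the standard textbook check of whether a correlation differs from zero. The t-test assumes the 178 entries are independent, which is debatable here because the same closers and teams appear in several seasons.

```python
from math import sqrt

r = 0.327  # correlation reported in this article
n = 178    # sample size reported in this article

r_squared = r ** 2                         # share of variance explained
t_stat = r * sqrt((n - 2) / (1 - r ** 2))  # t-statistic for "true correlation is 0"

print(f"r^2 = {r_squared:.3f}")
print(f"t   = {t_stat:.2f} with {n - 2} degrees of freedom")
```

An r squared of roughly 0.11 means winning percentage accounts for only about a tenth of the variation in save opportunities. A t-statistic well above 2 suggests the positive relationship is probably not pure chance, but that is a different question from whether it is strong enough to draft on.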
So why is the number small? One theory is that a team's propensity to play in close games (and not outscore the opposing team by more than 3 runs in a large number of games) would be more important than overall winning percentage. After all, if a large number of a team's wins are in non-save situations, then the closer will not see a save opportunity.
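The close-games theory can be illustrated with a toy example. The sketch below uses the 3-run threshold mentioned above as a stand-in for a save situation; that is a simplification, since the final margin is not always the lead the closer inherited and the official save rule has other ways to qualify. The function name and both ten-game stretches are made up.

```python
def wins_and_close_wins(run_margins, max_lead=3):
    """Return (wins, wins decided by `max_lead` runs or fewer).

    `run_margins` holds runs scored minus runs allowed for each game,
    so a positive number is a win.
    """
    wins = [m for m in run_margins if m > 0]
    close_wins = [m for m in wins if m <= max_lead]
    return len(wins), len(close_wins)

# Two made-up ten-game stretches with the same 7-3 record:
blowout_team = [6, 5, 8, 4, 7, 2, 5, -1, -3, -2]
grinder_team = [1, 2, 1, 3, 2, 1, 4, -1, -3, -2]

print(wins_and_close_wins(blowout_team))
print(wins_and_close_wins(grinder_team))
```

Both hypothetical teams have the same winning percentage, yet one produces a single close win and the other six, which is the kind of gap that would weaken the link between winning percentage and save opportunities.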
There are a few other effects that came to mind as well: does a manager have enough confidence in his closer to pitch him in 3 consecutive games? How effective is the middle-relief bridge between the starters and the closer? Does a team's deep bench affect runs scored late in games and, if so, does that produce more save opportunities or more blowouts that eliminate the save potential?
If nothing else, this rudimentary look into save opportunities hopefully provides a little insight into trade and draft strategies when it comes to closers. Have a great Tuesday. -- Joe