Sabermetric Tuesday - Variability in Weekly Head-to-Head Match-Ups
Welcome to Fantasy Week 8's edition of Sabermetric Tuesday. Every week, we aim to introduce some unique ways of looking at statistics to help you make educated decisions about your fantasy team. Hopefully our columns have helped propel you to first place (or at least put you in position to make a nice comeback).
Popular fantasy press often focuses on rotisserie-style leagues and a player's value to overall categories throughout an entire season. I also like to pay attention to, and write about, day-to-day lineup decisions. Should I play this player today? Will so-and-so be in tonight's lineup or not? Is this guy a good pitch-and-ditch candidate for Sunday? These are decisions that are especially valuable in head-to-head leagues.
But what about a player's consistency on a weekly basis? From Monday to Sunday, a lot can happen in a player's performance that can determine your team's success for that week. If you have Alfonso Soriano on your team, you benefited from his stellar performance in Week 7: 16 hits, 7 HR, 14 RBI, and 10 runs scored. On the flip side, maybe you had the slumping Curtis Granderson, who had just 4 hits, no extra-base hits, 2 RBI, and 1 run scored. How can we handle this extreme variability from week to week? Is it possible to spread our players' overall results as evenly as possible over a typical 22-week fantasy season?
Predicting a player's variance from week to week is nearly impossible. Slumps, rainouts, and injuries all contribute to a player's performance day in and day out. However, today's First Pitch will show you the players in 2008 who have been most and least consistent.
A Quick Statistics 101 Lesson
One way to measure a player's streakiness and consistency is to calculate variances and standard deviations for each offensive category and compare them across a sample of players. Variance measures dispersion in a sample: it averages the squared distance of each data point from the mean. For example, if a player hits exactly 10 home runs every week, his variance would be zero, because in any given week he does not vary from his average of 10 home runs per week. To calculate it, take the difference between each week's total and the mean, square those differences, sum them, and divide by the sample size; the result is the average of the squared differences between the data points and the mean. Confusing, right? The official textbook formula can be found here.
Or, =VAR in Excel can do the same thing.
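If a quick script is easier to follow than the formula, here is a minimal sketch of the same arithmetic. The weekly home run totals are made up for illustration and are not from the data set. One wrinkle worth knowing: Excel's =VAR divides by one fewer than the number of weeks (the "sample" variance), while dividing by the full count gives the "population" variance, so the sketch shows both.

```python
from statistics import mean, pvariance, variance

# Hypothetical weekly home run totals for one player (made-up numbers).
weekly_hr = [0, 2, 1, 3, 0, 1, 0]

avg = mean(weekly_hr)

# Squared distance of each week's total from the player's weekly average.
squared_diffs = [(hr - avg) ** 2 for hr in weekly_hr]

# Divide by n: the plain "average of the squared differences".
population_var = sum(squared_diffs) / len(weekly_hr)

# Divide by n - 1: the sample variance, which is what Excel's =VAR returns.
sample_var = sum(squared_diffs) / (len(weekly_hr) - 1)

print(f"weekly mean:         {avg:.2f}")
print(f"population variance: {population_var:.2f}")
print(f"sample variance:     {sample_var:.2f}")

# A player who hits exactly 10 home runs every week has zero variance.
perfectly_consistent = [10] * 7
print(f"consistent player:   {pvariance(perfectly_consistent):.2f}")
```

With only a handful of weeks in the books, the two versions differ noticeably; the gap shrinks as the season adds more weeks.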
Thankfully, we have the standard deviation to take care of the confusion for us. Variance is tabulated in units squared. Standard deviation is the square root of that number and measures the spread of data around the mean (or average). It is always measured in the same units as the data. So if a player has a weekly HR standard deviation of 1.0, it can be read that, on average, a player strays from his home run mean on a weekly basis by 1 HR. Make sense? Maybe some data will make it clearer. Nobody said this was going to be an easy read. Again, as a textbook reference, the standard deviation formula (also =STDEV in Excel) can be found here.
The Methodology
1) The sample is the top-200 players sorted by ABs. In theory, this player universe should capture players who have been everyday players and who have been fixtures in lineups since the start of the season. Part-time players and DL-stints would cause too much noise in the sample and cause an inaccurate portrayal.
2) Means, variances, and standard deviations were calculated for the following offensive categories: Runs, Hits, Home Runs, and RBI.
3) The data sample range is from the start of the 2008 season through Sunday, May 18.
4) The cumulative YTD and Weekly Mean statistics can be found here if you're looking for the full data of the sample.
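For anyone who wants to try this at home, the steps above could be sketched roughly like this. It is only an illustration: the player names and weekly totals are invented, the top-200-by-AB filter is left out, and I am assuming the sample formulas (the ones =VAR and =STDEV use).

```python
from statistics import mean, stdev, variance

# Hypothetical weekly runs scored (Monday-Sunday totals); not the article's data.
weekly_runs = {
    "Player A": [2, 9, 1, 8, 0, 7, 3],
    "Player B": [4, 5, 4, 4, 5, 4, 5],
    "Player C": [1, 3, 6, 2, 4, 3, 2],
}


def summarize(weekly_totals):
    """Return (mean, sample variance, sample standard deviation)."""
    return mean(weekly_totals), variance(weekly_totals), stdev(weekly_totals)


summary = {player: summarize(totals) for player, totals in weekly_runs.items()}

# Rank players from most to least variable by weekly standard deviation.
ranked = sorted(summary, key=lambda player: summary[player][2], reverse=True)

top_n = 10  # the column lists ten players per side; this toy sample has three
most_variable = ranked[:top_n]
least_variable = ranked[::-1][:top_n]

for player in ranked:
    avg, var, sd = summary[player]
    print(f"{player}: mean={avg:.1f}, variance={var:.1f}, std dev={sd:.1f}")
```

Repeating the same loop for hits, home runs, and RBI would give one ranking per category.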
The Data
Let's first take a look at the Top-10 Most & Least Variable in Runs scored:
| Player | Most Run Variability |
| --- | --- |
| Rafael Furcal | 3.8 |
| Conor Jackson | 3.7 |
| Alfonso Soriano | 3.5 |
| Magglio Ordonez | 3.1 |
| Johnny Damon | 2.9 |
| AJ Pierzynski | 2.9 |
| Jayson Werth | 2.7 |
| Derek Jeter | 2.6 |
| Ryan Theriot | 2.6 |
| Gerald Laird | 2.6 |

| Player | Least Run Variability |
| --- | --- |
| Adam Jones | 0.4 |
| Hideki Matsui | 0.5 |
| Mike Lamb | 0.6 |
| Geoff Jenkins | 0.7 |
| Pedro Feliz | 0.8 |
| James Loney | 0.8 |
| Melvin Mora | 0.8 |
| Stephen Drew | 0.8 |
| Justin Morneau | 0.8 |
| Casey Kotchman | 0.8 |
In other words, Rafael Furcal has varied the most from his average production on a week-to-week basis for number of runs scored from Monday to Sunday. The second half of the table indicates that Adam Jones varies the least from his overall average runs scored on a week-to-week basis. Now that you know how to read the charts, I will list the remaining offensive categories. Remember, you can find the full YTD and weekly average data at this link if you would like to compare these standard deviations on a relative basis to see if these players could be viable fantasy candidates for your lineup.
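One way to make that relative comparison is to divide the standard deviation by the weekly mean, a ratio known as the coefficient of variation. To be clear, this is my own suggestion and not something computed in the tables, and the weekly RBI totals in the sketch are made up.

```python
from statistics import mean, stdev


def coefficient_of_variation(weekly_totals):
    """Sample standard deviation divided by the mean (None if the mean is 0)."""
    avg = mean(weekly_totals)
    if avg == 0:
        return None  # the ratio is undefined for a player with no production
    return stdev(weekly_totals) / avg


# Hypothetical weekly RBI totals; neither line belongs to a real player.
low_producer = [1, 0, 2, 0, 1, 0, 3]   # averages 1 RBI per week
high_producer = [5, 7, 4, 6, 5, 8, 7]  # averages 6 RBI per week

for label, totals in [("low producer", low_producer),
                      ("high producer", high_producer)]:
    sd = stdev(totals)
    cv = coefficient_of_variation(totals)
    print(f"{label}: mean={mean(totals):.1f}, std dev={sd:.2f}, relative={cv:.2f}")
```

In this toy example the low producer has the smaller standard deviation, yet he swings far more relative to what he typically gives you, which is the kind of distinction the raw standard deviation alone can hide.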
Top-10 Most & Least Variable in Hits:
| Player | Most Hits Variability |
| --- | --- |
| Alfonso Soriano * | 5.8 |
| Rafael Furcal | 4.5 |
| Chone Figgins | 4.3 |
| Derek Jeter | 4.2 |
| Kevin Youkilis | 4.0 |
| Fred Lewis | 3.9 |
| Jermaine Dye | 3.9 |
| Garret Anderson | 3.8 |
| Lance Berkman * | 3.8 |
| Edgar Renteria | 3.8 |

| Player | Least Hits Variability |
| --- | --- |
| Jorge Cantu | 0.8 |
| Nick Swisher | 1.0 |
| Mark Teahen | 1.1 |
| Nick Markakis | 1.1 |
| Chris B. Young | 1.1 |
| Geoff Jenkins | 1.1 |
| Orlando Cabrera | 1.1 |
| Grady Sizemore | 1.2 |
| Eric Hinske | 1.3 |
| Yadier Molina | 1.3 |
* I think it's safe to say that while these two players are high on the variability list, their out-of-this-world performances in a single week largely drive the deviation from their means.
Top-10 Most & Least Variable in Home Runs:

| Player | Most Home Run Variability |
| --- | --- |
| Alfonso Soriano | 2.5 |
| Ryan Braun | 2.0 |
| Chase Utley | 1.8 |
| Ryan Ludwick | 1.7 |
| Kevin Youkilis | 1.7 |
| Jayson Werth | 1.6 |
| Dan Uggla | 1.6 |
| Lance Berkman | 1.5 |
| Brandon Phillips | 1.5 |
| Raul Ibanez | 1.4 |

| Player | Least Home Run Variability |
| --- | --- |
| Adrian Beltre | 0.4 |
| Melvin Mora | 0.5 |
| Miguel Tejada | 0.5 |
| Bobby Abreu | 0.5 |
| Rafael Furcal | 0.5 |
| Eric Hinske | 0.6 |
| Brian McCann | 0.6 |
| Nick Markakis | 0.7 |
| Geovany Soto | 0.7 |
| Edwin Encarnacion | 0.7 |

As a side note, a player needed at least 5 HR to make the list. On the least-variable side, a large group was tied at 0.7, so I included the three players with the highest home run totals from the 0.7 crowd. This figure should get more statistically interesting as the season progresses. I will surely update the list after the All-Star break.
Top-10 Most & Least Variable in RBI:

| Player | Most RBI Variability |
| --- | --- |
| Alfonso Soriano | 4.9 |
| Jayson Werth | 4.4 |
| Jose Guillen | 4.1 |
| Miguel Cabrera | 3.9 |
| Matt Kemp | 3.9 |
| Lance Berkman | 3.6 |
| Ryan Braun | 3.5 |
| David Ortiz | 3.5 |
| Matt Holliday | 3.4 |
| Magglio Ordonez | 3.4 |

| Player | Least RBI Variability |
| --- | --- |
| Geoff Jenkins | 0.6 |
| Michael Bourn | 0.6 |
| Juan Uribe | 0.7 |
| Delmon Young | 0.8 |
| JJ Hardy | 0.8 |
| Mark Grudzielanek | 0.8 |
| Ichiro Suzuki | 0.8 |
| Chone Figgins | 0.8 |
| Andruw Jones | 0.8 |
| Mike Jacobs | 0.8 |

There are a lot of first-round draft picks at the top of this variability list. Most of them had slow starts and just recently broke out with huge weeks, leading to high variances in RBI production. It will be interesting to see how this trend smooths out over the season, but this is a great starting point. Have a great Tuesday! -- Joe
Rod Steves
May 19, 08 at 09:34 PM
Thanks for the variance stats, they were very helpful.
Jay
May 19, 08 at 09:34 PM
Nice stuff, and you really went beyond the call of duty by posting a link to the raw data. That's really appreciated!
Jason Welty
May 19, 08 at 09:34 PM
Excellent Post. It's been a long time since I looked at standard deviations. My grad school stats teacher would be proud.
I think it would be helpful if you grouped the variance stats by total production too. I don't think it is useful to know that G. Jenkins is statistically least variable for RBI, when he only has 7 on the year. Jenkins makes a couple different lists, which tells us he is consistently bad!
I know the point of the post wasn't to demonstrate who has better statistics, but rather who is most stable. However, I think it makes your post better if you compare similar outputs and then compare the stability.