Howdy folks! We're just past the 25% mark in baseball's regular season, which means it's time to start digging a little deeper into the analytics behind the result stats for our fantasy baseball players. If you followed us in the pre-season, you might remember my article on using regression analysis to scrutinize HR/FB rates. The results from that study have been extremely useful this season: it called for unfavorable regression in HR/FB rate for players like Rhys Hoskins, Domingo Santana, and Willson Contreras, and for favorable regression for players like Jed Lowrie, Manny Machado, and Adrian Gonzalez. Sure, there have been misses, such as Tommy Pham and Nick Castellanos, but for the most part, if you followed the advice in that column on draft day, you're probably feeling pretty good in the power department right now.
Since I love numbers and I'm always looking for ways to improve my sabermetric models, I'm excited to announce a few changes to my HR/FB regression model this time around. Right off the bat, I'm now including Launch Angle (LA) and Pull%, two indicators that correlate strongly with a player's ability to hit one out of the park, and I'm swapping out total barrels for Brls/BBE (the percentage of batted ball events that are barrels). Since players are constantly going on and coming off the disabled list or getting called up from the minor leagues, relying on a rate stat like Brls/BBE instead of total barrels keeps the expected HR/FB rates of players with less playing time from being artificially suppressed.
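To see why the rate stat treats part-time players more fairly, here is a minimal sketch with made-up numbers. The function name `barrel_rate` and both stat lines are hypothetical and only illustrate the arithmetic; they are not taken from the model or from any real player.

```python
def barrel_rate(barrels, batted_ball_events):
    """Return barrels per batted ball event (Brls/BBE) as a percentage."""
    if batted_ball_events <= 0:
        raise ValueError("need at least one batted ball event")
    return 100.0 * barrels / batted_ball_events


# Made-up stat lines: one hitter has played all season, the other
# has missed roughly half of it (DL stint or late call-up).
full_time = {"barrels": 10, "bbe": 100}
part_time = {"barrels": 5, "bbe": 50}

# By raw barrel count the part-time hitter looks half as powerful...
print(full_time["barrels"], part_time["barrels"])

# ...but per batted ball event the two hitters are identical.
print(barrel_rate(full_time["barrels"], full_time["bbe"]))
print(barrel_rate(part_time["barrels"], part_time["bbe"]))
```

A model fed the raw counts would expect less power from the second hitter purely because of missed time, which is the suppression the rate stat avoids.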
This time around, my sample includes all players with more than 50 plate appearances on the season, and the regression analysis found a correlation of 0.55 between the five statistics (hard%, z-contact%, Brls/BBE, LA, and Pull%) and HR/FB rate. I should mention that this is slightly lower than the 0.61 correlation I found using hard%, z-contact%, and barrels in the pre-season, but it's still a relatively strong correlation. As mentioned above, I feel the changes I made to the model this time around represent all players in the sample more accurately, despite the slightly lower correlation.
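For readers who want to try something similar at home, here is a rough sketch of a five-variable regression in plain Python. It assumes an ordinary least-squares fit and reads the "correlation" as the correlation between fitted and actual HR/FB rates; the article does not spell out either detail, so treat both as assumptions. The helper names (`fit_ols`, `predict`, `pearson`) and every number in `sample` are made up for illustration.

```python
import math


def fit_ols(X, y):
    """Least-squares fit of y on the columns of X plus an intercept.

    Returns [intercept, b1, ..., bk].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Normal equations (A^T A) beta = A^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def predict(beta, row):
    """Expected HR/FB rate for one player's indicator stats."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up rows: hard%, z-contact%, Brls/BBE, LA, Pull% -> HR/FB%.
sample = [
    ([38.2, 85.1, 7.4, 12.3, 41.0], 13.5),
    ([44.9, 80.3, 13.8, 16.9, 46.2], 22.1),
    ([31.5, 89.7, 3.1, 8.4, 36.8], 6.9),
    ([41.0, 88.0, 10.2, 14.8, 39.5], 15.0),
    ([36.7, 87.2, 5.9, 18.1, 44.3], 11.8),
    ([47.3, 78.8, 15.5, 13.0, 48.9], 25.4),
    ([33.8, 81.9, 4.6, 10.7, 42.7], 9.2),
    ([39.6, 82.5, 8.8, 19.5, 37.4], 17.3),
]
X = [stats for stats, _ in sample]
y = [hr_fb for _, hr_fb in sample]

beta = fit_ols(X, y)
expected = [predict(beta, row) for row in X]
print("correlation of expected vs. actual HR/FB:", pearson(expected, y))
```

With only eight made-up rows and six fitted parameters, the toy fit will hug the data far more tightly than a real sample of every hitter with 50-plus plate appearances, so the printed number only demonstrates the mechanics and says nothing about the 0.55 figure.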
So where are the players with HR/FB rates lagging behind their indicator stats? How about the players out-performing their underlying numbers? Have a look below to see if anyone is on one of your rosters.
Note: Just a reminder, this type of regression model is not designed to be predictive in nature. However, it does represent what a player's HR/FB rate should have been, based on the statistics already logged this season. Assuming no other changes, it's reasonable to use this information to find players who may regress toward where their indicator statistics suggest their HR/FB rate should lie.
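If you want to run the same comparison on your own roster, the idea boils down to subtracting the expected HR/FB rate from the actual one and sorting. The sketch below is a hypothetical illustration: the function name `hr_fb_gaps`, the placeholder player names, and all of the numbers are made up, and the expected rates are assumed to come from whatever model you trust.

```python
def hr_fb_gaps(players):
    """Return (name, actual - expected) pairs, biggest over-performer first."""
    gaps = [(p["name"], round(p["actual"] - p["expected"], 1)) for p in players]
    return sorted(gaps, key=lambda g: g[1], reverse=True)


# Made-up HR/FB rates (in percent) for three placeholder hitters.
roster = [
    {"name": "Player A", "actual": 24.0, "expected": 15.5},
    {"name": "Player B", "actual": 8.0, "expected": 14.5},
    {"name": "Player C", "actual": 12.0, "expected": 12.5},
]

for name, gap in hr_fb_gaps(roster):
    if gap > 0:
        label = "out-performing indicators (possible unfavorable regression)"
    elif gap < 0:
        label = "lagging indicators (possible favorable regression)"
    else:
        label = "in line with indicators"
    print(name, gap, label)
```

A positive gap flags a hitter whose HR/FB rate is running ahead of his underlying numbers, while a negative gap flags one whose rate is lagging behind them.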