Throughout baseball history, club management has tried to evaluate players, and has created statistics such as 'Batting Average (AVG)', 'Runs Batted In (RBI)', 'Earned Run Average (ERA)' and so on. These statistics, however, are limited in how well they measure the "real value" of a player, so more sophisticated mathematical methods were brought in, and a new approach called 'SABERMETRICS' was introduced.
We will first download a dataset describing starting pitchers' statistics from the 2019 MLB season. For your information, these statistics can be obtained from fangraphs.com.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('https://jonghank.github.io/ase1302/files/FanGraphs_Leaderboard_2019.csv')
df
 | Name | Team | W | L | ERA | G | GS | CG | ShO | SV | HLD | BS | IP | TBF | H | R | ER | HR | BB | IBB | HBP | WP | BK | SO | playerid |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Hyun-Jin Ryu | Dodgers | 14 | 5 | 2.32 | 29 | 29 | 1 | 1 | 0 | 0 | 0 | 182.2 | 723 | 160 | 53 | 47 | 17 | 24 | 2 | 4 | 0 | 0 | 163 | 14444 |
1 | Jacob deGrom | Mets | 11 | 8 | 2.43 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 204.0 | 804 | 154 | 59 | 55 | 19 | 44 | 1 | 7 | 2 | 0 | 255 | 10954 |
2 | Gerrit Cole | Astros | 20 | 5 | 2.50 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 212.1 | 817 | 142 | 66 | 59 | 29 | 48 | 0 | 3 | 4 | 3 | 326 | 13125 |
3 | Justin Verlander | Astros | 21 | 6 | 2.58 | 34 | 34 | 2 | 1 | 0 | 0 | 0 | 223.0 | 847 | 137 | 66 | 64 | 36 | 42 | 0 | 6 | 4 | 0 | 300 | 8700 |
4 | Mike Soroka | Braves | 13 | 4 | 2.68 | 29 | 29 | 0 | 0 | 0 | 0 | 0 | 174.2 | 701 | 153 | 56 | 52 | 14 | 41 | 1 | 7 | 3 | 0 | 142 | 18383 |
5 | Jack Flaherty | Cardinals | 11 | 8 | 2.75 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 196.1 | 772 | 135 | 62 | 60 | 25 | 55 | 2 | 7 | 6 | 0 | 231 | 17479 |
6 | Sonny Gray | Reds | 11 | 8 | 2.87 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 175.1 | 708 | 122 | 59 | 56 | 17 | 68 | 1 | 7 | 7 | 1 | 205 | 12768 |
7 | Max Scherzer | Nationals | 11 | 7 | 2.92 | 27 | 27 | 0 | 0 | 0 | 0 | 0 | 172.1 | 693 | 144 | 59 | 56 | 18 | 33 | 2 | 7 | 0 | 0 | 243 | 3137 |
8 | Zack Greinke | - - - | 18 | 5 | 2.93 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 208.2 | 810 | 175 | 73 | 68 | 21 | 30 | 2 | 4 | 2 | 1 | 187 | 1943 |
9 | Clayton Kershaw | Dodgers | 16 | 5 | 3.05 | 28 | 28 | 0 | 0 | 0 | 0 | 0 | 177.1 | 703 | 145 | 63 | 60 | 28 | 41 | 0 | 2 | 7 | 1 | 188 | 2036 |
10 | Charlie Morton | Rays | 16 | 6 | 3.05 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 194.2 | 790 | 154 | 71 | 66 | 15 | 57 | 0 | 12 | 5 | 1 | 240 | 4676 |
11 | Marcus Stroman | - - - | 10 | 13 | 3.22 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 184.1 | 774 | 183 | 77 | 66 | 18 | 58 | 1 | 1 | 7 | 0 | 159 | 13431 |
12 | Patrick Corbin | Nationals | 14 | 7 | 3.25 | 33 | 33 | 1 | 1 | 0 | 0 | 0 | 202.0 | 835 | 169 | 81 | 73 | 24 | 70 | 2 | 3 | 4 | 0 | 238 | 9323 |
13 | Walker Buehler | Dodgers | 14 | 4 | 3.26 | 30 | 30 | 2 | 0 | 0 | 0 | 0 | 182.1 | 737 | 153 | 77 | 66 | 20 | 37 | 0 | 7 | 4 | 0 | 215 | 19374 |
14 | Shane Bieber | Indians | 15 | 8 | 3.26 | 33 | 33 | 3 | 2 | 0 | 0 | 0 | 212.1 | 851 | 184 | 85 | 77 | 31 | 40 | 1 | 5 | 6 | 1 | 257 | 19427 |
15 | Stephen Strasburg | Nationals | 18 | 6 | 3.32 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 209.0 | 841 | 161 | 79 | 77 | 24 | 56 | 4 | 10 | 8 | 0 | 251 | 10131 |
16 | Dakota Hudson | Cardinals | 16 | 7 | 3.36 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 174.0 | 754 | 160 | 80 | 65 | 22 | 85 | 8 | 9 | 5 | 0 | 136 | 19206 |
17 | Luis Castillo | Reds | 15 | 8 | 3.40 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 190.2 | 781 | 139 | 76 | 72 | 22 | 79 | 0 | 7 | 5 | 3 | 226 | 15689 |
18 | Lucas Giolito | White Sox | 14 | 9 | 3.41 | 29 | 29 | 3 | 2 | 0 | 0 | 0 | 176.2 | 705 | 131 | 69 | 67 | 24 | 57 | 1 | 4 | 6 | 0 | 228 | 15474 |
19 | Kyle Hendricks | Cubs | 11 | 10 | 3.46 | 30 | 30 | 1 | 1 | 0 | 0 | 0 | 177.0 | 730 | 168 | 78 | 68 | 19 | 32 | 1 | 9 | 1 | 0 | 150 | 12049 |
20 | Jeff Samardzija | Giants | 11 | 12 | 3.52 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 181.1 | 740 | 152 | 78 | 71 | 28 | 49 | 4 | 6 | 5 | 0 | 140 | 3254 |
21 | Mike Minor | Rangers | 14 | 10 | 3.59 | 32 | 32 | 2 | 1 | 0 | 0 | 0 | 208.1 | 863 | 190 | 86 | 83 | 30 | 68 | 1 | 7 | 2 | 0 | 200 | 10021 |
22 | Lance Lynn | Rangers | 16 | 11 | 3.67 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 208.1 | 875 | 195 | 89 | 85 | 21 | 59 | 0 | 8 | 18 | 0 | 246 | 2520 |
23 | Jose Berrios | Twins | 14 | 8 | 3.68 | 32 | 32 | 1 | 0 | 0 | 0 | 0 | 200.1 | 842 | 194 | 94 | 82 | 26 | 51 | 0 | 9 | 8 | 1 | 195 | 14168 |
24 | Eduardo Rodriguez | Red Sox | 19 | 6 | 3.81 | 34 | 34 | 0 | 0 | 0 | 0 | 0 | 203.1 | 859 | 195 | 88 | 86 | 24 | 75 | 2 | 7 | 3 | 0 | 213 | 13164 |
25 | Julio Teheran | Braves | 10 | 11 | 3.81 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 174.2 | 754 | 148 | 81 | 74 | 22 | 83 | 3 | 14 | 5 | 1 | 162 | 6797 |
26 | Anibal Sanchez | Nationals | 11 | 8 | 3.85 | 30 | 30 | 0 | 0 | 0 | 0 | 0 | 166.0 | 712 | 153 | 77 | 71 | 22 | 58 | 10 | 4 | 1 | 3 | 134 | 3284 |
27 | Aaron Nola | Phillies | 12 | 7 | 3.87 | 34 | 34 | 0 | 0 | 0 | 0 | 0 | 202.1 | 852 | 176 | 91 | 87 | 27 | 80 | 3 | 11 | 3 | 0 | 229 | 16149 |
28 | Sandy Alcantara | Marlins | 6 | 14 | 3.88 | 32 | 32 | 2 | 2 | 0 | 0 | 0 | 197.1 | 838 | 179 | 94 | 85 | 23 | 81 | 5 | 8 | 4 | 1 | 151 | 18684 |
29 | Brett Anderson | Athletics | 13 | 9 | 3.89 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 176.0 | 743 | 181 | 80 | 76 | 20 | 49 | 2 | 4 | 4 | 0 | 90 | 8223 |
30 | Anthony DeSclafani | Reds | 9 | 9 | 3.89 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 166.2 | 696 | 151 | 77 | 72 | 29 | 49 | 5 | 4 | 2 | 1 | 167 | 13050 |
31 | Mike Fiers | Athletics | 15 | 4 | 3.90 | 33 | 33 | 1 | 1 | 0 | 0 | 0 | 184.2 | 754 | 166 | 82 | 80 | 30 | 53 | 0 | 9 | 13 | 1 | 126 | 7754 |
32 | Madison Bumgarner | Giants | 9 | 9 | 3.90 | 34 | 34 | 0 | 0 | 0 | 0 | 0 | 207.2 | 844 | 191 | 99 | 90 | 30 | 43 | 3 | 10 | 3 | 0 | 203 | 5524 |
33 | Zack Wheeler | Mets | 11 | 8 | 3.96 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 195.1 | 828 | 196 | 93 | 86 | 22 | 50 | 4 | 2 | 5 | 0 | 195 | 10310 |
34 | Yu Darvish | Cubs | 6 | 8 | 3.98 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 178.2 | 731 | 140 | 82 | 79 | 33 | 56 | 1 | 11 | 11 | 0 | 229 | 13074 |
35 | Wade Miley | Astros | 14 | 6 | 3.98 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 167.1 | 720 | 164 | 83 | 74 | 23 | 61 | 0 | 5 | 4 | 2 | 140 | 8779 |
36 | Marco Gonzales | Mariners | 16 | 13 | 3.99 | 34 | 34 | 0 | 0 | 0 | 0 | 0 | 203.0 | 866 | 210 | 106 | 90 | 23 | 56 | 1 | 6 | 2 | 1 | 147 | 15467 |
37 | Miles Mikolas | Cardinals | 9 | 14 | 4.16 | 32 | 32 | 1 | 1 | 0 | 0 | 0 | 184.0 | 764 | 193 | 90 | 85 | 27 | 32 | 1 | 12 | 5 | 2 | 144 | 9803 |
38 | Joey Lucchesi | Padres | 10 | 10 | 4.18 | 30 | 30 | 0 | 0 | 0 | 0 | 0 | 163.2 | 686 | 144 | 78 | 76 | 23 | 56 | 0 | 2 | 8 | 0 | 158 | 19320 |
39 | Brad Keller | Royals | 7 | 14 | 4.19 | 28 | 28 | 0 | 0 | 0 | 0 | 0 | 165.1 | 709 | 154 | 80 | 77 | 15 | 70 | 2 | 9 | 9 | 1 | 122 | 15734 |
40 | Adam Wainwright | Cardinals | 14 | 10 | 4.19 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 171.2 | 745 | 181 | 83 | 80 | 22 | 64 | 7 | 8 | 2 | 0 | 153 | 2233 |
41 | Noah Syndergaard | Mets | 10 | 8 | 4.28 | 32 | 32 | 1 | 1 | 0 | 0 | 0 | 197.2 | 825 | 194 | 101 | 94 | 24 | 50 | 2 | 6 | 4 | 0 | 202 | 11762 |
42 | Mike Leake | - - - | 12 | 11 | 4.29 | 32 | 32 | 2 | 1 | 0 | 0 | 0 | 197.0 | 835 | 227 | 114 | 94 | 41 | 27 | 2 | 10 | 2 | 0 | 127 | 10130 |
43 | Robbie Ray | Diamondbacks | 12 | 8 | 4.34 | 33 | 33 | 0 | 0 | 0 | 0 | 0 | 174.1 | 747 | 150 | 91 | 84 | 30 | 84 | 5 | 5 | 7 | 0 | 235 | 11486 |
44 | Tanner Roark | - - - | 10 | 10 | 4.35 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 165.1 | 722 | 180 | 84 | 80 | 28 | 51 | 1 | 13 | 3 | 0 | 158 | 8753 |
45 | Merrill Kelly | Diamondbacks | 13 | 14 | 4.42 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 183.1 | 777 | 184 | 95 | 90 | 29 | 57 | 4 | 2 | 4 | 0 | 158 | 11156 |
46 | Jon Lester | Cubs | 13 | 10 | 4.46 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 171.2 | 764 | 205 | 101 | 85 | 26 | 52 | 0 | 5 | 3 | 0 | 165 | 4930 |
47 | Masahiro Tanaka | Yankees | 11 | 8 | 4.47 | 31 | 31 | 1 | 1 | 0 | 0 | 0 | 179.0 | 744 | 181 | 93 | 89 | 28 | 39 | 0 | 2 | 7 | 0 | 147 | 15764 |
48 | Trevor Bauer | - - - | 11 | 13 | 4.48 | 34 | 34 | 1 | 1 | 0 | 0 | 0 | 213.0 | 911 | 184 | 118 | 106 | 34 | 82 | 0 | 19 | 10 | 0 | 253 | 12703 |
49 | Joe Musgrove | Pirates | 11 | 12 | 4.49 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 168.1 | 712 | 168 | 98 | 84 | 21 | 39 | 1 | 9 | 2 | 0 | 156 | 12970 |
50 | Matthew Boyd | Tigers | 9 | 12 | 4.56 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 185.1 | 788 | 178 | 101 | 94 | 39 | 50 | 1 | 8 | 6 | 4 | 238 | 15440 |
51 | Homer Bailey | - - - | 13 | 9 | 4.57 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 163.1 | 696 | 162 | 84 | 83 | 21 | 53 | 1 | 4 | 4 | 1 | 149 | 8362 |
52 | Ivan Nova | White Sox | 11 | 12 | 4.72 | 34 | 34 | 2 | 0 | 0 | 0 | 0 | 187.0 | 806 | 225 | 107 | 98 | 30 | 47 | 1 | 9 | 7 | 0 | 114 | 1994 |
53 | German Marquez | Rockies | 12 | 5 | 4.76 | 28 | 28 | 1 | 1 | 0 | 0 | 0 | 174.0 | 721 | 174 | 96 | 92 | 29 | 35 | 0 | 5 | 14 | 1 | 175 | 15038 |
54 | Jose Quintana | Cubs | 13 | 9 | 4.80 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 167.0 | 725 | 186 | 98 | 89 | 20 | 43 | 0 | 2 | 11 | 0 | 144 | 11423 |
55 | Jakob Junis | Royals | 9 | 14 | 5.24 | 31 | 31 | 0 | 0 | 0 | 0 | 0 | 175.1 | 771 | 192 | 108 | 102 | 31 | 58 | 1 | 11 | 4 | 0 | 164 | 13619 |
56 | Reynaldo Lopez | White Sox | 10 | 15 | 5.38 | 33 | 33 | 1 | 0 | 0 | 0 | 0 | 184.0 | 809 | 203 | 119 | 110 | 35 | 65 | 0 | 8 | 5 | 2 | 169 | 16400 |
57 | Rick Porcello | Red Sox | 14 | 12 | 5.52 | 32 | 32 | 0 | 0 | 0 | 0 | 0 | 174.1 | 768 | 198 | 114 | 107 | 31 | 45 | 2 | 6 | 5 | 2 | 143 | 2717 |
One of the best-known statistics in sabermetrics is 'Fielding Independent Pitching (FIP)'. In a ball game, earned runs depend not only on pitching skill but also on many other factors, such as the size of the field and the fielding skills of the position players. Hence, sabermetricians pay attention to factors that do not depend on the position players. For example, strikeouts and bases on balls involve only the pitcher and the batter, not the fielders. FIP is therefore built from these fielding-independent factors.
(Problem 1) Calculate the FIP of each pitcher using the above formula, and append it as a new column to your dataframe. Display the results in a plot.
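For reference, the commonly published definition of FIP is given below. Treat it as an assumption: the formula intended for this exercise may differ in detail, for example in whether hit-by-pitches are counted or which constant is used. The symbols are the column names of the dataframe, and $C$ is a league-wide constant, typically a little above 3, chosen so that the league-average FIP matches the league-average ERA.

$$
\mathrm{FIP} = \frac{13\,\mathrm{HR} + 3\,(\mathrm{BB} + \mathrm{HBP}) - 2\,\mathrm{SO}}{\mathrm{IP}} + C
$$

Because $C$ is the same for every pitcher, it does not affect any ranking. Note also that the IP column uses box-score notation: 182.2 means 182 innings plus two outs, that is, $182\tfrac{2}{3}$ innings.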
# your code here
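One possible way to approach Problem 1 is sketched below. It assumes the commonly published FIP definition, (13·HR + 3·(BB + HBP) − 2·SO) / IP plus a league constant, which may not be exactly the formula intended here. The helper names `ip_to_innings` and `fip` are illustrative, and the three sample rows are copied from the table above so that the block runs without downloading anything.

```python
def ip_to_innings(ip):
    """Convert box-score innings (182.2 = 182 innings + 2 outs) to true innings."""
    whole = int(ip)
    outs = round((ip - whole) * 10)  # digit after the point counts outs: 0, 1 or 2
    return whole + outs / 3


def fip(hr, bb, hbp, so, innings, constant=0.0):
    """Assumed FIP definition; `constant` shifts every pitcher by the same amount."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / innings + constant


# Three rows copied from the table above: (Name, IP, HR, BB, HBP, SO)
sample = [
    ('Hyun-Jin Ryu', 182.2, 17, 24, 4, 163),
    ('Jacob deGrom', 204.0, 19, 44, 7, 255),
    ('Gerrit Cole', 212.1, 29, 48, 3, 326),
]

fip_by_name = {
    name: fip(hr, bb, hbp, so, ip_to_innings(ip))
    for name, ip, hr, bb, hbp, so in sample
}

# Lower FIP is better, so sort in ascending order.
for name in sorted(fip_by_name, key=fip_by_name.get):
    print(f'{name:15s} {fip_by_name[name]:+.3f}')
```

On the full dataframe, the innings can be converted with `df['IP'].map(ip_to_innings)`, after which `fip` works directly on whole columns because it uses only ordinary arithmetic. The resulting `FIP` column can then be drawn with, for example, `plt.barh(df['Name'], df['FIP'])`.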
(Problem 2) MLB is divided into two leagues: the National League (NL) and the American League (AL). Running the cell below gives the lists of AL and NL teams.
AL = ['Yankees', 'Rays', 'Red Sox', 'Blue Jays', 'Orioles', 'Twins', 'Indians', 'White Sox', 'Royals', 'Tigers', \
'Astros', 'Athletics', 'Rangers', 'Angels', 'Mariners']
NL = ['Braves', 'Nationals', 'Mets', 'Phillies', 'Marlins', 'Cardinals', 'Brewers', 'Cubs', 'Reds', 'Pirates', 'Dodgers', \
'Diamondbacks', 'Giants', 'Padres', 'Rockies']
For each league, list the five pitchers with the lowest FIPs. You may ignore the players with "- - -" in the "Team" field; these pitchers played for more than one team during the season.
# your code here
AL top 5 pitchers for FIP: ['Gerrit Cole' 'Charlie Morton' 'Lance Lynn' 'Justin Verlander' 'Shane Bieber']
NL top 5 pitchers for FIP: ['Max Scherzer' 'Jacob deGrom' 'Walker Buehler' 'Hyun-Jin Ryu' 'Stephen Strasburg']
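The league split can be sketched as follows. This is a minimal illustration, not a reference solution: it assumes the commonly published FIP definition without the league constant (which does not change rankings), the helper names `true_innings`, `raw_fip` and `top_by_league` are hypothetical, and only seven rows from the table are included, so it ranks the top two per league rather than the top five.

```python
AL = ['Yankees', 'Rays', 'Red Sox', 'Blue Jays', 'Orioles', 'Twins', 'Indians',
      'White Sox', 'Royals', 'Tigers', 'Astros', 'Athletics', 'Rangers',
      'Angels', 'Mariners']
NL = ['Braves', 'Nationals', 'Mets', 'Phillies', 'Marlins', 'Cardinals',
      'Brewers', 'Cubs', 'Reds', 'Pirates', 'Dodgers', 'Diamondbacks',
      'Giants', 'Padres', 'Rockies']


def true_innings(ip):
    """Box-score innings (212.1 = 212 innings + 1 out) to true innings."""
    whole = int(ip)
    return whole + round((ip - whole) * 10) / 3


def raw_fip(p):
    """Assumed FIP definition, league constant omitted."""
    return (13 * p['HR'] + 3 * (p['BB'] + p['HBP']) - 2 * p['SO']) / true_innings(p['IP'])


def top_by_league(pitchers, teams, n=5):
    """Names of the n lowest-FIP pitchers whose team is in `teams`."""
    # A team of '- - -' is in neither list, so those pitchers drop out here.
    eligible = [p for p in pitchers if p['Team'] in teams]
    return [p['Name'] for p in sorted(eligible, key=raw_fip)[:n]]


# Seven rows copied from the table above.
cols = ('Name', 'Team', 'IP', 'HR', 'BB', 'HBP', 'SO')
rows = [
    ('Hyun-Jin Ryu', 'Dodgers', 182.2, 17, 24, 4, 163),
    ('Jacob deGrom', 'Mets', 204.0, 19, 44, 7, 255),
    ('Gerrit Cole', 'Astros', 212.1, 29, 48, 3, 326),
    ('Justin Verlander', 'Astros', 223.0, 36, 42, 6, 300),
    ('Max Scherzer', 'Nationals', 172.1, 18, 33, 7, 243),
    ('Zack Greinke', '- - -', 208.2, 21, 30, 4, 187),
    ('Charlie Morton', 'Rays', 194.2, 15, 57, 12, 240),
]
pitchers = [dict(zip(cols, row)) for row in rows]

print('AL:', top_by_league(pitchers, AL, n=2))
print('NL:', top_by_league(pitchers, NL, n=2))
```

With a `FIP` column already in the dataframe, the same selection can be written as `df[df['Team'].isin(AL)].nsmallest(5, 'FIP')['Name']`, and likewise for `NL`.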
(Problem 3) Every year, the Cy Young Award is given to the most outstanding pitcher in each league of MLB. The award was introduced in 1956 by Commissioner Ford Frick and approved by the Baseball Writers' Association of America. It is named in honor of Hall of Fame pitcher Cy Young, who had died the previous year, in 1955. Based on the given dataset, we would like to predict who the recipients will be. To do so, we can use two predictors: Tom Tango's model and ESPN's model.
Using these two models, predict the Cy Young Award recipient in each league.
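The commonly cited forms of the two point systems are given below. They are stated here as an assumption and may differ from the versions intended for this exercise. The symbols are the column names of the dataframe, IP is in true innings (182.2 in the table means $182\tfrac{2}{3}$), and VB is a "victory bonus" of 12 points for pitchers on division-winning teams, which cannot be determined from this dataset alone.

$$
\text{Tom Tango:}\quad \mathrm{CY} = \frac{\mathrm{IP}}{2} - \mathrm{ER} + \frac{\mathrm{SO}}{10} + \mathrm{W}
$$

$$
\text{ESPN:}\quad \mathrm{CYP} = \left(\frac{5\,\mathrm{IP}}{9} - \mathrm{ER}\right) + \frac{\mathrm{SO}}{12} + 2.5\,\mathrm{SV} + \mathrm{ShO} + \left(6\,\mathrm{W} - 2\,\mathrm{L}\right) + \mathrm{VB}
$$

In both systems the pitcher with the most points in a league is the predicted winner.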
# your code here
By Tom Tango's model, Gerrit Cole in AL and Jacob deGrom in NL will win the Cy Young Award
By ESPN's model, Justin Verlander in AL and Stephen Strasburg in NL will win the Cy Young Award
The list of Cy Young Award winners, including the 2019 season, can be found here: https://www.baseball-reference.com/bullpen/Cy_Young_Award.
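A minimal sketch of the two predictors follows, under the assumption that they take their commonly cited forms (written out in the docstrings). ESPN's victory bonus for division winners is left out because the dataset carries no standings. The function names are hypothetical, and the sample holds only five rows copied from the table, grouped by league by hand.

```python
def true_innings(ip):
    """Box-score innings (212.1 = 212 innings + 1 out) to true innings."""
    whole = int(ip)
    return whole + round((ip - whole) * 10) / 3


def tango_points(p):
    """Assumed Tom Tango form: IP/2 - ER + SO/10 + W."""
    return true_innings(p['IP']) / 2 - p['ER'] + p['SO'] / 10 + p['W']


def espn_points(p):
    """Assumed ESPN form without the victory bonus:
    (5*IP/9 - ER) + SO/12 + 2.5*SV + ShO + (6*W - 2*L)."""
    ip = true_innings(p['IP'])
    return (5 * ip / 9 - p['ER']) + p['SO'] / 12 + 2.5 * p['SV'] \
        + p['ShO'] + (6 * p['W'] - 2 * p['L'])


def predict(pitchers, score):
    """Name of the pitcher with the highest score."""
    return max(pitchers, key=score)['Name']


# Five rows copied from the table above.
cols = ('Name', 'W', 'L', 'ShO', 'SV', 'IP', 'ER', 'SO')
sample = {
    'AL': [
        ('Gerrit Cole', 20, 5, 0, 0, 212.1, 59, 326),
        ('Justin Verlander', 21, 6, 1, 0, 223.0, 64, 300),
    ],
    'NL': [
        ('Hyun-Jin Ryu', 14, 5, 1, 0, 182.2, 47, 163),
        ('Jacob deGrom', 11, 8, 0, 0, 204.0, 55, 255),
        ('Stephen Strasburg', 18, 6, 0, 0, 209.0, 77, 251),
    ],
}
sample = {lg: [dict(zip(cols, r)) for r in rows] for lg, rows in sample.items()}

for league, pitchers in sample.items():
    print(league, 'Tango:', predict(pitchers, tango_points),
          '| ESPN:', predict(pitchers, espn_points))
```

On this small sample the two functions pick the same names as the output printed above. With the scores stored as dataframe columns, the top pitcher of a league subset `sub` can be read off with `sub.loc[sub['Tango'].idxmax(), 'Name']`, where `Tango` is an assumed column name.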
Check if your predictors made reasonable predictions.