Many runners use heart rate (HR) data to plan their training. Heart rate is an individual metric: two athletes running at the same pace can show quite different values. This makes it useful for pacing and for gauging fatigue and fitness level. Experienced runners often analyze their historical workouts, explore how their heart rate varies, and check the elapsed time in each training zone to make sure they are running at the right effort level to get the most out of a workout. In this notebook we show how to explore heart rate (HR) records and the corresponding training zones using data from smartwatches or running apps.
The easiest way to estimate the maximum heart rate (HRmax) is the empirical equation HRmax = 220 - Age. It is drawn from epidemiological studies, so it is not personalized. A better way is to monitor HR while running uphill intervals, or to compute HRmax from your historical data. In this notebook we use the empirical HRmax. The runner is 36 years old, so the maximum heart rate is 184 (HRmax = 220 - 36).
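The age-based estimate can be written as a small helper. This is only a sketch of the formula quoted above; the function name `estimate_hr_max` is hypothetical and not part of runpandas.

```python
def estimate_hr_max(age: int) -> int:
    """Age-based estimate of maximum heart rate: HRmax = 220 - age."""
    if age <= 0:
        raise ValueError("age must be a positive number of years")
    return 220 - age


# The runner in this notebook is 36 years old.
hr_max = estimate_hr_max(36)
print(hr_max)  # 184
```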
Heart rate is grouped into training zones: active recovery (easy running), general aerobic (marathon pace), basic endurance or lactate threshold (tempo run), and anaerobic or VO2 max work (intervals). For a maximum heart rate of 184, and based on Garmin's thresholds, the upper limit of each zone is:
Zone | Upper limit (bpm)
---|---
Z1 (Active recovery) | 110
Z2 (Endurance) | 129
Z3 (Tempo) | 147
Z4 (Threshold) | 166
Z5 (VO2 Max) | 184
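The values in the table can be reproduced from HRmax with a few lines of Python. The percentages below (50% to 100% of HRmax in steps of 10) are an assumption inferred from the table and from the 92 bpm lower edge used in the bins later in this notebook; the names `ZONE_UPPER_PCT` and `zone_upper_limits` are hypothetical.

```python
# Assumed upper limit of each zone, as a percentage of HRmax.
# These percentages are inferred from the table, not taken from runpandas.
ZONE_UPPER_PCT = {"Rest": 50, "Z1": 60, "Z2": 70, "Z3": 80, "Z4": 90, "Z5": 100}


def zone_upper_limits(hr_max: int) -> dict:
    """Return the rounded upper heart rate limit (bpm) of each zone."""
    return {zone: round(hr_max * pct / 100) for zone, pct in ZONE_UPPER_PCT.items()}


limits = zone_upper_limits(184)

# Bin edges in ascending order: the lower edge 0 followed by each upper limit.
bins = [0] + list(limits.values())
print(limits)
print(bins)
```

With HRmax = 184 this yields the bin edges 0, 92, 110, 129, 147, 166 and 184.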
Now that we have the training zones, we can use runpandas to explore our workouts and evaluate their quality by training zone. In this example, I selected one workout to explore these zones and how they relate to the other recorded sensor data.
```python
import warnings
warnings.filterwarnings('ignore')

import runpandas

# Load the workout from a TCX file into a runpandas Activity
activity = runpandas.read_file('./11km.tcx')
print('Start', activity.index[0], 'End:', activity.index[-1])
print('Distance in km:', activity.distance / 1000)
```
```
Start 0 days 00:00:00 End: 0 days 01:16:06
Distance in km: 11.035920898
```
First, let's run a quality check on the data to see whether any values required for the analysis are invalid or missing. As the cell below shows, 5 records have no heart rate data. We will replace them with the first available HR reading.
```python
import numpy as np

missing_hr = activity['hr'].isnull().sum()
print("There are nan records: %d" % missing_hr)

# There are 5 missing HR values. Let's see where they sit in the frame.
print(activity[activity['hr'].isnull()])

# Replace all NaN values with the first available HR reading,
# looked up dynamically instead of hard-coding its position.
first_hr = activity['hr'].dropna().iloc[0]
activity['hr'] = activity['hr'].fillna(first_hr)
print('Total nan after fill:', activity['hr'].isnull().sum())
```
```
There are nan records: 5
          run_cadence         alt       dist  hr        lon       lat  \
time
00:00:00          NaN  668.801819   0.000000 NaN -36.577568 -8.364486
00:00:07          NaN  668.714722   5.749573 NaN -36.577465 -8.364492
00:00:10          NaN  668.680603  11.615299 NaN -36.577423 -8.364470
00:00:12         83.0  668.639099  17.306795 NaN -36.577366 -8.364449
00:00:15         82.0  668.600464  22.672394 NaN -36.577312 -8.364429

             speed
time
00:00:00  0.000000
00:00:07  0.000000
00:00:10  0.000000
00:00:12  2.262762
00:00:15  2.317986
Total nan after fill: 0
```
Next, let's add a column with the heart rate zone label to the data frame. For this we use the method runpandas.compute.heart_zone. It takes a bins argument with the left and right bounds of each training zone and a labels argument with the corresponding zone labels.
```python
activity['heartrate_zone'] = activity.compute.heart_zone(
    labels=["Rest", "Z1", "Z2", "Z3", "Z4", "Z5"],
    bins=[0, 92, 110, 129, 147, 166, 184])
activity["heartrate_zone"].head()
```
```
time
00:00:00    Z1
00:00:07    Z1
00:00:10    Z1
00:00:12    Z1
00:00:15    Z1
Name: heartrate_zone, dtype: category
Categories (6, object): [Rest < Z1 < Z2 < Z3 < Z4 < Z5]
```
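To make the role of bins and labels concrete, here is a minimal standard-library sketch of the binning idea. It is not the runpandas implementation: the helper `label_zone` is hypothetical, and it assumes right-closed intervals, so a reading exactly on an edge (for example 110) falls in the lower zone.

```python
from bisect import bisect_left

LABELS = ["Rest", "Z1", "Z2", "Z3", "Z4", "Z5"]
BINS = [0, 92, 110, 129, 147, 166, 184]


def label_zone(hr, bins=BINS, labels=LABELS):
    """Return the zone label for one HR sample, or None if it is outside the bins.

    Assumes right-closed intervals (lower, upper].
    """
    idx = bisect_left(bins, hr) - 1
    if 0 <= idx < len(labels):
        return labels[idx]
    return None


# Toy heart rate samples in bpm (made up for illustration).
samples = [85, 110, 111, 150, 184, 190]
print([label_zone(hr) for hr in samples])
```

The last sample, 190 bpm, lies above the top edge of 184, so this sketch returns None for it. Readings above the estimated HRmax would need an extra bin or a higher top edge to be labelled.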
To calculate the time in zone, there is another method, runpandas.compute.time_in_zone, which computes the time spent in each training zone.
```python
time_in_zone = activity.compute.time_in_zone(
    labels=["Rest", "Z1", "Z2", "Z3", "Z4", "Z5"],
    bins=[0, 92, 110, 129, 147, 166, 184])
time_in_zone
```
```
hr_zone
Rest   00:00:00
Z1     00:04:10
Z2     00:07:05
Z3     00:31:45
Z4     00:33:06
Z5     00:00:00
Name: time_diff, dtype: timedelta64[ns]
```
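The idea behind a time-in-zone summary can be sketched with the standard library alone. This is an illustration under stated assumptions, not the runpandas implementation: the function name `sum_time_in_zone` is hypothetical, the zone labels in the toy data are made up, and each interval between two consecutive samples is credited to the zone of the later sample.

```python
from collections import defaultdict
from datetime import timedelta


def sum_time_in_zone(samples):
    """Sum the elapsed time per zone.

    samples: list of (elapsed, zone) tuples ordered by time.
    Assumption: the interval between two consecutive samples is
    credited to the zone of the later sample.
    """
    totals = defaultdict(timedelta)
    for (prev_t, _), (curr_t, zone) in zip(samples, samples[1:]):
        totals[zone] += curr_t - prev_t
    return dict(totals)


# Toy data: elapsed time and a made-up zone label per sample.
toy = [
    (timedelta(seconds=0), "Z1"),
    (timedelta(seconds=7), "Z1"),
    (timedelta(seconds=10), "Z1"),
    (timedelta(seconds=12), "Z2"),
    (timedelta(seconds=15), "Z2"),
]
result = sum_time_in_zone(toy)
for zone, spent in result.items():
    print(zone, spent)
```

In the real output above, the per-zone times add up to 01:16:06, the full duration of the activity, which is consistent with summing the time differences between consecutive records.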