Time series data is everywhere: stock prices, weather reports, sensor data, website traffic. Once you are comfortable working with time-indexed data, you can spot trends and seasonality and start building forecasts from them.
Pandas makes it intuitive to handle time-indexed data, resample it, and prepare it for machine learning models.
In this article, let’s explore Pandas’s core time series operations, laying the groundwork for real-world forecasting models.
- `DatetimeIndex` supercharges time-aware operations
- `resample()` vs. `asfreq()`
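To make the two points above concrete, here is a minimal sketch. The series `s` and its values are made up purely for illustration. It shows date-string slicing on a `DatetimeIndex`, then contrasts `resample()`, which aggregates every row that falls into a bin, with `asfreq()`, which only picks the rows that land on the new frequency.

```python
import pandas as pd

# Hypothetical daily series, used only for illustration
idx = pd.date_range("2023-01-01", periods=6, freq="D")
s = pd.Series([10, 20, 30, 40, 50, 60], index=idx)

# A DatetimeIndex supports slicing with date strings (both ends inclusive)
first_three = s.loc["2023-01-01":"2023-01-03"]

# resample() groups rows into 2-day bins and aggregates each bin
binned_mean = s.resample("2D").mean()

# asfreq() keeps only the values found at the new frequency, with no aggregation
picked = s.asfreq("2D")

print(first_three.tolist())  # [10, 20, 30]
print(binned_mean.tolist())  # [15.0, 35.0, 55.0]
print(picked.tolist())       # [10, 30, 50]
```

As a rule of thumb, reach for `resample()` when every observation should contribute to the result, and `asfreq()` when you only want to change the spacing of the index.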
Pandas can automatically parse date columns and even infer frequencies, making it possible to clean and analyze time data with just a few lines of code.
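As a rough sketch of what date parsing and frequency inference can look like, the snippet below reads a small in-memory CSV. The column names `date` and `value` and the sample rows are assumptions for illustration, not data from a real file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file
csv_text = """date,value
2023-01-01,173
2023-01-02,139
2023-01-03,176
2023-01-04,111
"""

# parse_dates converts the column to datetimes; index_col makes it the index
df_csv = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# infer_freq guesses the spacing of a regular DatetimeIndex (needs at least 3 dates)
inferred = pd.infer_freq(df_csv.index)

print(type(df_csv.index).__name__)  # DatetimeIndex
print(inferred)                     # D
```

If the timestamps are irregular, `pd.infer_freq` returns `None`, so it is worth checking the result before relying on it.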
```python
import pandas as pd
import numpy as np

# Create time series data (random values, so the output differs between runs)
dates = pd.date_range(start="2023-01-01", periods=10, freq="D")
data = np.random.randint(100, 200, size=(10,))
df = pd.DataFrame({'value': data}, index=dates)

# Resample to every 3 days and get the mean
resampled = df.resample('3D').mean()

# Calculate a 3-day rolling average (the first two rows are NaN)
df['rolling_avg'] = df['value'].rolling(window=3).mean()

# Shift data for comparison (e.g., yesterday's value)
df['prev_day'] = df['value'].shift(1)

print(df)
print("Resampled:\n", resampled)
```
```
            value  rolling_avg  prev_day
2023-01-01    173          NaN       NaN
2023-01-02    139          NaN     173.0
2023-01-03    176   162.666667     139.0
2023-01-04    111   142.000000     176.0
2023-01-05    172   153.000000     111.0
2023-01-06    164   149.000000     172.0
2023-01-07    179   171.666667     164.0
2023-01-08    149   164.000000     179.0
2023-01-09    109   145.666667     149.0
2023-01-10    107   121.666667     109.0
Resampled:
                  value
2023-01-01  162.666667
2023-01-04  149.000000
2023-01-07  145.666667
2023-01-10  107.000000
```
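Building on the rolling and shift steps above, here is a small sketch with fixed values (the first five from the sample output, so the numbers are reproducible). The name `df_fixed` and the `change` column are illustrative choices. It turns the shifted column into a day-over-day change and uses `min_periods=1` so the rolling average does not start with `NaN`.

```python
import pandas as pd

# Fixed values so the result is the same on every run
dates = pd.date_range(start="2023-01-01", periods=5, freq="D")
df_fixed = pd.DataFrame({"value": [173, 139, 176, 111, 172]}, index=dates)

# Day-over-day change: today's value minus yesterday's (first row has no predecessor)
df_fixed["change"] = df_fixed["value"] - df_fixed["value"].shift(1)

# min_periods=1 averages whatever is available in the window, so row 1 is just its own value
df_fixed["rolling_avg"] = df_fixed["value"].rolling(window=3, min_periods=1).mean()

print(df_fixed)
```

Whether a partial window is acceptable depends on the use case: for model features, the early rows are often dropped instead, since they average fewer points than the rest.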
*Resampling, rolling windows, and shifting are the backbone of forecasting models. Whether you’re predicting sales or analyzing sensor patterns, these time-based manipulations are the first step toward insights!*