ARIMA, Prophet, LSTMs, feature engineering, stationarity testing, cross-validation, and forecasting.
Time series analysis deals with data collected sequentially over time. Key concepts include stationarity, seasonality, trends, autocorrelation, and forecasting models.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# ── Time Series Data Setup ──
df = pd.read_csv('sales.csv', parse_dates=['date'], index_col='date')
df = df.sort_index()
df = df.asfreq('D') # Set frequency to daily (index must be sorted first)
# ── Key Components ──
# 1. Trend: long-term direction (upward/downward)
# 2. Seasonality: repeating patterns (daily, weekly, yearly)
# 3. Cyclical: non-fixed period fluctuations (business cycles)
# 4. Residual/Noise: random, unpredictable variation
# ── Stationarity Check ──
from statsmodels.tsa.stattools import adfuller
def check_stationarity(series):
    result = adfuller(series.dropna())
    print(f"ADF Statistic: {result[0]:.4f}")
    print(f"p-value: {result[1]:.4f}")
    print("Stationary" if result[1] < 0.05 else "Non-stationary")
# ── Making Series Stationary ──
df['diff'] = df['sales'].diff() # First differencing
df['log'] = np.log(df['sales']) # Log transform
df['log_diff'] = df['log'].diff() # Log + diff
# ── Rolling Statistics ──
df['rolling_mean'] = df['sales'].rolling(window=7).mean()
df['rolling_std'] = df['sales'].rolling(window=7).std()
df['ewma'] = df['sales'].ewm(span=7).mean() # Exponential weighted
# ── Autocorrelation ──
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# ACF: correlation with lagged values (including indirect)
# PACF: direct correlation with lagged values (removing indirect)
# Use ACF for MA order (q), PACF for AR order (p)

| Component | Pattern | Period | Detection |
|---|---|---|---|
| Trend | Long-term increase/decrease | No fixed period | Visual inspection, rolling mean |
| Seasonality | Repeating pattern | Fixed (daily, weekly, yearly) | ACF peaks at seasonal lags, seasonal decomposition |
| Cyclic | Wave-like, not fixed period | 2-10 years typical | Visual inspection, longer rolling windows |
| Noise/Residual | Random fluctuation | None | What remains after removing other components |
ARIMA (AutoRegressive Integrated Moving Average) is the most widely used classical time series model. It combines autoregression, differencing, and moving average components.
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import acf, pacf
# ── ARIMA(p, d, q) Parameters ──
# p = AR order (use PACF to determine)
# d = differencing order (make stationary, usually 0 or 1)
# q = MA order (use ACF to determine)
# ── Determine Orders ──
# 1. Check stationarity (d = 0 if stationary, 1 if not)
# 2. Look at PACF plot for AR(p): where PACF cuts off
# 3. Look at ACF plot for MA(q): where ACF cuts off
# ── Fit ARIMA ──
model = ARIMA(df['sales'], order=(2, 1, 2))
result = model.fit()
print(result.summary())
# ── Forecast ──
forecast = result.forecast(steps=30)
# forecast with confidence intervals
forecast_ci = result.get_forecast(steps=30)
ci = forecast_ci.conf_int(alpha=0.05)
# ── Auto ARIMA (auto-select best p,d,q) ──
import pmdarima as pm
auto_model = pm.auto_arima(
    df['sales'],
    seasonal=True,               # SARIMA if seasonal
    m=7,                         # seasonal period (7 for weekly)
    d=None,                      # auto-detect differencing
    D=None,                      # auto-detect seasonal differencing
    trace=True,                  # print progress
    stepwise=True,               # faster search
    information_criterion='aic',
)
print(f"Best order: {auto_model.order}")
print(f"Best seasonal order: {auto_model.seasonal_order}")
# ── SARIMA(p,d,q)(P,D,Q,s) for seasonal data ──
sarima = ARIMA(df['sales'], order=(1, 1, 1),
               seasonal_order=(1, 1, 1, 12))  # s=12 for monthly
result = sarima.fit()
forecast = result.forecast(steps=12)

| Condition | Model | Parameters |
|---|---|---|
| ACF tails off, PACF cuts off at lag p | AR(p) | Set p from PACF cutoff, q=0 |
| PACF tails off, ACF cuts off at lag q | MA(q) | Set q from ACF cutoff, p=0 |
| Both tail off | ARMA(p,q) | Both p and q non-zero, try small values (1,1) |
| Seasonal pattern | SARIMA | Add (P,D,Q,s) with s = seasonal period |
| Trend present | ARIMA with d=1 or 2 | Difference to remove trend |
Prophet is a decomposable forecasting model (trend + seasonality + holiday effects) designed for business time series with strong, and possibly multiple, seasonalities.
from prophet import Prophet
import pandas as pd
# ── Prophet requires columns: ds (datetime), y (value) ──
df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=365, freq='D'),
    'y': sales_data,  # your observations, array-like of length 365
})
# ── Basic Prophet ──
model = Prophet(
    growth='linear',              # 'linear' or 'logistic'
    changepoints=None,            # Auto-detect change points
    n_changepoints=25,            # Number of potential changepoints
    changepoint_range=0.8,        # Proportion of history for changepoints
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False,
    seasonality_mode='additive',  # 'additive' or 'multiplicative'
)
model.fit(df)
# ── Forecast ──
future = model.make_future_dataframe(periods=90)
forecast = model.predict(future)
# ── Plot ──
fig = model.plot(forecast)
fig2 = model.plot_components(forecast)
# ── Holidays and Special Events ──
holidays = pd.DataFrame({
    'holiday': ['diwali', 'christmas', 'eid'],
    'ds': pd.to_datetime(['2024-11-01', '2024-12-25', '2024-04-10']),
    'lower_window': -1,  # Extend effect to 1 day before each holiday
    'upper_window': 1,   # Extend effect to 1 day after each holiday
})
model = Prophet(holidays=holidays)
model.fit(df)
# ── Custom Seasonality ──
model = Prophet()
model.add_seasonality(name='monthly', period=30.5, fourier_order=5)
model.add_seasonality(name='quarterly', period=91.25, fourier_order=10)
model.fit(df)
# ── Cross Validation ──
from prophet.diagnostics import cross_validation, performance_metrics
df_cv = cross_validation(model, initial='365 days',
                         period='30 days', horizon='90 days')
df_perf = performance_metrics(df_cv)
print(df_perf[['mape', 'rmse']].mean())

| Feature | Prophet | ARIMA |
|---|---|---|
| Ease of Use | Very easy, automatic tuning | Requires expertise to select orders |
| Multiple Seasonalities | Built-in (daily, weekly, yearly, custom) | Limited (SARIMA supports only one seasonal period) |
| Missing Data | Handles automatically | Requires complete data |
| Holidays | Built-in holiday effects | Manual feature engineering |
| Changepoints | Automatic detection | Not directly supported |
| Scalability | Thousands of series | One series at a time |
| Interpretability | Component plots (trend, seasonality) | ACF/PACF analysis |
| Accuracy | Good for business time series | Better for short-term, small data |
LSTMs (Long Short-Term Memory networks) capture long-term dependencies in sequential data, making them well-suited for complex time series with non-linear patterns.
import torch
import torch.nn as nn
import numpy as np
from sklearn.preprocessing import MinMaxScaler
# ── Data Preparation ──
scaler = MinMaxScaler()
# In practice, fit the scaler on the training split only to avoid data leakage
scaled_data = scaler.fit_transform(df['sales'].values.reshape(-1, 1))
def create_sequences(data, seq_length=30):
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)
X, y = create_sequences(scaled_data, seq_length=30)
X = torch.FloatTensor(X) # shape: (N, 30, 1)
y = torch.FloatTensor(y)
# ── LSTM Model ──
class TimeSeriesLSTM(nn.Module):
    def __init__(self, input_dim=1, hidden_dim=64, num_layers=2, output_dim=1):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_dim,
            hidden_size=hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            dropout=0.2,
        )
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim, 32),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(32, output_dim),
        )

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        # Use the output at the last time step
        return self.fc(lstm_out[:, -1, :])
model = TimeSeriesLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
# ── Training ──
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    output = model(X)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()
# ── Forecasting (recursive) ──
model.eval()
last_sequence = torch.FloatTensor(scaled_data[-30:].reshape(1, 30, 1))
forecast = []
with torch.no_grad():
    for _ in range(30):
        pred = model(last_sequence)
        forecast.append(pred.item())
        last_sequence = torch.cat([last_sequence[:, 1:, :], pred.view(1, 1, 1)], dim=1)
# Inverse transform
forecast = scaler.inverse_transform(np.array(forecast).reshape(-1, 1))

| Test | Null Hypothesis | p < 0.05 Means | When to Use |
|---|---|---|---|
| ADF (Augmented Dickey-Fuller) | Series has unit root (non-stationary) | Stationary | Most common, first check |
| KPSS | Series is trend-stationary | Non-stationary | Complement to ADF |
| PP (Phillips-Perron) | Series has unit root | Stationary | Robust to serial correlation |
| Transformation | Handles | How It Works |
|---|---|---|
| Differencing (d=1) | Linear trend | y_t - y_{t-1} |
| Second differencing (d=2) | Quadratic trend | diff of diff |
| Log transform | Exponential growth | log(y_t) |
| Box-Cox | Non-constant variance | Optimal power transform |
| Seasonal differencing | Seasonality | y_t - y_{t-s} |
| Detrending | Linear/quadratic trend | Subtract fitted trend line |
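The Box-Cox row above refers to fitting the power parameter automatically; a minimal sketch with scipy on an illustrative exponential-growth series (Box-Cox requires strictly positive data):

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(7)
# Exponential growth with multiplicative noise (all values > 0)
y = np.exp(0.01 * np.arange(300)) * rng.lognormal(mean=0, sigma=0.1, size=300)

transformed, lam = boxcox(y)             # lam is the fitted power parameter
# lam near 0 indicates the chosen transform is approximately log(y)
restored = inv_boxcox(transformed, lam)  # invert when back-transforming forecasts
```

Remember to apply `inv_boxcox` with the same `lam` to any forecasts made on the transformed scale.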
# ── Time-Based Features ──
df['year'] = df.index.year
df['month'] = df.index.month
df['day_of_week'] = df.index.dayofweek
df['hour'] = df.index.hour
df['is_weekend'] = df.index.dayofweek >= 5
df['quarter'] = df.index.quarter
df['week_of_year'] = df.index.isocalendar().week
# ── Lag Features ──
for lag in [1, 7, 14, 30, 365]:
df[f'lag_{lag}'] = df['sales'].shift(lag)
# ── Rolling Window Features ──
for window in [7, 14, 30]:
df[f'rolling_mean_{window}'] = df['sales'].rolling(window).mean()
df[f'rolling_std_{window}'] = df['sales'].rolling(window).std()
df[f'rolling_min_{window}'] = df['sales'].rolling(window).min()
df[f'rolling_max_{window}'] = df['sales'].rolling(window).max()
# ── Expanding Window Features ──
df['expanding_mean'] = df['sales'].expanding().mean()
df['expanding_std'] = df['sales'].expanding().std()
# ── Difference Features ──
df['diff_1'] = df['sales'].diff(1)
df['diff_7'] = df['sales'].diff(7) # Week-over-week change
# ── Percent Change ──
df['pct_change_1'] = df['sales'].pct_change(1)
df['pct_change_7'] = df['sales'].pct_change(7)

| Metric | Formula | Interpretation | Best For |
|---|---|---|---|
| MAE | Mean(\|actual - predicted\|) | Average absolute error (same units as data) | Interpretable, all errors weighted equally |
| RMSE | Sqrt(Mean((actual - predicted)^2)) | Root mean squared error (penalizes large errors) | When large errors are especially bad |
| MAPE | Mean(\|actual - predicted\| / \|actual\|) * 100 | Percentage error (unitless) | Comparing across different scales |
| SMAPE | Mean(2·\|A - P\| / (\|A\| + \|P\|)) * 100 | Symmetric MAPE (bounded 0-200%) | When actual values can be near zero |
| R-squared | 1 - SS_res / SS_tot | Variance explained (0-1, higher better) | Overall model fit quality |
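These metrics are straightforward to compute directly; a minimal sketch with numpy on illustrative numbers (`sklearn.metrics` offers `mean_absolute_error` and friends as well):

```python
import numpy as np

actual = np.array([100.0, 120.0, 130.0, 90.0, 110.0])
predicted = np.array([105.0, 115.0, 140.0, 85.0, 110.0])

mae = np.mean(np.abs(actual - predicted))
rmse = np.sqrt(np.mean((actual - predicted) ** 2))
mape = np.mean(np.abs(actual - predicted) / np.abs(actual)) * 100
smape = np.mean(2 * np.abs(actual - predicted)
                / (np.abs(actual) + np.abs(predicted))) * 100
ss_res = np.sum((actual - predicted) ** 2)
ss_tot = np.sum((actual - np.mean(actual)) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"MAE={mae:.2f} RMSE={rmse:.2f} MAPE={mape:.2f}% SMAPE={smape:.2f}% R2={r2:.3f}")
# -> MAE=5.00 RMSE=5.92 MAPE=4.48% SMAPE=4.45% R2=0.825
```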