ARIMA, Prophet, LSTMs, feature engineering, stationarity testing, cross-validation, and forecasting.
Time series analysis deals with data collected sequentially over time. Key concepts include stationarity, seasonality, trends, autocorrelation, and forecasting models.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# ── Time Series Data Setup ──
df = pd.read_csv('sales.csv', parse_dates=['date'], index_col='date')
df = df.sort_index()
df = df.asfreq('D') # Set frequency to daily (index must be sorted first)
# ── Key Components ──
# 1. Trend: long-term direction (upward/downward)
# 2. Seasonality: repeating patterns (daily, weekly, yearly)
# 3. Cyclical: non-fixed period fluctuations (business cycles)
# 4. Residual/Noise: random, unpredictable variation
# ── Stationarity Check ──
from statsmodels.tsa.stattools import adfuller
def check_stationarity(series):
    result = adfuller(series.dropna())
    print(f"ADF Statistic: {result[0]:.4f}")
    print(f"p-value: {result[1]:.4f}")
    print("Stationary" if result[1] < 0.05 else "Non-stationary")
# ── Making Series Stationary ──
df['diff'] = df['sales'].diff() # First differencing
df['log'] = np.log(df['sales']) # Log transform
df['log_diff'] = df['log'].diff() # Log + diff
# ── Rolling Statistics ──
df['rolling_mean'] = df['sales'].rolling(window=7).mean()
df['rolling_std'] = df['sales'].rolling(window=7).std()
df['ewma'] = df['sales'].ewm(span=7).mean() # Exponential weighted
# ── Autocorrelation ──
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# ACF: correlation with lagged values (including indirect)
# PACF: direct correlation with lagged values (removing indirect)
# Use ACF for MA order (q), PACF for AR order (p)

| Component | Pattern | Period | Detection |
|---|---|---|---|
| Trend | Long-term increase/decrease | No fixed period | Visual inspection, rolling mean |
| Seasonality | Repeating pattern | Fixed (daily, weekly, yearly) | ACF peaks at seasonal lags, seasonal decomposition |
| Cyclic | Wave-like, not fixed period | 2-10 years typical | Visual inspection, longer rolling windows |
| Noise/Residual | Random fluctuation | None | What remains after removing other components |
ARIMA (AutoRegressive Integrated Moving Average) is the most widely used classical time series model. It combines autoregression, differencing, and moving average components.
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import acf, pacf
# ── ARIMA(p, d, q) Parameters ──
# p = AR order (use PACF to determine)
# d = differencing order (make stationary, usually 0 or 1)
# q = MA order (use ACF to determine)
# ── Determine Orders ──
# 1. Check stationarity (d = 0 if stationary, 1 if not)
# 2. Look at PACF plot for AR(p): where PACF cuts off
# 3. Look at ACF plot for MA(q): where ACF cuts off
# ── Fit ARIMA ──
model = ARIMA(df['sales'], order=(2, 1, 2))
result = model.fit()
print(result.summary())
# ── Forecast ──
forecast = result.forecast(steps=30)
# forecast with confidence intervals
forecast_ci = result.get_forecast(steps=30)
ci = forecast_ci.conf_int(alpha=0.05)
# ── Auto ARIMA (auto-select best p,d,q) ──
import pmdarima as pm
auto_model = pm.auto_arima(
    df['sales'],
    seasonal=True,               # SARIMA if seasonal
    m=7,                         # seasonal period (7 for weekly)
    d=None,                      # auto-detect differencing
    D=None,                      # auto-detect seasonal differencing
    trace=True,                  # print progress
    stepwise=True,               # faster search
    information_criterion='aic',
)
print(f"Best order: {auto_model.order}")
print(f"Best seasonal order: {auto_model.seasonal_order}")
# ── SARIMA(p,d,q)(P,D,Q,s) for seasonal data ──
sarima = ARIMA(df['sales'], order=(1, 1, 1),
               seasonal_order=(1, 1, 1, 12))  # s=12 for monthly
result = sarima.fit()
forecast = result.forecast(steps=12)

| Condition | Model | Parameters |
|---|---|---|
| ACF tails off, PACF cuts off at lag p | AR(p) | Set p from PACF cutoff, q=0 |
| PACF tails off, ACF cuts off at lag q | MA(q) | Set q from ACF cutoff, p=0 |
| Both tail off | ARMA(p,q) | Both p and q non-zero, try small values (1,1) |
| Seasonal pattern | SARIMA | Add (P,D,Q,s) with s = seasonal period |
| Trend present | ARIMA with d=1 or 2 | Difference to remove trend |
Prophet is a decomposable forecasting model (trend + seasonality + holiday effects) designed for business time series with strong, and possibly multiple, seasonalities.
from prophet import Prophet
import pandas as pd
# ── Prophet requires columns: ds (datetime), y (value) ──
df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=365, freq='D'),
    'y': sales_data,  # your observations, array-like of length 365
})
# ── Basic Prophet ──
model = Prophet(
    growth='linear',              # 'linear' or 'logistic'
    changepoints=None,            # Auto-detect change points
    n_changepoints=25,            # Number of potential changepoints
    changepoint_range=0.8,        # Proportion of history for changepoints
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False,
    seasonality_mode='additive',  # 'additive' or 'multiplicative'
)
model.fit(df)
# ── Forecast ──
future = model.make_future_dataframe(periods=90)
forecast = model.predict(future)
# ── Plot ──
fig = model.plot(forecast)
fig2 = model.plot_components(forecast)
# ── Holidays and Special Events ──
holidays = pd.DataFrame({
    'holiday': ['diwali', 'christmas', 'eid'],
    'ds': pd.to_datetime(['2024-11-01', '2024-12-25', '2024-04-10']),
    'lower_window': -1,  # Extend effect to 1 day before each holiday
    'upper_window': 1,   # Extend effect to 1 day after each holiday
})
model = Prophet(holidays=holidays)
model.fit(df)
# ── Custom Seasonality ──
model = Prophet()
model.add_seasonality(name='monthly', period=30.5, fourier_order=5)
model.add_seasonality(name='quarterly', period=91.25, fourier_order=10)
model.fit(df)
# ── Cross Validation ──
from prophet.diagnostics import cross_validation, performance_metrics
df_cv = cross_validation(model, initial='365 days',
                         period='30 days', horizon='90 days')
df_perf = performance_metrics(df_cv)
print(df_perf[['mape', 'rmse']].mean())

| Feature | Prophet | ARIMA |
|---|---|---|
| Ease of Use | Very easy, automatic tuning | Requires expertise to select orders |
| Multiple Seasonalities | Built-in (daily, weekly, yearly, custom) | Limited (SARIMA supports only one seasonal period) |
| Missing Data | Handles automatically | Requires complete data |
| Holidays | Built-in holiday effects | Manual feature engineering |
| Changepoints | Automatic detection | Not directly supported |
| Scalability | Thousands of series | One series at a time |
| Interpretability | Component plots (trend, seasonality) | ACF/PACF analysis |
| Accuracy | Good for business time series | Better for short-term, small data |
LSTMs (Long Short-Term Memory networks) capture long-term dependencies in sequential data, making them well-suited for complex time series with non-linear patterns.
import torch
import torch.nn as nn
import numpy as np
from sklearn.preprocessing import MinMaxScaler
# ── Data Preparation ──
scaler = MinMaxScaler()
# In practice, fit the scaler on the training split only to avoid data leakage
scaled_data = scaler.fit_transform(df['sales'].values.reshape(-1, 1))
def create_sequences(data, seq_length=30):
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)
X, y = create_sequences(scaled_data, seq_length=30)
X = torch.FloatTensor(X) # shape: (N, 30, 1)
y = torch.FloatTensor(y)
# ── LSTM Model ──
class TimeSeriesLSTM(nn.Module):
    def __init__(self, input_dim=1, hidden_dim=64, num_layers=2, output_dim=1):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_dim,
            hidden_size=hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            dropout=0.2,
        )
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim, 32),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(32, output_dim),
        )

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        # Use the output at the last time step
        return self.fc(lstm_out[:, -1, :])
model = TimeSeriesLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
# ── Training ──
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    output = model(X)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()
# ── Forecasting (recursive) ──
model.eval()
last_sequence = torch.FloatTensor(scaled_data[-30:].reshape(1, 30, 1))
forecast = []
with torch.no_grad():
    for _ in range(30):
        pred = model(last_sequence)
        forecast.append(pred.item())
        last_sequence = torch.cat([last_sequence[:, 1:, :], pred.view(1, 1, 1)], dim=1)
# Inverse transform
forecast = scaler.inverse_transform(np.array(forecast).reshape(-1, 1))

| Test | Null Hypothesis | p < 0.05 Means | When to Use |
|---|---|---|---|
| ADF (Augmented Dickey-Fuller) | Series has unit root (non-stationary) | Stationary | Most common, first check |
| KPSS | Series is trend-stationary | Non-stationary | Complement to ADF |
| PP (Phillips-Perron) | Series has unit root | Stationary | Robust to serial correlation |
| Transformation | Handles | How It Works |
|---|---|---|
| Differencing (d=1) | Linear trend | y_t - y_{t-1} |
| Second differencing (d=2) | Quadratic trend | diff of diff |
| Log transform | Exponential growth | log(y_t) |
| Box-Cox | Non-constant variance | Optimal power transform |
| Seasonal differencing | Seasonality | y_t - y_{t-s} |
| Detrending | Linear/quadratic trend | Subtract fitted trend line |
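The Box-Cox row above refers to fitting the power parameter automatically; a minimal sketch with scipy on an illustrative exponential-growth series (Box-Cox requires strictly positive data):

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(7)
# Exponential growth with multiplicative noise (all values > 0)
y = np.exp(0.01 * np.arange(300)) * rng.lognormal(mean=0, sigma=0.1, size=300)

transformed, lam = boxcox(y)             # lam is the fitted power parameter
# lam near 0 indicates the chosen transform is approximately log(y)
restored = inv_boxcox(transformed, lam)  # invert when back-transforming forecasts
```

Remember to apply `inv_boxcox` with the same `lam` to any forecasts made on the transformed scale.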
# ── Time-Based Features ──
df['year'] = df.index.year
df['month'] = df.index.month
df['day_of_week'] = df.index.dayofweek
df['hour'] = df.index.hour
df['is_weekend'] = df.index.dayofweek >= 5
df['quarter'] = df.index.quarter
df['week_of_year'] = df.index.isocalendar().week
# ── Lag Features ──
for lag in [1, 7, 14, 30, 365]:
df[f'lag_{lag}'] = df['sales'].shift(lag)
# ── Rolling Window Features ──
for window in [7, 14, 30]:
df[f'rolling_mean_{window}'] = df['sales'].rolling(window).mean()
df[f'rolling_std_{window}'] = df['sales'].rolling(window).std()
df[f'rolling_min_{window}'] = df['sales'].rolling(window).min()
df[f'rolling_max_{window}'] = df['sales'].rolling(window).max()
# ── Expanding Window Features ──
df['expanding_mean'] = df['sales'].expanding().mean()
df['expanding_std'] = df['sales'].expanding().std()
# ── Difference Features ──
df['diff_1'] = df['sales'].diff(1)
df['diff_7'] = df['sales'].diff(7) # Week-over-week change
# ── Percent Change ──
df['pct_change_1'] = df['sales'].pct_change(1)
df['pct_change_7'] = df['sales'].pct_change(7)

| Metric | Formula | Interpretation | Best For |
|---|---|---|---|
| MAE | Mean(\|actual - predicted\|) | Average absolute error (same units as data) | Interpretable, all errors weighted equally |
| RMSE | Sqrt(Mean((actual - predicted)^2)) | Root mean squared error (penalizes large errors) | When large errors are especially bad |
| MAPE | Mean(\|actual - predicted\| / \|actual\|) * 100 | Percentage error (unitless) | Comparing across different scales |
| SMAPE | Mean(2·\|A - P\| / (\|A\| + \|P\|)) * 100 | Symmetric MAPE (bounded 0-200%) | When actual values can be near zero |
| R-squared | 1 - SS_res / SS_tot | Variance explained (0-1, higher better) | Overall model fit quality |
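These metrics are straightforward to compute directly; a minimal sketch with numpy on illustrative numbers (`sklearn.metrics` offers `mean_absolute_error` and friends as well):

```python
import numpy as np

actual = np.array([100.0, 120.0, 130.0, 90.0, 110.0])
predicted = np.array([105.0, 115.0, 140.0, 85.0, 110.0])

mae = np.mean(np.abs(actual - predicted))
rmse = np.sqrt(np.mean((actual - predicted) ** 2))
mape = np.mean(np.abs(actual - predicted) / np.abs(actual)) * 100
smape = np.mean(2 * np.abs(actual - predicted)
                / (np.abs(actual) + np.abs(predicted))) * 100
ss_res = np.sum((actual - predicted) ** 2)
ss_tot = np.sum((actual - np.mean(actual)) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"MAE={mae:.2f} RMSE={rmse:.2f} MAPE={mape:.2f}% SMAPE={smape:.2f}% R2={r2:.3f}")
# -> MAE=5.00 RMSE=5.92 MAPE=4.48% SMAPE=4.45% R2=0.825
```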