Topics covered: collaborative filtering, content-based filtering, hybrid approaches, matrix factorization, deep learning models, and evaluation.
Recommendation systems predict user preferences and suggest relevant items. They are core to Netflix, Amazon, Spotify, YouTube, and most modern platforms.
| Approach | How It Works | Data Needed | Example |
|---|---|---|---|
| Content-Based | Recommend items similar to what user liked | Item features, user preferences | Spotify: similar songs based on audio features |
| Collaborative Filtering | Recommend items similar users liked | User-item interaction matrix | Netflix: users who watched X also watched Y |
| Hybrid | Combine content + collaborative signals | Both features and interactions | YouTube: video features + watch history |
| Knowledge-Based | Recommend based on domain rules | Item attributes, user requirements | Real estate: filter by budget, location, beds |
| Demographic | Recommend based on user demographics | Age, gender, location | News: trending in your city |
| Deep Learning | Neural networks learn complex patterns | Any available data | Google: Wide & Deep, DeepFM |
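The knowledge-based row is the easiest to make concrete: explicit user requirements act as hard filters over item attributes. A minimal sketch, assuming a toy in-memory listing catalog (the field names and data below are illustrative, not a real API):

```python
# Knowledge-based filtering: apply explicit user requirements as hard rules.
# Listings and requirement fields are toy examples.
listings = [
    {"id": 1, "price": 250_000, "city": "Austin", "beds": 2},
    {"id": 2, "price": 480_000, "city": "Austin", "beds": 4},
    {"id": 3, "price": 320_000, "city": "Denver", "beds": 3},
]

def knowledge_based_filter(items, budget, city, min_beds):
    """Return items satisfying every domain rule (budget, location, beds)."""
    return [it for it in items
            if it["price"] <= budget
            and it["city"] == city
            and it["beds"] >= min_beds]

matches = knowledge_based_filter(listings, budget=400_000, city="Austin", min_beds=2)
print([it["id"] for it in matches])  # → [1]
```

No interaction history is needed, which is why knowledge-based rules are also a common cold-start fallback.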
A quick demonstration of the two similarity views (user-user from rows, item-item from columns) on a toy rating matrix:

```python
import numpy as np
from scipy.spatial.distance import cosine

# ── User-Item Interaction Matrix (0 = unrated) ──
#        Item1  Item2  Item3  Item4
# U1   [  5.0    3.0    0.0    1.0 ]
# U2   [  4.0    0.0    4.0    2.0 ]
# U3   [  1.0    1.0    5.0    4.0 ]
# U4   [  0.0    5.0    4.0    5.0 ]
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 2],
    [1, 1, 5, 4],
    [0, 5, 4, 5],
])

# ── Cosine Similarity ──
def cosine_sim(a, b):
    return 1 - cosine(a, b)  # scipy's cosine() is a distance, so invert it

# User similarity: rows of R
user_sim = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        user_sim[i, j] = cosine_sim(R[i], R[j])

# Item similarity: columns of R
item_sim = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        item_sim[i, j] = cosine_sim(R[:, i], R[:, j])
```

Collaborative filtering leverages the collective behavior of all users to make recommendations. It assumes users who agreed in the past will agree in the future.
```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# ── User-Based CF: find similar users, recommend their items ──
def user_based_cf(R, user_idx, k=2, top_n=5):
    # 1. Compute user similarities (Pearson correlation on co-rated items)
    n_users = R.shape[0]
    user_sim = np.zeros((n_users, n_users))
    for i in range(n_users):
        for j in range(n_users):
            if i != j:
                mask = (R[i] > 0) & (R[j] > 0)
                if mask.sum() > 1:
                    corr = np.corrcoef(R[i][mask], R[j][mask])[0, 1]
                    user_sim[i, j] = 0.0 if np.isnan(corr) else corr
    # 2. Find the k most similar users
    similar_users = np.argsort(user_sim[user_idx])[-k:][::-1]
    # 3. Score items those users liked that the target hasn't seen
    unseen = R[user_idx] == 0
    scores = np.zeros(R.shape[1])
    for su in similar_users:
        scores += user_sim[user_idx, su] * R[su]
    scores[~unseen] = -np.inf  # exclude already-seen items
    return np.argsort(scores)[-top_n:][::-1]

# ── Matrix Factorization (SVD) ──
# R ≈ U @ S @ V^T (latent factor model)
# User i's rating for item j ≈ dot(u_i, v_j)
R_sparse = csr_matrix(R, dtype=float)  # svds requires a floating-point matrix
U, sigma, Vt = svds(R_sparse, k=2)     # k latent factors
sigma = np.diag(sigma)
predicted = U @ sigma @ Vt
# Predicted rating for user 0, item 2 (previously unrated):
print(f"Predicted: {predicted[0, 2]:.2f}")
```

| Challenge | Description | Solution |
|---|---|---|
| Cold Start (New User) | No interaction history for new users | Use demographics, popularity, or ask for preferences |
| Cold Start (New Item) | No ratings for new items | Use content-based features or initial seeding |
| Data Sparsity | Most users rate few items (<1% of catalog) | Matrix factorization, regularization, implicit feedback |
| Scalability | User/item count in millions | Approximate nearest neighbors (ANN), SVD, ALS |
| Popularity Bias | Popular items get more exposure, creating feedback loop | Explore/exploit, diversity, coverage metrics |
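The cold-start rows can be made concrete with a popularity fallback: a brand-new user has no interaction row, so recommend items that many users rated, breaking ties by mean rating. A minimal sketch (the toy matrix is restated so the snippet stands alone):

```python
import numpy as np

# Toy rating matrix; 0 = unrated.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 2],
    [1, 1, 5, 4],
    [0, 5, 4, 5],
])

def recommend_for_new_user(R, top_n=2):
    """Cold-start fallback: rank items by rating count, then mean rating."""
    rated = R > 0
    counts = rated.sum(axis=0)  # how many users rated each item
    means = np.where(counts > 0, R.sum(axis=0) / np.maximum(counts, 1), 0.0)
    # lexsort sorts by the LAST key first, so -counts is primary, -means secondary
    order = np.lexsort((-means, -counts))
    return order[:top_n]

print(recommend_for_new_user(R))  # item 3 (4 ratings), then item 2 (highest mean)
```

Once the user accumulates a few interactions, the system can hand over to collaborative filtering.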
Content-based filtering recommends items similar to those a user has liked in the past, using item features (text, images, metadata) rather than user behavior.
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# ── TF-IDF for Text-Based Items ──
documents = [
    "Action movie with car chases and explosions",
    "Romantic comedy about love in Paris",
    "Sci-fi thriller set in space with aliens",
    "Documentary about ocean marine life",
    "Action thriller with martial arts fighting",
]
tfidf = TfidfVectorizer(stop_words='english', max_features=500)
tfidf_matrix = tfidf.fit_transform(documents)

# ── Compute Item Similarity Matrix ──
item_similarity = cosine_similarity(tfidf_matrix)

# ── Recommend similar items ──
def recommend_similar(item_idx, top_n=3):
    scores = item_similarity[item_idx].copy()  # copy so we don't mutate the matrix
    scores[item_idx] = -1  # exclude the item itself
    return np.argsort(scores)[-top_n:][::-1]

# ── User Profile Construction ──
# Build the user profile as the mean of liked-item TF-IDF vectors
liked_items = [0, 4]  # user liked items 0 and 4 (the two action movies)
user_profile = np.zeros(tfidf_matrix.shape[1])
for idx in liked_items:
    user_profile += tfidf_matrix[idx].toarray().flatten()
user_profile /= len(liked_items)

# ── Recommend items matching the user profile ──
scores = cosine_similarity([user_profile], tfidf_matrix).flatten()
scores[liked_items] = -1  # exclude already-liked items
top_recs = np.argsort(scores)[-3:][::-1]
print(f"Recommended items: {top_recs}")
```

Matrix factorization decomposes the user-item interaction matrix into lower-dimensional latent factor matrices, capturing hidden patterns in user preferences.
```python
import numpy as np

# ── SGD-based Matrix Factorization ──
class MatrixFactorization:
    def __init__(self, n_users, n_items, n_factors=20, lr=0.01,
                 reg=0.02, n_epochs=100):
        self.n_factors = n_factors
        self.lr = lr
        self.reg = reg
        self.n_epochs = n_epochs
        self.U = np.random.normal(0, 0.1, (n_users, n_factors))
        self.V = np.random.normal(0, 0.1, (n_items, n_factors))
        self.bias_u = np.zeros(n_users)
        self.bias_i = np.zeros(n_items)
        self.global_mean = 0.0

    def fit(self, R, mask):
        n_ratings = mask.sum()
        self.global_mean = R[mask > 0].mean()  # baseline: mean observed rating
        for epoch in range(self.n_epochs):
            total_loss = 0.0
            for u in range(R.shape[0]):
                for i in range(R.shape[1]):
                    if mask[u, i] == 0:
                        continue
                    pred = (self.global_mean + self.bias_u[u] +
                            self.bias_i[i] + self.U[u] @ self.V[i])
                    error = R[u, i] - pred
                    total_loss += error ** 2
                    # Update biases
                    self.bias_u[u] += self.lr * (error - self.reg * self.bias_u[u])
                    self.bias_i[i] += self.lr * (error - self.reg * self.bias_i[i])
                    # Update latent factors (use the pre-update user vector for V)
                    u_old = self.U[u].copy()
                    self.U[u] += self.lr * (error * self.V[i] - self.reg * self.U[u])
                    self.V[i] += self.lr * (error * u_old - self.reg * self.V[i])
            if (epoch + 1) % 20 == 0:
                rmse = np.sqrt(total_loss / n_ratings)
                print(f"Epoch {epoch+1}: RMSE = {rmse:.4f}")

    def predict(self, u, i):
        return (self.global_mean + self.bias_u[u] +
                self.bias_i[i] + self.U[u] @ self.V[i])
```

| Method | Key Idea | Pros | Cons |
|---|---|---|---|
| Truncated SVD | Decompose R into U, Sigma, Vt | Fast, deterministic | Cannot handle missing values (treats them as zeros) |
| SGD MF | Gradient descent on latent factors | Handles missing values well | Slower, sensitive to hyperparams |
| ALS (Alternating Least Squares) | Fix U, solve V, then fix V, solve U | Parallelizable (distributed) | Slower convergence |
| NMF (Non-negative MF) | Force all factors to be non-negative | Interpretable latent factors | Constrained solution space |
| SVD++ | SVD + implicit feedback signals | Better with implicit data | More complex |
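The ALS row can be sketched in a few lines: with items fixed, each user vector is a small ridge-regression solve over that user's observed ratings, and vice versa. A toy, unweighted version over observed entries only (hyperparameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
R = np.array([[5, 3, 0, 1],
              [4, 0, 4, 2],
              [1, 1, 5, 4],
              [0, 5, 4, 5]], dtype=float)
mask = R > 0                  # observed entries only
k, reg = 2, 0.1               # latent factors, L2 regularization
U = rng.normal(0, 0.1, (R.shape[0], k))
V = rng.normal(0, 0.1, (R.shape[1], k))

for _ in range(20):
    # Fix V: solve each user's regularized least squares on their observed items
    for u in range(R.shape[0]):
        Vu = V[mask[u]]
        U[u] = np.linalg.solve(Vu.T @ Vu + reg * np.eye(k), Vu.T @ R[u, mask[u]])
    # Fix U: solve each item's regularized least squares on its observed users
    for i in range(R.shape[1]):
        Ui = U[mask[:, i]]
        V[i] = np.linalg.solve(Ui.T @ Ui + reg * np.eye(k), Ui.T @ R[mask[:, i], i])

rmse = np.sqrt(np.mean((R[mask] - (U @ V.T)[mask]) ** 2))
print(f"RMSE on observed ratings: {rmse:.3f}")
```

Because each per-user (and per-item) solve is independent, the two loops parallelize trivially, which is why ALS is the standard choice in distributed settings like Spark MLlib.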
| Metric | Measures | Formula/Description | Good Range |
|---|---|---|---|
| RMSE | Prediction accuracy | Sqrt(Mean((predicted - actual)^2)) | < 0.9 for 1-5 scale |
| MAE | Prediction accuracy | Mean(abs(predicted - actual)) | < 0.7 for 1-5 scale |
| Precision@K | Relevance of top-K | Relevant in top K / K | > 0.1 is good for large catalogs |
| Recall@K | Coverage of relevant items | Relevant in top K / All relevant | > 0.3 is decent |
| NDCG@K | Ranking quality | Discounted cumulative gain, normalized | > 0.5 is good |
| MAP@K | Mean Average Precision | Mean of precision at each relevant rank | > 0.2 is good |
| Coverage | Catalog diversity | Items recommended / Total items | > 5% of catalog |
| Diversity | Item dissimilarity | Average pairwise distance of recommendations | Higher = more diverse |
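The ranking metrics above can be computed directly from a ranked recommendation list and a ground-truth relevance set. A minimal sketch, assuming binary relevance (the item ids are toy values):

```python
import numpy as np

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return len(set(recommended[:k]) & set(relevant)) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    return len(set(recommended[:k]) & set(relevant)) / len(relevant)

def ndcg_at_k(recommended, relevant, k):
    """DCG with log2 position discount, normalized by the ideal ranking."""
    dcg = sum(1.0 / np.log2(i + 2)
              for i, item in enumerate(recommended[:k]) if item in relevant)
    ideal = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal

recommended = [10, 4, 7, 1, 9]   # ranked model output (toy ids)
relevant = {4, 9, 3}             # ground-truth positives
print(precision_at_k(recommended, relevant, 5))  # 2 hits / 5 = 0.4
print(recall_at_k(recommended, relevant, 5))     # 2 / 3 ≈ 0.667
print(round(ndcg_at_k(recommended, relevant, 5), 3))
```

Note that NDCG rewards placing hits early: the hit at rank 2 contributes far more than the hit at rank 5.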
| System | Approach | Used By | Key Innovation |
|---|---|---|---|
| Wide & Deep | Linear (memorization) + DNN (generalization) | Google Play | Combine memorization and generalization |
| DeepFM | FM + Deep Neural Network | Huawei | Factorization machines with deep layers |
| DIN (Deep Interest) | Attention over user behavior history | Alibaba | Captures diverse interests with attention |
| Transformer4Rec | Transformer over sequential interactions | Various | Self-attention for session-based rec |
| SASRec | Self-Attentive Sequential Rec | Various | Transformer for next-item prediction |
| Graph Neural Network | GNN over user-item bipartite graph | Pinterest, Uber | Collaborative signals via message passing |
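The Wide & Deep row's core idea is simple to state: sum a linear "wide" logit (memorization over raw and crossed features) with an MLP "deep" logit (generalization over dense embeddings), then apply a sigmoid. A NumPy sketch of one forward pass; the shapes and random weights are placeholders, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(42)

def wide_and_deep_forward(x_wide, x_deep, w_wide, W1, W2, b):
    """One forward pass: linear memorization term + 2-layer MLP generalization term."""
    wide_logit = x_wide @ w_wide                 # wide part: raw/cross features
    hidden = np.maximum(0, x_deep @ W1)          # ReLU hidden layer
    deep_logit = hidden @ W2                     # deep part: embedding-style features
    return 1 / (1 + np.exp(-(wide_logit + deep_logit + b)))  # sigmoid score

x_wide = rng.random(8)   # toy sparse/cross features (dense here for simplicity)
x_deep = rng.random(4)   # toy dense embedding features
p = wide_and_deep_forward(x_wide, x_deep,
                          w_wide=rng.normal(size=8),
                          W1=rng.normal(size=(4, 6)),
                          W2=rng.normal(size=6),
                          b=0.0)
print(f"score: {p:.3f}")  # probability-like score in (0, 1)
```

In the real system both parts are trained jointly, so the wide part handles exceptions the deep part over-generalizes on, and vice versa.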