Microsoft Recommenders
Best practices for building production-ready recommender systems. Includes notebooks, benchmarks, and utilities for ALS, SAR, NCF, and more.
Surprise: A Python scikit for Recommender Systems
Clean library with a scikit-learn-style API implementing SVD, SVD++, NMF, KNN, and Slope One algorithms with built-in cross-validation.
Matrix Factorization Techniques for Recommender Systems
The seminal paper from the Netflix Prize era. Covers SVD, regularization, temporal dynamics, and implicit feedback integration.
BPR: Bayesian Personalized Ranking from Implicit Feedback
Foundational paper introducing BPR, a widely used pairwise ranking criterion for learning from implicit feedback in collaborative filtering.
LightFM: Hybrid Recommendation Algorithm
Python implementation of LightFM, a hybrid model that combines collaborative and content-based signals. Supports logistic, BPR, WARP, and k-OS WARP losses.
Deep Neural Networks for YouTube Recommendations
Google's two-stage architecture for YouTube. Covers candidate generation and ranking, each handled by its own deep neural network.
Neural Collaborative Filtering (NCF)
Replaces the inner product of MF with a neural architecture to learn user-item interaction functions from implicit feedback.
LightGCN: Simplifying Graph Convolution for Recommendation
Removes feature transformation and nonlinear activation from GCN, keeping only neighborhood aggregation, and outperforms NGCF with this simpler propagation rule.
RecBole: Unified Recommender System Library
Comprehensive library implementing 73 recommendation algorithms including CF, deep learning, GNN, and sequential models.
Wide & Deep Learning for Recommender Systems
Google's dual architecture combining memorization (wide) and generalization (deep) for app recommendations in Google Play.
PinSage: Graph Convolutional Neural Networks for Web-Scale Recommender Systems
Pinterest's production GNN for recommendation. Handles 3B nodes and 18B edges using importance-based sampling.
Self-Attentive Sequential Recommendation (SASRec)
Applies self-attention to sequential recommendation. Adapts the Transformer architecture to model user behavior sequences.
TensorFlow Recommenders (TFRS)
Google's official library for building retrieval and ranking models. Covers two-tower architectures, multi-task learning, and serving.
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
Applies BERT's masked language modeling to sequential recommendation. Uses bidirectional self-attention for item sequences.
Variational Autoencoders for Collaborative Filtering
Applies VAEs with a multinomial likelihood to implicit-feedback collaborative filtering, a principled Bayesian approach that the authors report outperforms linear latent-factor baselines.
Graph Neural Networks for Social Recommendation
By Fan et al. Uses GNNs to combine user-item interactions with trust-based social connections for improved recommendations.
LLM-based Recommendation: A Survey
Comprehensive survey on using large language models for recommendation, covering prompting, fine-tuning, and agent-based approaches.
AutoRec: Autoencoders Meet Collaborative Filtering
Applies autoencoders to collaborative filtering, showing performance competitive with SVD using a simpler end-to-end architecture.
MovieLens 25M Dataset
The industry-standard benchmark. 25 million ratings from 162K users on 62K movies. Multiple sizes available from 100K to 25M.
How Netflix's Recommendation Engine Works
In-depth technical blog covering Netflix's two-stage system of candidate generation and ranking, plus its use of contextual bandits.
Embedding-Based Retrieval in Facebook Search
Facebook's two-tower dense retrieval system for search. Covers training strategy, negative sampling, and serving at scale.
Amazon Product Reviews (2023)
Updated Amazon review dataset covering 34 product categories with 571M ratings. Includes rich metadata, images, and product graphs.
Practical Recommendations for Gradient-Based Training of Deep Architectures
Bengio's practical guide to training deep networks, covering regularization, weight initialization, and optimization, all highly relevant to RecSys.
Evaluation Metrics for Recommender Systems Explained
Clear tutorial covering Precision@K, Recall@K, NDCG, MAP, MRR, Hit Rate, and Coverage with Python implementations.
Building a Two-Tower Model from Scratch
Step-by-step tutorial on implementing a two-tower retrieval model using TensorFlow, complete with negative sampling strategies.
Multi-Armed Bandits for RecSys
Practical guide to using exploration-exploitation strategies (UCB, Thompson Sampling, LinUCB) in production recommendation systems.
Yelp Open Dataset
Large-scale dataset of Yelp reviews, businesses, and user data for multi-domain recommendation research. 6.9M reviews, 150K businesses.