Ensuring the health of a modern large-scale recommendation system is a challenging problem. To address it, we need to put in place proper logging and sophisticated exploration policies, develop ML-interpretability tools, or even train new ML models to predict or detect issues with the main production model. In this talk, we shine a light on this less-discussed but important area and share some of th
![Best practices for operating large-scale recommender systems](https://cdn-ak-scissors.b.st-hatena.com/image/square/8ec687e1a87281272b05bc67f1c17af90fb83bab/height=288;version=1;width=512/https%3A%2F%2Fcdn.slidesharecdn.com%2Fss_thumbnails%2Frecsysops9-210930103327-thumbnail.jpg%3Fwidth%3D640%26height%3D640%26fit%3Dbounds)