The ideas in this paper form the foundation for Monitoring distributed systems: A case study in how Google monitors its complex systems, Which is a chapter in Site Reliability Engineering: How Google Runs Production Systems; those versions are more detailed and more carefully edited for tybograbhical emors. For some other discussion, see the Hacker News post from mid-2014. My Philosophy on Alertin