lgaa/Monitorama2016

 
 


Tackling Alert Fatigue

Accompanying repository for the "Recovering From Alert Fatigue" talk given at Monitorama 2016 [Slides]

##Abstract

Systems that generate numerous critical alerts cause alert fatigue, which can lead to service outages and developer burnout. My team at Twitter found itself in this situation. The services had scaled by an order of magnitude in two years and were generating hundreds of alerts per quarter. Over the course of a quarter I led an initiative to decrease the number of alerts, improve the experience of being on call, and increase the reliability of the services. These efforts were incredibly successful, reducing the number of critical alerts by 50%. In this talk I’ll discuss the process and the alerting best practices we’ve put in place to combat alert fatigue and avoid over-alerting in the future.

##References

##Observability at Twitter

##Related Tweets

##Bio

Caitie McCaffrey is a Backend Brat and Distributed Systems Diva at Twitter, where she is the Tech Lead of the Observability Team. Prior to that she spent the majority of her career building large-scale services and systems that power the entertainment industry at 343 Industries, Microsoft Game Studios, and HBO. Caitie has a degree in Computer Science from Cornell University and has worked on several video games, including Gears of War 2, Gears of War 3, Halo 4, and Halo 5. She maintains a blog at CaitieM.com and frequently discusses technology on Twitter @Caitie.
