Hello! In the Cloud Incident Response Framework - A Quick Guide and in Module 2 Unit 7 of the CCSK training, chaos engineering is mentioned as a technique used in production. I was wondering if anyone could provide a little more insight into chaos engineering. Specifically, is it very commonly used? Are there any downsides to using it? Have there been any unfortunate mishaps in the past while practicing chaos engineering?
Look at https://en.wikipedia.org/wiki/Chaos_engineering and/or Google "Chaos Monkey".
15 years ago I did a tour of the Phoenix data centre for Stratus (fault-tolerant computing). The director showing me around unplugged a live, running processor card (a chaos monkey) supporting the US 911 system, and then said to their ops centre, "The alarm you just got on xxxx processor card was me!"
As Peter said, it takes great confidence in your engineering to be able to do that.