Chaos Engineering is an approach for learning about how your system behaves by applying a discipline of empirical exploration. Just as scientists conduct experiments to study physical and social phenomena, Chaos Engineering uses experiments to learn about a particular system.
We need to identify weaknesses before they manifest in system-wide, aberrant behaviors. Systemic weaknesses could take the form of: improper fallback settings when a service is unavailable; outages when a downstream dependency receives too much traffic; cascading failures when a single point of failure crashes.
Applying Chaos Engineering lets you address those weaknesses proactively, going beyond the reactive processes that currently dominate most incident response models, and in doing so improves the resilience of a system.
Most of us have been inspired by Netflix and its long history of applying chaos engineering. There have been multiple blog posts about its Simian Army, and Nora Jones, a former chaos engineer at Netflix, explained why this is important at re:Invent 2017.
However, we still sometimes struggle to get companies and teams to invest time in this approach and prioritise it, even though it allows us to build systems that provide our customers with a much better experience. One fun way to raise awareness of its importance could be to take over, unannounced, a regular slot such as your usual demo meeting with the product owners and leadership, and demonstrate what happens if our systems don't recover automatically. Let's get nostalgic and introduce Space Invaders!
KubeInvaders is a gamified Chaos Engineering tool for Kubernetes, authored by Eugenio Marzo. It is like Space Invaders, but the aliens are pods, which you can shoot down (kill) across your Kubernetes cluster. It's really straightforward to set up and could spark the curiosity of the team, increasing everyone's awareness.
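To make the idea of shooting down a pod concrete before trying it on a real cluster, here is a minimal sketch in Python. It is a toy, in-memory simulation and does not talk to Kubernetes or use any KubeInvaders code: `ToyDeployment`, `kill_random_pod`, `reconcile` and `run_experiment` are hypothetical names invented for this illustration. It only shows the shape of the experiment: kill a pod, let the system react, then check whether the desired number of replicas is back.

```python
import random


class ToyDeployment:
    """In-memory stand-in for a set of replicated pods (not a Kubernetes API)."""

    def __init__(self, name, replicas, self_healing=True):
        self.name = name
        self.replicas = replicas          # desired number of pods
        self.self_healing = self_healing  # whether killed pods get replaced
        self._counter = 0
        self.pods = []
        for _ in range(replicas):
            self._start_pod()

    def _start_pod(self):
        # Give every new pod a unique, increasing suffix.
        self._counter += 1
        self.pods.append(f"{self.name}-{self._counter}")

    def kill_random_pod(self, rng):
        # The "shot": remove one running pod chosen at random.
        victim = rng.choice(self.pods)
        self.pods.remove(victim)
        return victim

    def reconcile(self):
        # Bring the pod count back to the desired number, if self-healing is on.
        if not self.self_healing:
            return
        while len(self.pods) < self.replicas:
            self._start_pod()


def run_experiment(deployment, shots, seed=0):
    """Kill up to `shots` pods and report whether the steady state is restored."""
    rng = random.Random(seed)
    for _ in range(shots):
        if not deployment.pods:
            break  # nothing left to shoot
        deployment.kill_random_pod(rng)
        deployment.reconcile()
    return len(deployment.pods) == deployment.replicas


if __name__ == "__main__":
    print(run_experiment(ToyDeployment("web", 3), shots=5))
    print(run_experiment(ToyDeployment("web", 3, self_healing=False), shots=2))
```

The first call prints `True` because every killed pod is replaced; the second prints `False` because nothing brings the missing pods back, which is exactly the situation worth showing in a demo.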
KubeInvaders is a game and just an example, so please do not take it too seriously, but it demonstrates some important use cases and makes the point.
If you're curious about it or want to contribute, go to https://kubernetes.io/blog/2020/01/22/kubeinvaders-gamified-chaos-engineering-tool-for-kubernetes
If you have other tips and tricks for getting an organisation's attention on chaos engineering, please share!