Monzo & the Recent AWS Outage

Continuing the discussion from Why did Monzo choose AWS:

As I read through the post-mortem for the recent AWS outage, it got me thinking: what mitigation strategies are in place if something like this were to affect Monzo's systems? Does Monzo yet have the ability to run independently of AWS (even if only temporarily, in a reduced capacity)? The linked post notes this as a long-term goal, which is great - has the recent AWS outage prompted any review of Monzo's infrastructure design and/or accelerated any of those plans?

As someone who’s written similar post-mortem reports and been in the middle of some pretty ugly outages, I fully appreciate that even with multiple layers of redundancy, things can and do still go wrong. I’ve watched the talk @oliver gave and found it quite interesting - while it focused more on the software architecture side (I’m admittedly not terribly familiar with Kubernetes), I can’t help but wonder what sort of resiliency Monzo has at the infrastructure level.

From an outsider’s perspective, it appears the Monzo team has so far successfully balanced risk mitigation and FinTech badassery (not an easy thing to do!). I can only hope resiliency at all levels remains a fundamental value as the organization continues to grow. Thanks!

Edit: I just came across Mondo infrastructure, so I suppose my question shifts a bit: since that forum post and the video about Go, have Monzo's infrastructure strategies had to change much, and if so, how?
