Earlier this week we shared our Reliability Report, which shows that we’ve made Monzo more reliable in the last 12 months and explains how we’ve done it.
Monitoring helps us make sure everything’s working as expected, and alerts us when things do go wrong.
If you’ve ever wondered how we do it, Platform Team lead Chris digs into the details here.
Ah, and is that hundreds of physical servers, or a few physical servers running hundreds of virtualised instances?
I want it to be physical but it’s obvs gonna be virtual
My balance doesn’t feel so good…
don’t feel so good…
༼ つ ◕_ ◕ ༽つ
༼ つ ◕_ ◕ ::;:.::…:. . . . . . . . . . . .
༼ つ ◕_ :;:.::…:. . . . . . . . . . . . . .
༼ つ :;:.::…:::…:.:… . . . . . . . . .
༼ ;::,’:;:.::…:::…:.:… . . . . . . . . .
While we do have some hardware in physical data centers to interconnect with payment schemes, most of our servers are virtual servers running on EC2, Amazon’s cloud.
Most of these servers are Kubernetes “worker” nodes. Each of these workers runs many different microservices, each in its own container. So we have containers on top of virtual servers on top of physical servers in a few Amazon data centers somewhere…
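To make the "each microservice in its own container" setup concrete, here's a minimal sketch of what a single service's Kubernetes Deployment might look like. All names and the image path are hypothetical, not Monzo's actual configuration:

```yaml
# Hypothetical Deployment for one microservice; Kubernetes schedules
# its replicas as containers across the EC2 worker nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service
spec:
  replicas: 3                # run three copies for resilience
  selector:
    matchLabels:
      app: example-service
  template:
    metadata:
      labels:
        app: example-service
    spec:
      containers:
        - name: example-service
          image: registry.example.com/example-service:latest
          ports:
            - containerPort: 8080
```

Each worker node runs many containers like this side by side, which is how hundreds of virtual servers end up hosting thousands of containers.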
Why do you “want” it to be physical?!
Because a bank of physical servers with lots of cables and lights is nerd heaven
Only if the cable management is good, otherwise it’s hell
Hundreds of virtualised servers running thousands of containers.
Wow that blog post brings loads of questions to mind. Can we nominate Chris for the next Q&A? @simonb
On holiday for a week but happy to answer anything when I get back
This is great, just rolling out Prometheus and Thanos across my cloud.
Any chance you can share some further details of your custom template for Slack notifications, along with details on the rules fetcher?
Sounds like a missing piece of the puzzle for my implementation
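For anyone following along on the Slack-template question: in a standard Prometheus Alertmanager setup (not Monzo's actual config, just a sketch of the general mechanism), custom notification templates are wired up roughly like this:

```yaml
# Hypothetical alertmanager.yml fragment: a Slack receiver that
# renders notifications through custom templates loaded from disk.
receivers:
  - name: slack-alerts
    slack_configs:
      - channel: '#alerts'
        send_resolved: true
        # "custom.title" and "custom.text" are template names you
        # would define yourself in the .tmpl files below
        title: '{{ template "custom.title" . }}'
        text: '{{ template "custom.text" . }}'

# Directory of Go template files defining custom.title / custom.text
templates:
  - /etc/alertmanager/templates/*.tmpl
```

The template files themselves are standard Go templates with access to the alert's labels and annotations, so the runbook links and dashboard URLs mentioned in the blog post could be injected there.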