We do continuous deployment all the time. Sundays are reserved for the larger deployments that require servers to be brought down.
Yeah that’s still not really any excuse.
0 downtime for any system upgrade, regardless of scale, is very possible.
How many times have Facebook, Amazon, iCloud, Google and Gmail been updated? Did Gmail start bouncing emails, or Google stop serving requests, or iCloud stop syncing whilst these upgrades were happening? Absolutely not. Google handles more requests and serves more data than a high street bank and still had 0 downtime while upgrading extremely complicated systems and data structures (thinking of where they came from, to where they are now).
Yes, they entered a highly technological age with less baggage to upgrade, but upgrades without downtime have been a thing for a long time, and as a developer I can fathom no reason for downtime other than lack of technical capacity or laziness.
Exactly. There’s no excuse for downtime in 2016. Especially for organisations that spend billions on their IT systems.
Look, I don’t know the ins and outs of the deployment. In order to get Sunday as a processing day we would need every bank to work in the same way. Otherwise we’d have a range of different processing times. Until all do, we can’t do anything.
There isn’t any downtime that will impact the customers or the service the bank provides. I can’t really comment any further without knowing the stats, and otherwise I might get into some trouble at work.
So far my experience with Mondo’s interbank transfer processes is:
- 2 out of 10 top ups never appeared in Mondo until they were chased manually
- A repayment (due to over topping up one month) has so far taken 10 real life days (aka 6 ‘banking days’), and it still hasn’t arrived
When I use any of my existing current accounts with “legacy banks”, it seems to take 2 hours, even at weekends. Mondo support has been great… but unfortunately the score for transfers in or out is:
Legacy 2, Mondo nil
While it’s fun to bash on banks, you have to remember that all UK banks up until now were built around batch job processing and settlements (usually end of day). In addition, the kind of deployments they’re doing can take a whole day of solid database activity on a mainframe so Sunday is a very useful day.
You also just can’t do agile or any kind of CI in environments where, if something fails, management comes down and blames the person who wrote the line that failed, making them and their manager personally responsible for any outages. This quickly leads to teams backing out to building and testing everything so thoroughly that agile becomes impossible. Especially when the threat of the regulator sits above everything you do and your unscheduled downtime or issues during the day could potentially bring parts of the country to a halt and cause mass panic.
Let’s take an issue we had here a while back, transactions not appearing in the feed even after a week. This would be completely unacceptable in a traditional bank where what the user sees must be accurate and correct to their account’s current state. You would not be allowed to push that out to production and it would be tested under every possible state to make sure of that with jobs to verify and report/clean up anything that didn’t complete. Not that the job should find anything because the entire process would be built on endless checks and queue managers that will back out, report and try again if anything doesn’t happen exactly the way it should.
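To make the "back out, report and try again" idea concrete, here is a minimal Python sketch. It is not how any real bank's queue manager works: the function `apply_with_backout`, the dict-based `ledger` and the `flaky_post` callback are all made up for illustration.

```python
def apply_with_backout(ledger, account, amount, post, max_attempts=3):
    """Try to post a transaction; restore the old balance on any failure.

    ledger is a plain dict of account -> balance (an assumption for this
    sketch), and post is whatever callable actually applies the change.
    """
    failures = []
    for attempt in range(1, max_attempts + 1):
        before = ledger.get(account, 0)
        try:
            post(ledger, account, amount)
            # Verify the account ended up in exactly the expected state.
            if ledger.get(account, 0) != before + amount:
                raise ValueError("balance does not match expected state")
            return True, failures
        except Exception as exc:
            ledger[account] = before              # back out
            failures.append((attempt, str(exc)))  # report
    return False, failures


# Hypothetical posting step that fails once after a partial write.
calls = {"n": 0}

def flaky_post(ledger, account, amount):
    calls["n"] += 1
    ledger[account] = ledger.get(account, 0) + amount
    if calls["n"] == 1:
        raise RuntimeError("timeout after partial write")


ledger = {"acct-1": 100}
ok, failures = apply_with_backout(ledger, "acct-1", 25, flaky_post)
print(ok, ledger["acct-1"], failures)
```

The first attempt is rolled back and recorded, and the second one goes through, so the customer never sees a half-applied balance.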
Remember, banking was traditionally a daytime-only operation: accounts would be updated and payments would be made in the late evening for overnight settlement. All this real-time and 24/7 stuff (debit cards, ATMs, mobile apps, etc.) was built on top of that very old business base over many years. Every technology company @danbeddows listed started very recently in comparison to the banking industry and was built from the start to be a 24/7 online operation.
I agree that it’s all mostly corporate and political BS but at this point, to fix it you would have to start again from zero with a new bank built to handle modern, instant technologies. I wonder when someone will do that.
The reason why such services are able to remain operational is they operate in the ‘cloud’ and are able to offload traffic to different locations around the globe while such upgrades happen to servers.
Until recently it was required that traditional banking services were carried out in-house and not ‘in the cloud’. In November 2015, the FCA published ‘Proposed guidance for firms outsourcing to the ‘cloud’ and other third-party IT services’, which allows ‘cloud’ technology to be used for main banking services. That should mean that, as banks slowly start to move things over to the cloud, there is a greater chance of close to 0 downtime during upgrades.
I agree that those services are at an advantage because they have the ability to offload their data to other datacentres.
But offsite/3rd party sites or not, it is still no excuse for high street banks: they have multiple datacenter sites managed internally that they can upgrade at will. If they do not have the capacity, then they should get it. They are billion dollar companies.
All I’m saying is there is a problem, and the bank can own the problem (more datacenters, better infrastructure etc) or they can pass it on to their customers. The traditional high street banks always pass these sorts of problems on to their customers. They are riddled with limitations.
which would cause downtime and therefore more complaints. The banking industry needs to upgrade, but due to the age of some of the systems it is near impossible to do so without whole services being rewritten for the modern age.
NatWest, for example, can and does process transactions on Sundays, but the date shown is always the next working day (i.e. Monday, or Tuesday if Monday is a bank holiday).
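The "next working day" dating can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not NatWest's actual logic; the function name `next_working_day` is made up, and the bank holiday set is something the caller has to supply.

```python
from datetime import date, timedelta


def next_working_day(d, bank_holidays=frozenset()):
    """Return d if it is a working day, otherwise the next working day."""
    # weekday() is 5 for Saturday and 6 for Sunday.
    while d.weekday() >= 5 or d in bank_holidays:
        d += timedelta(days=1)
    return d


# Illustrative holiday set supplied by the caller (an assumption here):
# Monday 29 August 2016 was the summer bank holiday in England.
holidays = {date(2016, 8, 29)}
print(next_working_day(date(2016, 8, 28), holidays))  # 2016-08-30
```

So a payment processed on Sunday 28 August would be dated Tuesday 30 August, and an ordinary Sunday payment would be dated the Monday.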
How would adding more capacity create downtime?
If you build a datacenter with your banking software, and add it to your fleet, you will not be creating any downtime.
I’m not reinventing the wheel here.
You replicate your production environment on a testing environment with your production data, you upgrade the testing environment, you triple check everything is okay, and then you start sending your traffic over there. Then you repeat it all again for the next upgrade. Any data change at your production environment gets upgraded and pushed to your testing environment during the upgrade.
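For anyone who wants to see the shape of it, here is a toy Python sketch of that replicate, upgrade, check and switch cycle (often called blue-green deployment). The `Environment` and `Router` classes are invented for illustration, the "triple check" is reduced to a data comparison, and it runs single-threaded, so it sidesteps real concurrency and replication lag.

```python
class Environment:
    def __init__(self, name, version, data=None):
        self.name = name
        self.version = version
        self.data = dict(data or {})

    def health_check(self, expected_data):
        # Stand-in for the "triple check": data must match what live holds.
        return self.data == expected_data


class Router:
    """Sends all traffic to whichever environment is currently live."""

    def __init__(self, live, standby):
        self.live = live
        self.standby = standby

    def write(self, key, value):
        self.live.data[key] = value
        # Push every production change to the standby copy as well.
        self.standby.data[key] = value

    def upgrade(self, new_version):
        self.standby.data = dict(self.live.data)  # replicate production data
        self.standby.version = new_version        # upgrade the idle copy
        if not self.standby.health_check(self.live.data):
            return False                          # keep serving from live
        # Cut traffic over; the old live copy becomes the next standby.
        self.live, self.standby = self.standby, self.live
        return True


blue = Environment("blue", "v1", {"acct-1": 100})
green = Environment("green", "v1")
router = Router(blue, green)
router.write("acct-2", 50)
ok = router.upgrade("v2")
print(ok, router.live.name, router.live.version)
```

At no point is there a moment with no live environment: traffic stays on the old copy until the upgraded one has passed its checks.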
All of the bank’s data is stored in databases, and all of its software runs on servers. The layers of its banking software are ancient, but basic concepts like this still apply.
This is fairly standard stuff…
Replication lag could be a problem though. A lot of older software doesn’t handle multiple-master databases well.
365 / 24 / 7 online banking was implemented in the mid 1990s; it’s not difficult or new. The so-called bank holidays are not fixed: weekends and bank holidays vary by country. It’s just convenient for the established banks to have holidays so they don’t have customers bothering them.
There’s a bit more to it than that. But I’m not going there.
That’s not quite how it works. Might be how the top executives work, but not us on the ground
Things don’t exactly happen fast at the tens of petabyte scale!
More like 100 Terabytes per year!
Last time I was in the 24/7 ops center at work we were processing over 25,000 transactions every second. That’s a huge amount.
100% agree. I work in finance and these pending transactions drive me crazy
Me too. Monzo doesn’t yet have a way for users to log / manage offline transactions either, so please do share your suggested solution for this scenario here too -