Hi @jenwilkinson! I think you have a very important job there, more so than most engineers perhaps realise. IMHO most companies are slowly (or, um, quickly) declining because they do not do the things that you're responsible for. I have… questions. So, so many questions.
Firstly, how are systems documented? How is it all linked together? Say you're in the guts of some service or other and it points at another service: what's the directory that lets you go from one to the other in terms of the documentation?
Is the documentation versioned in something like git?
Is the documentation able to answer questions from both a developer and a business process standpoint? Can both developers and business people use the same documentation?
Are datatypes at the edges of services a significant part of the documentation effort?
How do you document the "why" of each service? Is there a way to get a global overview of the "why this service exists" for each and every service? (I know that there are a lot of microservices in Monzo's microservice architecture.)
When someone decides to leave, how do you make sure that the contents of their skull are recorded in your knowledge management system? Staff attrition is the company-killer through this precise mechanism.
Is this whole documentation system linked back to the RFCs that originate most of the things that Monzo builds (or at least I believe that used to be the case), so that a developer can see the entire history from inception to current state?
While I can imagine that each of the X thousand microservices has its nice individual docs, how do you structure the higher-level architecture of the documentation around that? Are there some "getting started" / intro docs and then a vast chasm all the way down to the coal-face of some service that, say, gets a list of transactions, or are there intermediary levels?
In a related way, is there documentation around the flows through the application, down through each microservice? If so, is this just prose that gets updated ad hoc by developers, or is some of this auto-generated from the code itself? Do you auto-generate diagrams along with those code flows?
Do all developers get mandatory training on how to properly adhere to the conventions that you lay out?
How do developers do discovery? If a developer wants to find the service for X, or all the services involved in some process Y, how do they find them?
How do you prevent nomenclature drift? I've worked in a number of places where different people invented different, often deeply unintuitive terms for things, which all ended up co-existing. How do you make sure that there's one glossary?
How do you maintain quality? Are documentation pull requests vetted deeply and thoroughly by other developers to make sure that they make sense? It's often very hard to get outside of your own head to explain complex objects to others, as you forget how familiar you are with the objects that you yourself created and all the underlying assumptions that are invisible to you but nonetheless guided the thinking.
Is documentation done primarily before, during, or after development? If all three, then which parts are done in which phase of the development cycle?
This sounds ace. Understandably you can't stop people from perhaps not being "up to speed", but how are learnings fed back into the system? Do people have the ability to add to it themselves (as with a wiki) or is it locked down?
Also, how does this work in terms of the COps? Do they have a similar system they can use to help answer customer enquiries?
Great question. I think you're referring specifically to our customer operations documentation, which is handled by a separate team at Monzo. However, I have worked with that team previously so I'll try to answer as best I can!
In the past it's been difficult for our customer operations staff to find accurate internal docs to help them answer customer queries. Sometimes this has meant we've given inconsistent messages to customers, which we absolutely do not want.
In the last year, we've invested heavily in improving this. That's included things like moving all customer support docs to its own dedicated system that is much easier to search, understanding how customer operations staff are interacting with it, and creating new processes to keep that documentation up to date.
Every piece of content in that system has an owner who regularly reviews the content to keep it fresh. As well as proactively monitoring that content, we also have a team of Quality Assurers (QAs) who investigate if things have gone wrong. They will look at things like whether the support person checked the docs while helping the customer.
All of this is a work in progress and more improvements are on the way. We're now starting to make better use of the metadata and analytics we have and restructuring what's there so we make it much quicker for people to find the right answer.
We have a few different knowledge systems. We try to be as transparent as possible with our internal docs. Unless there are strong reasons why someone shouldn't be able to read a doc (usually something operationally sensitive), everyone has the ability to see most information on those systems.
For engineers, information largely lives in either our main company-wide knowledge system, which is the closest thing we have to a wiki, or our internal developer portal, where the source files live alongside our code.
We have tighter permissions in those systems for content that folks may be allowed to read, but not to edit. And in the case of our internal developer portal, those permissions are often tied to the same permissions we have around the code. So in the same way that an engineer cannot make a rogue change to some code without the owners seeing and approving it, someone cannot make rogue changes to the associated documentation without it also going under the same checks.
A few years ago, we made the decision to move COps-specific documentation to its own system, better designed for their needs. It was a gnarly migration, but has been well worth it to help COps more easily find the information they need to help customers.
What systems do Monzo use to manage things like document retention and disposal of personally identifiable information? Are you using or planning to use any of the ML tools on the market?
Interesting to hear about your time at the GDS, and the work you do. I suppose my questions are:
What skills/knowledge/experience translated best in the migration from CS to private?
Did you find the culture massively different, or more samey than you expected?
And in relation to this response:
In the last year, we've invested heavily in improving this. That's included things like moving all customer support docs to its own dedicated system that is much easier to search
I'm not sure what system you used in the GDS (I myself have some experience in Ocelot), but do you find that the private sector lends itself to a more agile rather than process-driven approach when it comes to updating COps policies?
I'm ruling out the animal (unwisely?) so my shortlist is a Microsoft API framework, an accountancy package for content producers and AI-driven higher education software (no, me neither on that last one).
Can you point us in the direction of what this is? (If it's a secret government code name then I have screenshots and know where the MI5 building is).
Can you point us in the direction of what this is?
Absolutely, it's a web-based portal for COps. Not sure who developed it, but it's used by 3 departments I know of from a 2021 Office for Tax Simplification release. I just made a link between simplifying that and streamlining COps within banking.
I've not used it myself in quite some years, but it seemed quite useful. Of course, my experience is that the private sector is more agile and so might not need such a system. I probably didn't word my question that well though, so I've done a wee rewording to help.
As I think you mentioned in another comment, GDS feels a lot like Monzo and vice versa. When I first met Jonas, it was actually one of the first things we chatted about.
GDS feels very different from other departments. Everyone is there because they really believe in making digital services better for the public. The same is true for Monzo. Folks here believe really passionately about making it easier for people to handle their finances. No one looks forward to having to interact with the government or their bank. Usually when you do, you just want to get something done so you can get on with your day. Working with people who understand and empathise with that sounds basic, but contributes to everything both organisations do.
Where it feels different is the speed at which Monzo can move. In gov land, everything needs a lengthy business case, there are mounds of red tape, and there's always a committee for a committee for a committee involved somewhere.
At Monzo, anyone has the genuine autonomy to pitch an idea, get feedback, and run experiments to find out very quickly whether something is worth investing in. And if we find out something hasn't worked, that's a great result for us because we know it's not worth our time and effort. That's not to say there aren't checks and balances in place, but the difference is that you're working with the system, with the regulations, not against red tape that exists for the sake of it.
Oh my goodness. What amazing questions! Ok here goes…
We have a central catalogue of all the components that power the bank. Think of it like a big long list of all our microservices, web apps, and cron jobs etc. We have an internal developer portal that helps people search that list and find information associated with each component. That could be docs about what the component is, or the microservices upstream or downstream from it.
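To make the catalogue idea a bit more concrete, here is a minimal sketch of a searchable component list with upstream and downstream lookups. It is an illustration only and not Monzo's actual portal: the `Component` and `Catalogue` classes, the component names, and the team names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    """One catalogue entry (hypothetical shape, for illustration only)."""
    name: str
    kind: str                  # e.g. "microservice", "web app", "cron job"
    owner_team: str
    description: str
    calls: list[str] = field(default_factory=list)  # downstream components


class Catalogue:
    """A tiny in-memory catalogue keyed by component name."""

    def __init__(self, components: list[Component]) -> None:
        self._by_name = {c.name: c for c in components}

    def search(self, term: str) -> list[str]:
        """Names of components whose name or description mentions the term."""
        term = term.lower()
        return sorted(
            c.name
            for c in self._by_name.values()
            if term in c.name.lower() or term in c.description.lower()
        )

    def downstream(self, name: str) -> list[str]:
        """Components that the named component calls."""
        return sorted(self._by_name[name].calls)

    def upstream(self, name: str) -> list[str]:
        """Components that call the named component."""
        return sorted(c.name for c in self._by_name.values() if name in c.calls)


# Hypothetical entries; none of these names come from the original answer.
catalogue = Catalogue([
    Component("service.transactions", "microservice", "team-payments",
              "Lists a customer's transactions", calls=["service.ledger"]),
    Component("service.ledger", "microservice", "team-ledger",
              "Records balance movements"),
    Component("cron.statement-export", "cron job", "team-payments",
              "Builds monthly statements", calls=["service.ledger"]),
])

print(catalogue.upstream("service.ledger"))
```

Storing only the outgoing calls and deriving the upstream view from them keeps each entry small, at the cost of scanning the whole list on every upstream lookup.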
The source files for that component documentation live in GitHub alongside the code and we use the same processes for reviewing documentation changes as we do for reviewing code changes. I'm a big big fan of "docs as code" for many reasons, but the review flow is a big part of that.
Everyone at the company has access to the internal developer portal. We also have a central knowledge base (kind of like a company wiki) that the rest of the business also uses.
If you mean service RPC definitions where those services interact with other things then yep! We have strongly defined types. In some places and for the most critical services, we have more detailed API docs, but we need to improve how we handle these and make sure we always provide more contextual information beyond a name and data type.
So with so many components this gets a little tricky and we have to take a pragmatic approach to documentation. You're never going to be able to comprehensively document every single little thing and keep it all up to date, and that's ok. Our time is often spent better elsewhere. However, each component will have at minimum a very short description of the "why" in its README and some links to related documentation. I'd consider that part of our "minimum viable docs" criteria.
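A check along these lines could be automated. The sketch below assumes, purely for illustration, that the "why" lives under a Markdown heading starting with "Why" and that related docs appear as Markdown links; neither convention is confirmed by the answer above, and `meets_minimum_viable_docs` is a hypothetical helper.

```python
import re


def meets_minimum_viable_docs(readme: str) -> bool:
    """Return True if a README has a 'why' heading and at least one link.

    Assumptions (not from the original answer): the README is Markdown,
    the 'why' sits under a heading beginning with 'Why', and related
    documentation is referenced with [text](target) links.
    """
    has_why = re.search(
        r"^#+\s*why\b", readme, flags=re.IGNORECASE | re.MULTILINE
    ) is not None
    has_link = re.search(r"\[[^\]]+\]\([^)]+\)", readme) is not None
    return has_why and has_link


example_readme = (
    "# service.ledger\n\n"
    "## Why this exists\n"
    "Records balance movements.\n\n"
    "See [the proposal](docs/proposal.md).\n"
)
print(meets_minimum_viable_docs(example_readme))
```

A check like this only confirms that the sections exist. Whether the "why" is actually useful still needs a human reviewer.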
Each component is owned by a team rather than an individual. They are responsible for the documentation as well as the code and everything else around the running of that component or collection of components. If someone leaves, ownership of the docs remains with the team.
Minimising the risk of people leaving with critical information and institutional knowledge is really hard and sort of why my job exists in the first place! We encourage teams to document throughout the software development cycle, contribute regularly to talks and knowledge-sharing sessions etc. If someone does leave with some important knowledge, we try to make as much time as possible for them to document before they go. And if they're replaced, the new Monzonaut is the perfect litmus test for those docs and to find any knowledge gaps we still need to fill.
Yep! We usually link to the original RFCs (what we call "proposals" instead) as part of the docs so folks can see the original justification or design behind the system or component. That's really great, but difficult if you want to understand to what extent the original proposal was implemented and where it differs or why it differs. That additional knowledge is sometimes missing and one of the gaps we want to focus on improving this year.
Solid question. I am loving these! So you're absolutely right that you can go very deep on certain topics, particularly on a service by service level. But often, our engineers aren't looking for service-specific information. They typically have some kind of task-based question they want the answer to. In those cases, we need more horizontal documentation that cuts across different parts of our architecture like you suggest. We tackle these in a few ways, by doing things like:
Creating system-level documentation that covers multiple components working together on one thing, like a particular feature
Providing entry guides on overarching topics that cross whole systems, like our coding conventions, or how to deploy a change
Those docs are kind of like "landing pages" for the topic. They give an overview of the concepts needed to dive in then act as sign posts to the most relevant content. They tend to live in our central company-wide knowledge system as they're often useful for everyone in the business.
We do auto-generate some of our docs, though not as much as I would like. We'll be exploring more automation in future. In some cases we do auto-generate diagrams too, which is always good to see. There'll always be a need for human-generated docs though, often to provide the missing "why" behind an implementation or flow.
Not as much as we should! New joiners are given a tour through our different knowledge systems and given guidance, but I ran some recent user research that showed this isn't enough. One of the projects I'm involved in at the moment is overhauling our entire engineering onboarding experience. A big part of that will include introducing our conventions and being given the chance to contribute to our docs much earlier. I also run ad hoc training with individual teams or individuals.
From our research, usually by searching our internal developer portal, company-wide knowledge system, or grepping a bunch of files. Findability is really gnarly so we're always looking for ways to make it easier. Sign-posting to relevant docs and deleting/archiving old information is sometimes the most effective way to keep on top of this. I'm sure our engineers are very tired of hearing me witter on about the power of the delete button.
We have a marvellous writing team who maintain our "Tone of Voice", which sets expectations for any kind of writing at Monzo. A big part of that is making sure we use plain language and avoid unintuitive terms. You'll often see feedback on pull requests or in proposals suggesting name changes or checking if the user of the thing will be comfortable with the terms.
We don't maintain a central glossary. I'm sure there will be knowledge management people reading this and flinching at that. While the motivation behind them is good, glossaries put the onus on the reader to go hunting for info. In our view, it should be down to the writer to make it easier to be understood. If you rely on a glossary within a team, it's a sign your terms maybe aren't common enough to be understood. And if lots of people are struggling to understand your terms, it's a sign you might want to redesign or rename something, or provide better contextual information to help them understand. Also maintaining glossaries is a nightmare and not how anyone wants to spend a Tuesday.
This might be my favourite question of them all. For our git-backed docs, documentation pull requests get the same scrutiny as code changes. We treat them exactly the same.
You're absolutely right that half the battle is getting out of your own head to think about what your reader needs. That's the old "curse of knowledge" that's so hard to shake. Documentation reviews help with that, but we also do user research internally to check what people need to know about a thing. During our documentation training we always challenge writers to think about what skills, experience, knowledge, or access someone needs in order to understand and work with the thing we're documenting. Taking just a few minutes to think about that before putting pen to paper (or hands to keyboard) is really impactful.
We've also introduced the start of some automated quality assessments. We have a system called "software excellence" that looks at our components and assesses them against certain criteria. It generates a rudimentary "grade" to let teams know how their components are doing and help them figure out where to spend their time improving existing things. Part of that "grade" is calculated using some very rudimentary documentation measures. For example, how recently the docs were updated, and whether our prose linter has flagged any readability issues. It's by no means perfect, but is extremely useful in prompting conversations about what makes good documentation and where documentation improvements are needed. I've got some big plans to make this more useful for engineers so will pop onto the blog when we've improved this.
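As a rough illustration of how two such measures might roll up into a grade, here is a minimal sketch. It is not how "software excellence" actually works: the `docs_grade` function, the 180-day staleness threshold, and the three grade labels are all assumptions made up for this example.

```python
from datetime import date


def docs_grade(last_updated: date, lint_issues: int, today: date,
               stale_after_days: int = 180) -> str:
    """Combine docs freshness and prose-linter findings into a rough grade.

    Hypothetical scoring: start at 2, lose one point if the docs are older
    than `stale_after_days`, and lose one point if the linter flagged
    anything. The threshold and labels are illustrative assumptions.
    """
    score = 2
    if (today - last_updated).days > stale_after_days:
        score -= 1  # docs have not been touched recently
    if lint_issues > 0:
        score -= 1  # prose linter reported readability issues
    return {2: "good", 1: "needs attention", 0: "poor"}[score]


print(docs_grade(date(2024, 1, 10), lint_issues=0, today=date(2024, 3, 1)))
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the grade reproducible and easy to test.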
I was really trying to avoid the answer of "it depends" but it does depend! Broadly speaking it's likely we'll use a mix of documentation artefacts or knowledge-sharing practices for a piece of work. That could be:
A proposal outlining the problem we're trying to solve, a suggested implementation with the measurements we'll use to figure out if it's worked, and an assessment of the risks and possible mitigations we need to consider
An approval or discussion in an architecture review meeting
A series of decision records as the project progresses
Skills = writing good business cases/proposals. Being able to articulate a problem and a proposed solution in a way that is meaningful to someone who may be removed from the detail is so important.
Knowledge = the inner workings of weird legacy software. Being able to interrogate the trade-offs engineers have to make on a daily basis.
Experience = Probably project managing big content projects, like migrating to new tools, and learning how to facilitate good training.
In the civil service it sometimes felt like running a series of sprints wearing shoes that were too small while people threw projectiles at you. At Monzo, we've got to run the same distance, but the shoes fit and your supporters are actually cheering you on. Mentally that makes such a difference.
I explained some of this in a different answer, but the biggest change for me is not having to constantly defend my profession and value. At Monzo, it's a given that good knowledge-sharing is beneficial for the organisation. A lot of the improvements good knowledge practices bring are intangible. They don't always fit neatly on a graph trending up or down. It takes senior leaders that believe in the value of the marathon and understand that it's going to take time. We have significant documentation and knowledge debt to pay down while introducing new practices to keep up with the sheer pace the rest of the company is going. I fully credit our engineering leadership for always championing that work. It's so refreshing to move beyond the "Why should we bother?" question and get straight to "Ok, so how are we going to improve this?"
I never came across Ocelot in my time in gov, but I'm sure the support teams' needs are very similar.
My time was mostly spent helping other engineers across the civil service understand how to use the tools and platforms that GDS built. And in a meta way, also how to handle their own documentation. We ended up building our own tech stack for docs using a "docs as code" approach. Here's an old blog post I wrote about it a few years ago.
At Monzo we have an off-the-shelf product for COps knowledge that only contains information relevant for COps. It makes it a lot easier for them to find the information they need as it's designed with them in mind from the very start.
I was going to ask how you make Monzonauts (with emphasis on those that don't work in technical operations) actually read the documentation, but this is answered - in part at least - in a reply further down.
My own personal experience with this kind of thing: where I am at the moment, I've built pages in Confluence that cover specific topics that come up repeatedly - Slack, tickets, whatever. I spend ages building the pages. I include screenshots. I keep the language simple and with as little technical jargon as possible. I cover different situations and how to deal with them. But then, on Slack usually, I'm bombarded with the same questions over and over again despite everything that's been done to promote the documentation.