Interesting read, thanks for sharing! You mentioned 30-50% accuracy for predicting the category of an inquiry using an RNN. You also say that RNNs require only “raw” data, which gives them an advantage over other ML models. I’m curious whether you had a chance to try, for example, Vowpal Wabbit on the multi-class classification problem you described, with exactly the same data and no feature generation? In addition, I wonder whether the order of events matters for labelling an inquiry. A bag-of-words or n-grams might do the job in VW. Neural networks require a lot of data for training and, as you said, are more often a black box, while “classic” ML can deal with much less data, which might matter considering Monzo is just getting started. But I appreciate your opening statement that you have a lot more ideas than you could implement! Maybe the Kaggle community could help?
Hi Dmitri, sorry for the late response. Unfortunately I’ve never tried Vowpal Wabbit, so I can’t tell how it would compare in terms of performance. But here are a couple of thoughts on BOW. In my view, order and relative time play a big role in modelling navigational patterns, because some events might be more or less relevant for a certain prediction given their relative age. Order is equally important because you may want to consider certain events only if they occur after certain other events, for example. Another advantage is that you can make the prediction one event at a time as events arrive. Imagine you needed to retrieve the latest 200 events for a user every time you ran a prediction for them. It would be quite expensive (cost/time/compute) to deal with so many events every time. Finally, it seems to be working fine even with our medium-sized data (180k users). I would be really curious to try a BOW model in this particular case just to have a benchmark, but I’m afraid I will not have the time.
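To illustrate the "one event at a time" point, here is a minimal sketch of keeping a single hidden state per user and folding each new event into it, so the full history never has to be re-read. The event names, the tiny fixed weights and the `UserStateStore` class are all made up for illustration; this is not Monzo's actual model.

```python
import math

# Hypothetical toy vocabulary of app events (an assumption, not from the post).
EVENTS = {"open_app": 0, "view_transaction": 1, "open_help": 2}
HIDDEN = 2

# Made-up fixed weights for illustration; a real model would learn these.
W_IN = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.4]]  # one row per event type
W_REC = [[0.9, 0.0], [0.2, 0.7]]               # hidden -> hidden


def step(state, event):
    """Fold one event into the hidden state (a plain Elman-style update)."""
    x = W_IN[EVENTS[event]]
    return [
        math.tanh(x[j] + sum(state[i] * W_REC[i][j] for i in range(HIDDEN)))
        for j in range(HIDDEN)
    ]


def run_full_history(events):
    """Baseline: recompute the state from scratch over the whole history."""
    state = [0.0] * HIDDEN
    for event in events:
        state = step(state, event)
    return state


class UserStateStore:
    """Keeps only the latest hidden state per user, updated as events arrive."""

    def __init__(self):
        self._states = {}

    def on_event(self, user_id, event):
        state = self._states.get(user_id, [0.0] * HIDDEN)
        self._states[user_id] = step(state, event)
        return self._states[user_id]
```

Each call to `on_event` costs one small update, and the stored state ends up identical to what `run_full_history` would compute over the same events.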
In my own experience, I’ve learned to test every hypothesis in ML, however unbelievable, even if it goes against my logic. In particular, I once played with logs of visited websites, trying to predict which user a given session (of 10 websites) belongs to. VW with BOW outperformed an LSTM NN, even though I still haven’t lost hope of proving that the order of the visited websites matters for user identification in this particular problem (because that’s logical, right?).
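To make the order question concrete, here is a small sketch of what a bag-of-words representation throws away and what n-grams keep. The site names are invented, and this only shows the feature counting, not VW itself.

```python
from collections import Counter


def bag_of_words(session):
    """Count how often each site appears; order is discarded."""
    return Counter(session)


def bigrams(session):
    """Count adjacent pairs of sites; local order is preserved."""
    return Counter(zip(session, session[1:]))


# Two hypothetical sessions with the same sites in reverse order.
session_a = ["news.example", "mail.example", "shop.example"]
session_b = ["shop.example", "mail.example", "news.example"]

same_bag = bag_of_words(session_a) == bag_of_words(session_b)
same_bigrams = bigrams(session_a) == bigrams(session_b)
```

Here `same_bag` is true while `same_bigrams` is false, so a pure BOW model cannot tell these two sessions apart, whereas bigram features can.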
I also didn’t quite understand your point about retrieving the latest 200 events every time you run a prediction. Why would we need 200 events? For example, here’s my latest experience: I noticed that TfL had double charged me. I clicked the first TfL transaction, went back, and clicked another TfL transaction. Confused, I clicked for help. A correct prediction would be a help article about TfL charges, which Monzo already has. I imagine we don’t need 200 events; we just need the few that happened during the past couple of minutes (or, if there have been none in the past few minutes, the last few available ones).
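The selection rule I have in mind could look roughly like this. The two-minute window, the fallback of three events and the `(timestamp, name)` tuple layout are assumptions for the sake of the sketch.

```python
from datetime import datetime, timedelta


def recent_events(events, now, window=timedelta(minutes=2), fallback=3):
    """Pick the events to feed into a prediction.

    events: list of (timestamp, name) tuples, sorted oldest first.
    Returns the events inside the time window, or the last `fallback`
    events if nothing happened within the window.
    """
    in_window = [e for e in events if now - e[0] <= window]
    return in_window if in_window else events[-fallback:]


# Hypothetical session resembling the TfL example above.
now = datetime(2017, 1, 1, 12, 0, 0)
events = [
    (now - timedelta(hours=3), "open_app"),
    (now - timedelta(seconds=90), "view_tfl_transaction"),
    (now - timedelta(seconds=30), "view_tfl_transaction"),
    (now - timedelta(seconds=5), "open_help"),
]
selected = [name for _, name in recent_events(events, now)]
```

With this input only the two TfL views and the help click are selected; the three-hour-old `open_app` is left out.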
Anyway, interesting discussion! Good luck!
Very interesting post.
It’s nice to see a bank being open with how they use data and analytics. It’s also extremely satisfying to know that my bank isn’t relying on cumbersome SAS code to identify fraud.
The fraud detection use case is an interesting one. Given that Monzo is relatively small, does that mean you have 100% verified labelled data? For the majority of companies, fraud can only be identified if investigated, and this investigation is generally rule based: this account had this activity, or this account shut after x minutes. If this is the case, then unsupervised learning is the only way to pattern match and identify potential new fraudulent activity. RNNs will only narrow this tunnel vision.
On doubling the work by writing a complicated data pipeline: write reusable transformation code in a language that is supported in, and can be deployed to, the production use case!
I’m really impressed with the amount of work you guys have got through as a team of only one, and now two(?). If you are ever hiring a remote data scientist, fire the job ad over here!
Hi Paul, thanks for your comments. Because we primarily deal with first-party fraud at the moment, we are quite certain when a given user is fraudulent: we usually receive a chargeback if, for example, somebody has topped up from a stolen debit card. This makes things easier. We also have a rules-based system in place, which produces quite a few false positives but is very useful as a second line of defence, in particular for obviously fraudulent cases.