Are people redacting financial data before using AI tools?

Hi all,

I’ve noticed more people (myself included) using tools like ChatGPT to help draft complaints, understand transactions, summarise statements, or sense-check financial emails.

One thing I’ve been thinking about is data redaction before sending anything to public LLMs.

For example:

  • Account numbers

  • Transaction references

  • Full names & addresses

  • Screenshots of balances

  • PDFs with embedded metadata

Even when platforms say they don’t train on your data, there’s still a risk of human error: copy/paste mistakes, screenshots shared accidentally, browser extensions logging content, etc.

I started looking into this more seriously after realising how easy it is to paste identifiable financial info without thinking.

Out of curiosity (and partly because I’m building something in this space — Questa-AI, focused on automatic redaction before LLM processing), I’ve been experimenting with ways to reduce that risk.

But I’m more interested in how others approach it:

  • Do you manually redact?

  • Use regex scripts?

  • Just rely on trust in the platform?

  • Avoid AI for financial content altogether?

Feels like this might become more relevant as AI usage becomes normalised in financial workflows.

Would be great to hear how others think about this.

Was this written by AI too?

I haven’t used LLMs for anything like what you mention yet, but I do use them for coding and system administration tasks. I redact domain names, usernames, passwords, keys, etc., as I don’t want them to be visible to the LLM owners (or included in any data sharing or selling).

Besides the security risk, that is also a major privacy risk.

One exception is Proton’s Lumo: I trust that it’s E2EE, so I don’t bother redacting stuff there. That said, Lumo is pretty basic and not as useful as other LLMs (yet?), so I don’t use it much anyway.
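For anyone wanting to script the kind of redaction described above (domain names, usernames, keys), here is a rough sketch in Python. It assumes you already know the exact strings you want hidden and pass them in yourself. The function names `pseudonymise` and `restore` and the `<REDACTED_n>` placeholder format are made up for illustration, not taken from any tool.

```python
def pseudonymise(text, secrets):
    """Swap each known sensitive string for a numbered placeholder.

    Returns the redacted text plus a placeholder -> original mapping,
    so the LLM's answer can be translated back locally afterwards.
    """
    mapping = {}
    # Longest first, so "db.example.com" is replaced before "example.com".
    # Ties are broken alphabetically to keep the numbering deterministic.
    ordered = sorted({s for s in secrets if s}, key=lambda s: (-len(s), s))
    for i, secret in enumerate(ordered, start=1):
        placeholder = f"<REDACTED_{i}>"
        mapping[placeholder] = secret
        text = text.replace(secret, placeholder)
    return text, mapping


def restore(text, mapping):
    """Put the original strings back into text that contains placeholders."""
    for placeholder, secret in mapping.items():
        text = text.replace(placeholder, secret)
    return text


if __name__ == "__main__":
    prompt = "ssh alice@db.example.com fails; example.com resolves"
    safe, mapping = pseudonymise(prompt, ["example.com", "db.example.com", "alice"])
    print(safe)
    print(restore(safe, mapping) == prompt)
```

This only catches strings you list explicitly, so anything you forget to list goes through untouched. It is probably safer to leave passwords and keys out of the prompt entirely than to rely on a placeholder swap.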

I don’t trust AI with my finances and wouldn’t want to risk dulling my alertness to errors by trusting it tbh.

If I were anonymising, though, I’d use regex filtering in a script, then do a manual pass for anything it might have missed due to weird formatting in documents.
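A minimal sketch of that two-step approach in Python, in case it helps anyone. The patterns are assumptions (UK-style sort codes, 8-digit account numbers, email addresses) and would need adjusting for your own bank's formats. The names `redact` and `lines_needing_review` are just illustrative.

```python
import re

# Assumed formats only; real statements vary by bank and country.
PATTERNS = {
    "SORT_CODE": re.compile(r"\b\d{2}-\d{2}-\d{2}\b"),
    "ACCOUNT_NUMBER": re.compile(r"\b\d{8}\b"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}

# Six or more digits, optionally split by spaces or hyphens: a crude
# signal that something number-like slipped past the patterns above.
LEFTOVER_DIGITS = re.compile(r"(?:\d[ -]?){6,}")


def redact(text):
    """Replace every match of each pattern with a [LABEL] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def lines_needing_review(redacted_text):
    """Return lines that still contain long digit runs, for the manual pass."""
    return [
        line for line in redacted_text.splitlines()
        if LEFTOVER_DIGITS.search(line)
    ]


if __name__ == "__main__":
    statement = (
        "Paid from 12345678 (sort code 12-34-56) to jane.doe@example.com\n"
        "Ref: 2024 0099 8812\n"
        "Thanks"
    )
    cleaned = redact(statement)
    print(cleaned)
    print(lines_needing_review(cleaned))
```

In the sample, the space-separated reference on the second line does not match any pattern, which is the "weird formatting" case. The leftover-digits check flags that line for a human to look at instead of silently letting it through. It will not catch names, addresses, or anything inside images, so the manual pass still matters.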
