63: Mistakes to avoid - Machine Learning in Healthcare
I’ve worked and consulted on my fair share of projects applying machine learning to healthcare over the last few years. So this week I thought I’d share five common mistakes that I have made, or have seen groups making:
1️⃣ Trying to answer the wrong question
AI is great for analysing complex inputs and for personalising outputs to an individual.
It's less helpful when, for example:
Interpretation of the model is important
Sensitive decision-making is involved (such as withdrawing life support)
It’s important to find the right technology for the problem - not the other way around.
2️⃣ Not having the right data
Do you have enough data? (Answer: possibly. But you should still try to get more.)
How is the data labelled? A biopsy result beats the consensus of many doctors, which beats one doctor’s interpretation. Bad ground truth = bad model.
The data should cover the entire domain of intended use. This means different demographics, geographical sites and a variety of presentations.
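As a rough illustration of the coverage point, here is a minimal sketch of a check you could run before training. It assumes each record is a dict with a subgroup field such as `site` or `age_band`; the function names, field names and the `min_count` threshold are all made up for the example, and the right threshold depends on your problem.

```python
from collections import Counter


def coverage_report(records, field, expected_values, min_count=50):
    """Count records per expected subgroup and mark whether each has enough.

    `min_count` is an arbitrary illustrative threshold, not a recommendation.
    """
    counts = Counter(record[field] for record in records)
    return {
        value: {
            "count": counts.get(value, 0),
            "ok": counts.get(value, 0) >= min_count,
        }
        for value in expected_values
    }


def underrepresented(records, field, expected_values, min_count=50):
    """List the expected subgroups that are missing or below `min_count`."""
    report = coverage_report(records, field, expected_values, min_count)
    return [value for value, info in report.items() if not info["ok"]]


# Toy data: site A is well covered, site B is thin, site C is missing entirely.
records = [{"site": "A"}] * 60 + [{"site": "B"}] * 10
print(underrepresented(records, "site", ["A", "B", "C"]))
```

Listing the expected subgroups explicitly matters here: a site or demographic with zero records would never show up if you only counted what is already in the data.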
3️⃣ Involving ML scientists too late
ML expertise is needed to build the model - but that shouldn't be where it first comes in. Consult someone who understands data early. Early advice can change the path of a project for the better.
You can get ML expertise from local hospitals and research institutions or - if needed - through collaboration with a commercial organisation.
4️⃣ Not involving doctors
You need clinicians to frame the clinical question and to help collect and annotate the data.
You can build a sophisticated ML model, but it needs to make sense clinically and fit into existing workflows.
5️⃣ Not planning how you'll monitor the algorithms' performance
An AI model that performs well when you deploy it isn't guaranteed to stay that way.
You need a way to detect when its performance changes - and an action plan to respond if and when it does.
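One simple way to picture this is a rolling check of recent predictions against confirmed outcomes. The sketch below assumes you eventually get a ground-truth label for each prediction; the class name, window size and allowed drop are hypothetical choices for illustration, and accuracy is only a stand-in for whatever metric matters clinically.

```python
from collections import deque


class PerformanceMonitor:
    """Track accuracy over the most recent predictions and flag large drops."""

    def __init__(self, baseline_accuracy, window_size=200, max_drop=0.05):
        self.baseline = baseline_accuracy  # accuracy measured at deployment
        self.max_drop = max_drop           # tolerated drop before flagging
        self.outcomes = deque(maxlen=window_size)  # oldest results fall off

    def record(self, prediction, truth):
        """Store whether one prediction matched its confirmed outcome."""
        self.outcomes.append(prediction == truth)

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self):
        """True once the window is full and accuracy fell below the tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough data yet to judge
        return self.rolling_accuracy() < self.baseline - self.max_drop


# Toy usage: a full window of 10 results, half of them wrong.
monitor = PerformanceMonitor(baseline_accuracy=0.9, window_size=10)
for i in range(10):
    monitor.record(prediction=1, truth=1 if i % 2 == 0 else 0)
print(monitor.rolling_accuracy(), monitor.needs_review())
```

The flag is only the detection half. What happens when `needs_review()` returns true (who gets notified, whether the model is paused) is the action plan, and that needs deciding before deployment.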
💬 What have I missed?
Any other common mistakes you’ve made, or seen others make?
This week’s email started life as a tweet thread 🧵.

This week I shared my book summary for The Personalized Diet by Eran Segal and Eran Elinav. I read this while I was working for ZOE on a project to predict blood glucose levels with ML.
It turns out response to different foods is highly individualised. We haven’t really been able to study this stuff until recently, so nutrition has a pretty bad rep in science circles - but now we can, and it’s fascinating.
I’m going to be doing a continuous glucose monitoring experiment in the coming months - will let you know how it goes.
You can read the book summary here.
That’s everything - have a great week!
Chris
About Me
Hi! I’m Chris Lovejoy, a Junior Doctor and Data Scientist based in London.
I’m on a mission to improve healthcare through technology (particularly AI / machine learning), and share what I learn along the way.
In this weekly newsletter, I share my top thoughts and learnings from each week, as well as links to the best things on the internet that I come across.