The human hardware bias is hindering AI adoption in the health industry
By Rodrigo Vargas
I’ve recently been involved in drafting a proposal for a project tender regarding the use of AI in the diagnosis of prostate cancer. The incoming data will consist of ultrasound images. Now, medical equipment is not an area of expertise of mine, so in the process I learnt a couple of things about the industry, and I came out quite frustrated. The reason is that external limitations turned out to be imposed on the set of admissible AI solutions, artificially restricting the search space and thus predictably forcing us to settle on a sub-optimal solution. These restrictions result from what one might call the human hardware bias.
What do I mean by that? Well, the relevant human hardware in this context would be our eyes. Of course, our cognitive world is shaped by our senses, and chief among them is vision. It is thus quite natural for us to want to see things in order to assess, diagnose and act. But human vision can be extremely inadequate for medical purposes (we evolved to see our fellow living beings, whose bodies are opaque to visible light, so one has to open them up in order to see inside). Medical diagnostic technology developed in response, in a context where cognitive tasks were performed exclusively by humans; the industry therefore focused on non-invasive equipment meant to produce images, such as ultrasound devices.
How does that affect us today? Well, ultrasound devices output images, not sound, even though information is lost in passing from sound to images. But why should an AI perform better on an imperfect visual representation of the ultrasound data than on the ultrasound data itself? The fact that we cannot hear the ultrasound, let alone understand it without external tools, should be entirely irrelevant. But alas, it is not, for two reasons:
- As already said, ultrasound devices output images. This is to be taken literally: you cannot access the ultrasound data without opening up and modifying the device, thus voiding its certification (as of this writing, there seems to be only one commercially available device that can output raw data; it is fully programmable, meant for research purposes, and not to be found in typical hospitals). That is obviously a deal-breaker in medical contexts.
- People want you to base your diagnosis on images. Yes, absurd as that is at the dawn of the AI era. Because we are used to diagnosing with images, the requirement that diagnostic software take images as its input data has made it into the very call for tenders. Now, asking for an output image for explainability purposes is quite reasonable; restricting the means of producing those images to artificially reconstructed visual data, on the contrary, is not. Who says a tumor cannot produce an acoustic signature that is discarded or dampened when passing through the standard, ray-based image reconstruction algorithms? At the very least, that warrants some unbiased exploration.
But we need not even call the image bias into question to see that there is a problem here: ray-based image reconstruction algorithms are far from perfect. Recently, Daniela Theis and Ernesto Bonomi of CRS4 developed an alternative based on seismic prospecting techniques that significantly outperforms ray-based results. One should not be at all surprised if AI can help improve those reconstruction techniques further (for instance, by directly fitting medium parameters in the wave equation via backpropagation through PDE solvers, as can already be done using tools from Julia’s scientific machine learning ecosystem). But none of that can be used right now for prostate cancer diagnosis, because ultrasound devices output (poor-quality, ray-based reconstructed) images, not sound. It will take several years before Theis and Bonomi’s algorithm makes it into the market of (certified) ultrasound devices, and even that would not help if AI can in turn outperform it. Ultrasound devices should output sound.
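To make that parenthetical remark concrete, here is a minimal sketch of the idea. The article points to Julia’s scientific machine learning ecosystem; the toy below uses JAX instead, and everything in it (the 1D setting, the grid sizes, the source and receiver positions, the step size) is an illustrative assumption rather than anyone’s actual pipeline. It fits a spatially varying wave speed c(x) in the wave equation directly from a recorded waveform, by gradient descent through a differentiable finite-difference solver:

```python
import jax
import jax.numpy as jnp

nx, nt, dx, dt = 128, 400, 1.0, 0.2  # toy grid; the CFL number c*dt/dx stays well below 1

def simulate(c, source):
    """Leapfrog stepping of u_tt = c(x)^2 u_xx (periodic boundaries, for simplicity);
    returns the waveform recorded at a single receiver point."""
    def step(carry, s):
        u_prev, u = carry
        lap = (jnp.roll(u, -1) - 2.0 * u + jnp.roll(u, 1)) / dx**2
        u_next = 2.0 * u - u_prev + dt**2 * (c**2 * lap)
        u_next = u_next.at[nx // 4].add(dt**2 * s)   # inject the source
        return (u, u_next), u_next[3 * nx // 4]      # record at the receiver
    _, trace = jax.lax.scan(step, (jnp.zeros(nx), jnp.zeros(nx)), source)
    return trace

# Synthetic "ground truth": a lesion-like bump in the wave speed.
x = jnp.arange(nx)
c_true = 1.0 + 0.3 * jnp.exp(-((x - 64.0) ** 2) / 50.0)
t = jnp.arange(nt) * dt
source = jnp.sin(2.0 * jnp.pi * t / 5.0) * jnp.exp(-t / 10.0)  # damped pulse
observed = simulate(c_true, source)  # stands in for the raw acoustic data

def loss(c):
    return jnp.mean((simulate(c, source) - observed) ** 2)

# Backpropagation through the PDE solver: plain gradient descent on c(x).
c = jnp.ones(nx)
value_and_grad = jax.jit(jax.value_and_grad(loss))
for i in range(200):
    val, g = value_and_grad(c)
    c = c - 50.0 * g  # step size picked by hand for this toy problem

# After the loop, c approximates c_true: the fitted medium itself is the
# "image", with no ray-based reconstruction step in between.
```

A real setting would of course need 2D/3D solvers, absorbing boundaries and regularization, but the structure is the same: the optimizer works directly on the physics of the raw signal, which is exactly what becomes impossible when the device only ships reconstructed images.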
In the AI era, raw data must be directly available, because processed data carries traces of the biases, assumptions and limitations inherent in our cognitive view of the world. As we have just seen, the lack of raw data availability is already hindering progress in the health industry, and the same phenomenon may well be occurring in other sectors. In order to identify such situations, it helps to keep the human hardware bias in mind: machines will not necessarily work best with the same input as we do.
Originally published at vargonis.medium.com on December 5, 2020.