Tuesday, December 20, 2016

Cheap wins from data in healthcare

There have been many calls for a 'data revolution', or even a 'Big Data revolution', in healthcare. Ever since the completion of the Human Genome Project, there has been an assumption that we will be able to tailor individual treatments based on data from an individual's DNA. Meanwhile, others dream of mining the masses of routinely collected clinical data to determine which treatments work, and for whom. As individuals we are encouraged to record our health metrics using smartphones to optimise our lifestyles for better health.

Each of these aspects of data-driven healthcare has promise, but also problems. It is very difficult to reliably associate a disease or drug efficacy with a small number of testable gene alleles, and very easy to identify false positive gene associations. It is likewise difficult to make reliable inferences about cause and effect from routinely collected data, because treatments are not randomly assigned to patients; sophisticated analytics do not stop you needing to think about how your data was collected. And lifestyle optimisation via smartphones probably owes more to Silicon Valley's ideal of the hyper-optimised individual, and a corporate desire for ever more personal data, than to any real health benefit beyond an increased motivation to exercise.

However, there are easy wins to be had from data. These lie in predicting future events that involve no medical intervention. It is difficult to predict how a drug will affect a patient, because you must infer the drug's effect against a background of other potential causes. But it is much easier to tell whether a patient arriving at hospital for a specific operation will need to stay overnight: simply look at whether similar patients undergoing similar operations have done so. If this sounds exceptionally simple, that's because it is. However, the gains could be great. Hospitals routinely have to keep expensive beds available to deal with emergencies, or cancel planned operations to deal with unexpected bed shortages. A system that estimates the length of a patient's stay after an operation with reasonable accuracy would reduce the need for these expensive, time-consuming and inconvenient measures. Staff on the ground already have a good sense of which patients will need to stay longer than others, but in the maelstrom of an NHS hospital, anything that helps to systematise and automate the making and use of these estimates will reduce pressure on staff.
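The "similar patients" lookup really is as simple as it sounds, and can be sketched in a handful of lines. This is purely an illustration: the procedures, age bands and stay lengths below are invented, and real hospital records are far messier.

```python
from statistics import median

# Invented historical records: (procedure, age band, nights stayed).
history = [
    ("hip replacement", "70+", 4), ("hip replacement", "70+", 5),
    ("hip replacement", "70+", 4), ("hip replacement", "50-69", 3),
    ("cataract", "70+", 0), ("cataract", "70+", 1),
    ("cataract", "50-69", 0),
]

def predict_stay(procedure, age_band):
    """Predict nights in hospital as the median stay of similar past patients."""
    similar = [nights for p, a, nights in history
               if p == procedure and a == age_band]
    if not similar:  # no similar patients seen before: fall back to overall median
        return median(n for _, _, n in history)
    return median(similar)

print(predict_stay("hip replacement", "70+"))  # 4
print(predict_stay("cataract", "50-69"))       # 0
```

A planner summing these per-patient predictions over tomorrow's operating list gets an expected bed demand, which is all that is needed here.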

Exploring this possibility, we analysed data the NHS routinely collects on patients and procedures, such as age, year, day of the week and surgery duration (see figure below), and used this to predict stay duration. Our results showed that a substantial portion of the variability in stay duration could be predicted from these data, which would translate into a significant saving for the NHS if applied generally and combined with the estimates of stay that experts on the ground already make from past experience. Importantly, we are not suggesting any intervention on the individual as a result of this analysis. For instance, we make no judgement on whether the variation by day indicates anything important about treatment, only that it helps planners to know what's likely to come up next. This work is not about whether the NHS should operate a full weekend service!
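To give a flavour of what "a substantial portion of the variability" means, here is a synthetic sketch. This is not the model from our paper: the predictors kept here (surgery duration and weekend admission), the effect sizes and the noise levels are all invented. The point is only how one measures the variance explained (R squared) by a simple predictor.

```python
import random
from statistics import mean, pvariance

random.seed(0)

# Invented patients: stay driven by surgery duration and weekend admission.
patients = []
for _ in range(1000):
    duration = random.uniform(0.5, 6.0)              # hours in surgery
    weekend = random.random() < 2 / 7                # admitted on a weekend?
    stay = 0.8 * duration + (1.5 if weekend else 0.0) + random.gauss(0, 1)
    patients.append((duration, weekend, max(stay, 0.0)))

# Deliberately crude model: mean stay within coarse duration/weekend bins.
bins = {}
for duration, weekend, stay in patients:
    bins.setdefault((round(duration), weekend), []).append(stay)
model = {key: mean(stays) for key, stays in bins.items()}

# R^2: fraction of the variance in stay explained by the binned predictions.
stays = [s for _, _, s in patients]
preds = [model[(round(d), w)] for d, w, _ in patients]
ss_res = sum((s - p) ** 2 for s, p in zip(stays, preds))
r2 = 1 - ss_res / (pvariance(stays) * len(stays))
print(f"R^2 = {r2:.2f}")  # a substantial fraction of variance explained
```

Even this crude binning recovers most of the predictable variation in the simulation; the residual is the irreducible patient-to-patient noise.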

Figure: the variation in predicted stay duration for four possible indicators. The black line indicates the median prediction; the grey region is a 95% confidence interval. From Mann et al. (2016), Frontiers in Public Health.

As with numerical weather forecasts, we envisage this supplementing and supporting existing human expert judgement rather than replacing it - there are clearly facets of the patient that we cannot capture in a simple data analysis. This is a minimal-cost use of existing data, with few or no complicating causal issues, that could save the NHS money on a daily basis. The size of the NHS means that small gains are amplified on a national scale, while NHS data provides an enormous potential resource. It may be in these unglamorous aspects of healthcare provision that data analytics has immediate potential.

Monday, December 5, 2016

Machine-learning doesn't give you a free pass

A few weeks ago I read this paper on arXiv, purporting to use machine-learning techniques to determine criminality from facial features. The paper uses ID photos of "criminals and non-criminals" and infers quantifiable facial structures that separate the two classes. I had a lot of issues with it, and was annoyed if not surprised when the media got excited by it. Last week I also saw this excellent review of the paper, which echoes many of my own concerns, and in the spirit of shamelessly jumping on the bandwagon I thought I'd add my two cents.

As someone who has dabbled in criminology research, I was pretty disturbed by the paper from an ethical standpoint. I think this subject, even if it is declared fair game for research, ought to be approached with the utmost caution. The findings simply appeal too strongly to some of our baser instincts, and to historically dangerous ideas, to be treated casually. The sparsity of information about the data is troubling, and I personally find the idea of publishing photos of "criminals and non-criminals" in a freely-available academic paper to be extremely unsettling (I'm not going to reproduce them here). The paper contains no information on any ethical procedures followed.

Aside from these issues, I was also disappointed from a statistical perspective, and in a way that is becoming increasingly common in applications of machine-learning. The authors of this paper appear not to have considered any possible issues with the causality of what they are inferring. I have no reason to doubt that the facial patterns they found in the "criminal" photos are distinct in some way from those in the "non-criminal" set. That is, I believe they can, given a photo, with some accuracy predict which set it belongs to. However, they give no consideration to any possible causal explanation for why these individuals ended up in these two sets, beyond the implied idea that some individuals are simply born to be criminals and have faces to match.

Is it not possible, for example, that those involved in law enforcement are biased against individuals who look a certain way? Of course it is. It's not as if there isn't research on exactly this question. Imagine what would happen if you conducted this research in western societies: do you doubt that the distinctive facial features of minority communities would be inferred as criminal, simply because of well-documented police and judicial bias against these individuals? In fact, you need not imagine; this already happens: machine-learning software analyses prisoners' risk of reoffending, and entirely unsurprisingly attributes higher risk to black offenders, even though race is not explicitly included as a factor.
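The proxy problem is easy to demonstrate. In the invented sketch below (every rate and variable name is made up), two groups have identical true reoffending rates, but one group lives disproportionately in heavily policed postcodes, so its reoffending is recorded more often. A model trained on the recorded labels then scores that group higher, even though group membership appears nowhere as a feature.

```python
import random

random.seed(1)

def person(group):
    # Group correlates with postcode, but true reoffending does NOT differ:
    # both groups share an identical 30% base rate.
    postcode_deprived = random.random() < (0.7 if group == "B" else 0.2)
    reoffends = random.random() < 0.3
    return group, postcode_deprived, reoffends

people = [person(random.choice("AB")) for _ in range(20000)]

def recorded(postcode_deprived, reoffends):
    # Heavier policing in deprived postcodes: reoffending there is
    # recorded far more often. The model only ever sees these labels.
    catch = 0.9 if postcode_deprived else 0.4
    return reoffends and random.random() < catch

labels = [(g, d, recorded(d, r)) for g, d, r in people]

# 'Model': predicted risk = recorded reoffending rate for your postcode type.
def rate(rows):
    return sum(1 for row in rows if row[2]) / len(rows)

risk = {d: rate([row for row in labels if row[1] == d]) for d in (True, False)}

avg = {}
for g in "AB":
    scores = [risk[d] for grp, d, _ in labels if grp == g]
    avg[g] = sum(scores) / len(scores)
    print(g, round(avg[g], 3))
# Group B receives a higher average risk score despite identical true rates.
```

Race never enters the model; the postcode proxy carries the bias in for it.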

If this subject matter were less troublesome, I would support the publication of such results, as long as the authors presented the findings as suggesting avenues for future, more carefully controlled studies. However, in this case the authors resolutely do not take this approach. Instead, they conclude that their work definitively demonstrates the link between criminality and facial features:
"We are the first to study automated face-induced inference
on criminality. By extensive experiments and vigorous
cross validations, we have demonstrated that via supervised
machine learning, data-driven face classifiers are able
to make reliable inference on criminality. Furthermore, we
have discovered that a law of normality for faces of noncriminals.
After controlled for race, gender and age, the
general law-biding public have facial appearances that vary
in a significantly lesser degree than criminals."

This paper remains unreviewed, and let us hope it does not get a stamp of approval from a reputable journal. However, it highlights a problem with the recent fascination with machine-learning methods. Partly because of the apparent sophistication of these methods, and partly because many in the field are originally computer scientists, physicists or engineers rather than statisticians, there has been a reluctance to engage with statistical rigour and questions of causality. With many researchers hoping to be picked up by Google, Facebook or Amazon, the focus has been on predictive accuracy, and on computational efficiency in the face of overwhelming data. Some have even declared that the scientific method is dead now that we have Big Data. As Katherine Bailey has said: "Being proficient in the use of machine learning algorithms such as neural networks, a skill that's in such incredibly high demand these days, must feel to some people almost god-like".

This is dangerous nonsense, as the claim to infer criminality from facial features shows. It is true that Big Data gives us many new opportunities. In some cases, accurate prediction is all we need, and as we have argued in a recent paper, prediction is easy, cheap and unproblematic compared to causal inference. Where simple predictions can help, we should go ahead. We absolutely should be bringing the methods and insights of machine-learning into the mainstream of statistics (this is a large part of what I try to do in my research). Neil Lawrence has said that neural networks are "punk statistics", and by God statistics could do with a few punks! But we should not pretend that simply having a more sophisticated model and a huge data set absolves us of the statistical problems that have plagued analysts for centuries when testing scientific theories. Our models must be designed precisely to account for possible confounding factors, and we still need controlled studies to carefully assess causality. As computer scientists should know: garbage in, garbage out.
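The gap between prediction and causation can be made concrete with a toy simulation (everything below is invented for illustration). A confounder, here "illness severity", drives both a biomarker and recovery time. The biomarker is then an excellent predictor of recovery, yet setting it directly, as a randomised experiment would, changes nothing.

```python
import random

random.seed(2)

def observe():
    # Severity (the confounder) causes BOTH the biomarker and recovery time;
    # the biomarker itself plays no causal role in recovery.
    severity = random.gauss(0, 1)
    biomarker = severity + random.gauss(0, 0.3)
    recovery = 10 + 3 * severity + random.gauss(0, 0.5)
    return biomarker, recovery

data = [observe() for _ in range(5000)]
xs, ys = zip(*data)

def corr(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

obs_corr = corr(xs, ys)
print(f"observational correlation: {obs_corr:.2f}")  # strong

# Now intervene: fix the biomarker ourselves, as a controlled trial would.
def intervene(forced_biomarker):
    severity = random.gauss(0, 1)          # severity is untouched, so...
    return 10 + 3 * severity + random.gauss(0, 0.5)  # ...recovery is too

low = [intervene(-2.0) for _ in range(5000)]
high = [intervene(+2.0) for _ in range(5000)]
effect = sum(high) / len(high) - sum(low) / len(low)
print(f"effect of setting the biomarker: {effect:.2f}")  # close to zero
```

A model trained only on the observational data would predict recovery superbly and be useless for deciding whether to manipulate the biomarker, which is exactly the trap the criminality paper falls into.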

This is not a plea for researchers to 'stay in their lane'. I think criminology and statistics both need fresh ideas, and many of the smartest people I know work in machine-learning. We should all be looking for new areas to apply our ideas in. But working in a new field comes with some responsibility to learn the basic issues in that area. Almost everyone in biology or social science has a story about a physicist who thought they could solve every problem in a new field with a few simple equations, and I don't want data scientists to do the same thing. I fear that if modern data science had been invented before the discovery of the Theory of Gravity, we would now have computers capable of insanely accurate predictions of ballistics and planetary motions, and absolutely no idea how any of it really worked.