It turns out ML algorithms need a dose of common sense and intuition when it comes to making good decisions, writes Zita Goldman
Nobel-laureate psychologist Daniel Kahneman has written a lot about the biases of the human mind. In his seminal book Thinking, Fast and Slow, he presents readers with a typology of heuristics – or shortcuts – that people tend to take when making quick decisions. For example, we often overestimate the frequency of a feature or the probability of an event if examples of it spring readily to mind. The oft-cited thought experiment in Kahneman’s book involves a woman named Linda, who is “deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.” When asked whether it was more probable that Linda was (a) a bank teller, or (b) a bank teller active in the feminist movement, most people chose (b) – even though the combination of two quite specific attributes can never be more probable than either one alone.
The list of cognitive biases is ever expanding, yet these biases are so intrinsic to our fast, intuitive and effortless cognitive system that, says Kahneman, the only way to keep them under control is to monitor them with slow, logical, energy-intensive processes and correct any glaring anomalies. Kahneman found that our minds run on two separate systems – a fast one and a slow one – operating in tandem, with the second, though far less influential in ultimate decision making, acting as a brake on the dominant first.
But where does this fit into AI and machine learning? Those who trust technology more than humans believe that the most efficient way of eliminating the flaws in our thinking is to rely on disinterested, even-handed algorithms to make predictions and decisions, rather than inconsistent, prejudiced humans. But are the algorithms we use in artificial intelligence (AI) today really up to scratch? Or do machines have their own fallibilities when it comes to preconceptions?
Programmed to perform
Although there is general consensus that AI is designed to mimic human intelligence in machines, different models try to emulate different aspects of human thinking. Back in the 1980s, developers tried to imitate the human ability to reason and to make reasonable assumptions with the help of formal logic.
Later, in the 1990s, with the availability of torrential amounts of data and an explosion in computing power, machine learning (ML) stole the AI show. The drive to imitate humans’ symbolic, “slow” reasoning had fallen by the wayside.
The reason for this is that ML’s predictions are based on correlations observed across vast quantities of data. Its algorithms learn by old-fashioned trial and error, adjusting the weight given to each piece of information fed into the system whenever their guesses turn out to be wrong.
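To make that idea concrete, here is a minimal sketch of such a trial-and-error loop – a simple perceptron-style update rule, not any particular production algorithm. The data, learning rate and number of passes are all invented purely for illustration:

```python
# A toy model learning by trial and error: each input's weight is nudged
# whenever the current guess is wrong. All numbers are invented for illustration.

examples = [            # (inputs, correct label)
    ([1.0, 0.0], 1),
    ([0.0, 1.0], 0),
    ([1.0, 1.0], 1),
    ([0.0, 0.0], 0),
]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.1

for _ in range(20):     # repeated passes over the training data
    for inputs, target in examples:
        score = sum(w * x for w, x in zip(weights, inputs)) + bias
        guess = 1 if score > 0 else 0
        error = target - guess          # 0 if correct, +1 or -1 if wrong
        # alter the weight on each piece of information in proportion to the error
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
        bias += learning_rate * error

print(weights, bias)    # after training, the first input dominates the decision
```

After enough passes the weights settle so that the model reproduces the pattern in its training data – and nothing more, which is exactly the limitation discussed below.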
ML algorithms can be hugely successful when applied to closed systems – games such as chess and Go, for example. They can unlock the power of unstructured data and predict preferences and emerging trends. Their attention never flags, and they have no concept of things such as tiredness or emotion that could skew their decision making.
Where their limitation lies, however, is in transferability: to be robust, they need to be applied to the kind of data they have been trained and tested on. Machine learning is very good at doing highly specific things, but it cannot generalise to genuinely new problems and is therefore incapable of mimicking the flexibility of the human brain. Which brings us to the rise of machine reasoning (MR), a potential new AI trend that aims to create models of logical techniques such as induction and deduction.
To borrow an analogy from Yoshua Bengio, a computer scientist renowned for his contribution to artificial neural networks and deep learning, if you train an ML algorithm in a room with the lights on, you’ll need to create a new one for the very same room if the lights go off.
Despite ML’s potential to eliminate human cognitive flaws, in recent years we’ve seen several examples – in recruitment, credit scoring and criminal courts – of human prejudice being baked into machine learning software as an unintended consequence.
But these algorithms have their own innate biases too, the most typical being “overfitting”. As digital technology writer Adam Greenfield explains using a car-based analogy, if all the Chevrolet Camaros shown to an algorithm designed to distinguish between three specific car brands happen to be red, it will erroneously “think” that redness is a definitive feature of a Camaro, rather than an independent variable.
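As a purely illustrative sketch of that failure mode – the car data, features and brands below are invented, and a scikit-learn decision tree stands in for whatever model might actually be used – a classifier shown only red Camaros happily latches onto colour as the deciding feature:

```python
# Toy illustration of the "red Camaro" problem: the only feature that perfectly
# separates the training examples is colour, so the model learns colour.
from sklearn.tree import DecisionTreeClassifier

# Features per car: [is_red, body_length_in_metres] -- invented values
X_train = [
    [1, 4.8],   # Camaro (red)
    [1, 4.8],   # Camaro (red)
    [0, 4.8],   # Mustang (blue)
    [0, 5.0],   # Challenger (grey)
]
y_train = ["Camaro", "Camaro", "Mustang", "Challenger"]

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A green Camaro: right shape, wrong colour -- and the model misclassifies it,
# because redness was treated as the defining feature of a Camaro.
print(model.predict([[0, 4.8]]))
```

Nothing in the algorithm is “prejudiced” in a human sense; it simply has no way of knowing that colour is incidental unless the training data – or a human reviewer – tells it so.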
Algorithmic bias may result in examples such as the one quoted in a study by Tom Taulli, author of Artificial Intelligence Basics. Machine learning software used at a hospital to predict the risk of death from pneumonia arrived at the conclusion that patients with asthma were less likely to die from it than patients without – a counterintuitive conclusion that common sense can easily overrule. (The algorithm didn’t account for the fact that people with asthma typically receive faster and more intensive care, hence the lower mortality rate in the training data.)
But current ML algorithms have no common-sense – or “slow” – system of their own to detect anomalies or to reflect on and correct their biases. (Although Cyc, an ongoing 36-year-old project aiming to create a “common sense engine”, may still come to fruition one day.) It is still left to flawed humans to detect ML algorithms’ racial, gender and programming biases or, in many cases, to learn about them the hard way.
The occasionally scandalous failures of autonomous decision-making by machine learning operating with little or no human oversight serve as constant reminders that the technology can only assist and augment human decisions until reasoning and a certain “common sense” are embedded into the system. This, however, doesn’t take away from the merits of ML and the boost the technology can give to RoI through recommendations, opinion mining or personal modelling – areas that allow a more generous margin for error. Overstretching ML’s capabilities will underwhelm users and thwart adoption, but making the most of what it really excels at will build trust in it.