Why Bias in AI is a Problem & Why Business Leaders Should Care (Fairness Series Part 1)

Written by

Avoiding bias in AI and achieving fairness

In November 2019 a Danish Apple Card user uncovered that Apple’s AI algorithm granted him 20 times the credit limit that his wife received. This disparity came as a major surprise as the couple shared assets and she actually had a higher credit score than he did. Apple and Goldman Sachs, who partnered on this financial product, repeatedly assured that this wasn’t a case of discrimination. But what started with a viral tweet brought more and more cases of bias to the surface – and ultimately led to an investigation by the New York State Department of Financial Services.

Yet another high-profile issue was investigated by ProPublica: An algorithm used by the US’ criminal justice system that predicts the risk of defendants to re-offend once they have served their sentences. Based on the algorithm’s results, judges determine which defendants are eligible for probation or treatment programs. In ProPublica’s study, the authors demonstrate that the algorithm is biased with respect to ethnicity:

Afro-Americans are almost twice as likely as whites to be labeled a higher risk but not actually re-offend. It [the algorithm] makes the opposite mistake among whites: They are much more likely than blacks to be labeled lower risk but go on to commit other crimes

While Artificial Intelligence has remarkable potential to analyze and identify patterns even in very large datasets and to help humans to make more informed decisions, the examples mentioned above were by far not the only incidents where customers or researchers are concerned about machine learning models exhibiting discriminatory behavior based on gender, race or sexual orientation.

Read the other parts of the series:

But Why Is AI Biased?

Simply spoken, there is not one root cause for bias in AI – and that is why it is so difficult to get rid of it. One of the major problems is insufficient training data where some demographic groups are missing or underrepresented. For example, one study by MIT researchers revealed that facial recognition technology had higher error rates for minorities – particularly if they were female. Another study found, that a facial recognition system was 99% accurate in detecting male faces, but was only capable of correctly recognizing a black woman two out of three times. This difference in performance originated from the dataset that was used to train the model: there were 75% male faces but only 25% female ones in the training sample and 80% of the total amount of images showed white persons. Naturally, the algorithm better learned to identify those categories where significantly more data was available.

Similar issues can be observed when AI is used for recruiting purposes without making sure that a substantial amount of training data, for a diverse group of people, is used during model training. Thus concerns have been raised, that for example remote video interview software that evaluates employability based on facial expression, speech patterns, and tone of voice could unfairly disqualify disabled people. Possibly even more concerning (and potentially fatal) are scenarios where AI is applied for healthcare. One example is the British health app “Babylon” that was accused of putting female users at risk. It suggested that sudden pain in the left arm and nausea may be due to depression or having a panic attack and advised women to go see a doctor in the next few days. In contrast, males showing the same symptoms were advised to immediately visit an emergency department based on the diagnosis of a possible heart attack.

According to Prof. Dr. Sylvia Thun, director of eHealth at Charité of the Berlin Institute of Health, “there are huge data gaps regarding the lives and bodies of women”. In a recent Forbes interview, she explained that medical algorithms are oftentimes based on U.S. military data – an area where women, in some cases, only represent 6% of the total personnel. Thus she emphasized the importance of making sure that medical apps take relevant data not only from men but also from women into account.

Bias In Artificial Intelligence Is A Human-made Problem

However, AI bias is not always a consequence of limited training data. To a certain extent, all humans carry (un)conscious biases and behave accordingly. Thereby our human biases find their way into the historical data that is used to train algorithms – and thus it is not surprising that they get picked up by machine learning models. An example for this was at one point Amazon’s recruiting algorithm that learned that – historically – the majority of technical roles was filled by males and thus penalized a resume if it included the word “women”. Or Google’s discriminatory job ads algorithm that disproportionately showed high-paying job ads to men but not to women.

Another example comes from the U.S. health care system where AI is deployed to guide healthcare decisions. A study found that a widely used algorithm discriminated against black people by estimating the level of care that is needed based on health costs. Due to the fact that more money is spent on white patients, the algorithm concluded that black patients are healthier and don’t require the same amount of extra care.

Avoiding Artificial Intelligence Won’t Eliminate Discrimination

Human bias in historical data is an issue that needs to be addressed when developing AI algorithms. But it also makes apparent that simply refraining from applying AI in our day-to-day lives wouldn’t solve our society’s problem of discrimination and unfair treatment of minorities. Long before AI found its way into today’s economy, researchers documented cases of injustices due to race, gender, or sexual orientation. From hiring managers, that invited people with white- instead of black-sounding names 50% more often to a job interview. To women, who in general are 47% more likely to suffer from a serious injury and 17% more likely to die in a car accident, because seatbelts and other safety features in cars were designed based on crash test dummies with male physiques.

In fact, quite the opposite might be true: As AI algorithms require us to be completely clear and to precisely define which outcomes we would consider fair and ethically acceptable, there might be the inherent potential that this new technology will enable us to better mitigate bias in our society. This rationale is also shared by Sendhil Mullainathan, Professor at the University of Chicago, who authored several studies on bias (in people and in AI). In a recent New York Times article he stated that:

Changing algorithms is easier than changing people: software on computers can be updated; the “wetware” in our brains has so far proven much less pliable. None of this is meant to diminish the pitfalls and care needed in fixing algorithmic bias. But compared with the intransigence of human bias, it does look a great deal simpler.

Why Should Business Leaders Care About Bias in AI?

As a society, we should strive to develop AI technology that is effective and fair for everyone. But also businesses – which will become increasingly more reliant on machine learning algorithms – will benefit if they proactively tackle bias in AI. One of the more obvious advantages is, that the mere idea of an underlying algorithm being biased could be enough to turn customers against a product or a company. Moreover, having researchers expose the discriminatory nature of a proprietary AI application constitutes a significant reputational risk that could be hard to recover from.

Another point to consider is that developing an algorithm that accurately performs on the whole spectrum of human diversity is also much more likely to deliver superior value to a broader and varied group of potential customers. But not only customers would benefit from unbiased AI. According to Gartner, “By 2022, 85% of AI projects will deliver erroneous outcomes due to bias in data, algorithms, or the teams responsible for managing them. This is not just a problem for gender inequality – it also undermines the usefulness of AI”.

Ultimately, the ongoing discussion about anti-bias regulations, the issued Ethical AI guidelines and the call for AI certifications could be motivating factors to approach this topic proactively. Especially since the European Commission emphasized the importance of having “requirements to take reasonable measures aimed at ensuring that [the] use of AI systems does not lead to outcomes entailing prohibited discrimination.” in their white paper on Artificial Intelligence, which was just released in February 2020.

This was part 1 of our Fairness Series. Tomorrow’s post will cover 10 reasons why AI algorithms are biased and what you can do about it. On Wednesday, we will take a deep-dive into the definition of fairness and discuss how it can be balanced with other values. We are already very much looking forward to Thursday, where we will introduce the brand new concept of Fair Synthetic Data (which is bias-corrected, fully anonymous data that is free to use and innovate with). This is followed by Friday’s post, which will be the technical centerpiece of this series. There you will learn more about Fair Synthetic Data Generation – and if you are interested, we will also share two fair synthetic datasets with you to experiment on. So make sure that you don’t miss out!