Algorithmic bias is a crucial issue in AI ethics. It occurs when computer systems produce unfair outcomes, often due to flawed training data, human biases, or feedback loops. This can lead to discrimination in areas like facial recognition, hiring, and predictive policing.
There are several types of algorithmic bias, including selection bias, measurement bias, and confounding bias. These biases can stem from historical data, societal prejudices, or a lack of diversity in AI teams. Recognizing and addressing these issues is essential for creating fair AI systems.
Types of algorithmic bias
Systematic errors creating unfair outcomes
Algorithmic bias refers to systematic errors in computer systems that create unfair outcomes, such as privileging one arbitrary group over others
These biases can emerge from various sources, including problems with training data, human biases, and feedback loops
Examples of unfair outcomes include facial recognition systems performing poorly on people with darker skin, or resume screening tools discriminating against women
Specific types of algorithmic bias
Selection bias occurs when the data used to train an AI system does not accurately reflect the population of interest, leading to skewed outcomes
For instance, if a medical diagnosis AI is trained mainly on data from white patients, it may perform poorly for patients of other races
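This failure mode can be sketched with a toy simulation. All the numbers below are illustrative assumptions: two made-up patient groups whose biomarker reads differently, and a "model" that is just the decision threshold fit to maximize accuracy on the well-represented group.

```python
import random

random.seed(0)

def make_patients(n, biomarker_mean, cutoff):
    """Each patient is (biomarker, has_disease); disease when the
    biomarker exceeds a group-specific cutoff."""
    patients = []
    for _ in range(n):
        x = random.gauss(biomarker_mean, 0.5)
        patients.append((x, x > cutoff))
    return patients

# Hypothetical groups: the biomarker reads lower overall for group B.
group_a = make_patients(1000, biomarker_mean=1.0, cutoff=1.0)  # dominates training
group_b = make_patients(1000, biomarker_mean=0.4, cutoff=0.4)  # underrepresented

def accuracy(threshold, patients):
    return sum((x > threshold) == sick for x, sick in patients) / len(patients)

# "Training" here is just picking the threshold that works best on group A alone.
best = max((i / 100 for i in range(200)), key=lambda t: accuracy(t, group_a))

print(f"threshold fit on A:  {best:.2f}")
print(f"accuracy on group A: {accuracy(best, group_a):.2f}")
print(f"accuracy on group B: {accuracy(best, group_b):.2f}")
```

The threshold that is near-perfect for group A misses most true cases in group B, because B's distribution was never represented in the training data.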
Measurement bias arises when the way data is collected or labeled systematically distorts what the system is actually measuring, skewing the outcomes
As an example, if historical hiring data reflects discriminatory practices, an AI system trained on this data will learn to perpetuate these biases
Confounding bias happens when an AI system picks up on and amplifies spurious correlations rather than true causal relationships
A classic example is an AI system concluding that ice cream sales cause drowning, when in fact both are correlated with hot weather
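The ice cream example can be reproduced with synthetic data (all coefficients below are made up): temperature drives both series, so their raw correlation is high, but the partial correlation controlling for temperature is near zero.

```python
import random

random.seed(1)

# Made-up daily data: temperature drives both series; neither causes the other.
days = 365
temp = [random.gauss(20, 8) for _ in range(days)]
ice_cream = [50 + 10 * t + random.gauss(0, 40) for t in temp]  # sales
drownings = [0.2 * t + random.gauss(0, 1.5) for t in temp]     # incidents

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r_raw = pearson(ice_cream, drownings)
# Partial correlation, controlling for the confounder (temperature):
r_xt = pearson(ice_cream, temp)
r_yt = pearson(drownings, temp)
r_partial = (r_raw - r_xt * r_yt) / ((1 - r_xt**2) * (1 - r_yt**2)) ** 0.5

print(f"raw correlation:      {r_raw:.2f}")
print(f"partial (given temp): {r_partial:.2f}")
```

A naive learner sees the strong raw correlation; conditioning on the confounder makes the spurious link vanish.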
Omitted variable bias occurs when important features are left out of the data used to train an AI, leading it to rely on other, less relevant features
For instance, if data on loan repayment doesn't include employment status, an AI might incorrectly use race as a proxy, leading to bias
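The proxy effect can be demonstrated with a hypothetical population (the group labels, employment rates, and repayment rates below are invented for illustration): repayment depends only on the omitted variable, yet a raw group-level gap appears, and it vanishes once the omitted variable is accounted for.

```python
import random

random.seed(2)

# Hypothetical population: repayment is driven only by (omitted) employment
# status, but employment rates differ by group, so group acts as a proxy.
applicants = []
for _ in range(10000):
    group = random.choice(["A", "B"])
    employed = random.random() < (0.8 if group == "A" else 0.5)
    repaid = random.random() < (0.9 if employed else 0.3)
    applicants.append((group, employed, repaid))

def repay_rate(rows):
    return sum(repaid for _, _, repaid in rows) / len(rows)

by_group = {g: repay_rate([a for a in applicants if a[0] == g]) for g in "AB"}
within_employed = {g: repay_rate([a for a in applicants if a[0] == g and a[1]])
                   for g in "AB"}

print("repayment rate by group:", by_group)              # gap -> proxy signal
print("rate by group, employed only:", within_employed)  # gap vanishes
```

A model trained without the employment feature would latch onto the group-level gap, even though group membership has no causal effect here.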
Bias in training data
Historical and societal biases in data
AI systems learn patterns and correlations from the data they are trained on. If this training data contains biases, those biases will be learned and reproduced by the AI system
Historical data used to train AI often contains societal biases around factors like race and gender. AI trained on this data will pick up and perpetuate these biases
For example, Amazon had to scrap an AI recruiting tool that discriminated against women because it was trained on historical hiring data reflecting human bias
Lack of diverse representation in training data means the AI will perform poorly for underrepresented groups. This is especially problematic in facial recognition systems
Research has shown that leading facial recognition tools have significantly higher error rates for women and people of color due to skewed training data
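A basic mitigation is to audit error rates per demographic group before deployment. A minimal sketch (the `error_rates_by_group` helper and the audit records are hypothetical, not from any real benchmark):

```python
def error_rates_by_group(records):
    """Per-group error rate from (group, predicted, actual) records."""
    totals, errors = {}, {}
    for group, predicted, actual in records:
        totals[group] = totals.get(group, 0) + 1
        errors[group] = errors.get(group, 0) + (predicted != actual)
    return {g: errors[g] / totals[g] for g in totals}

# Illustrative audit records: (group, predicted_identity, true_identity)
records = [
    ("lighter", 1, 1), ("lighter", 2, 2), ("lighter", 3, 3), ("lighter", 4, 5),
    ("darker", 1, 2), ("darker", 2, 2), ("darker", 3, 4), ("darker", 4, 4),
]
rates = error_rates_by_group(records)
print(rates)  # {'lighter': 0.25, 'darker': 0.5}
```

A gap like the one printed here is the kind of disparity audits such as Gender Shades surfaced at much larger scale.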
Issues with data collection and labeling
Imbalanced datasets, where some groups are over- or under-represented compared to their real-world frequencies, lead to skewed AI outcomes
For instance, if medical data comes disproportionately from healthier and wealthier patients, AI diagnostic tools may not work well for other populations
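One simple curation check is to compare each group's share of the dataset with its share of the target population. A sketch, with made-up counts and an assumed population mix:

```python
def representation_gap(sample_counts, population_shares):
    """Per-group difference between dataset share and population share."""
    total = sum(sample_counts.values())
    return {g: sample_counts.get(g, 0) / total - share
            for g, share in population_shares.items()}

# Hypothetical medical dataset skewed toward healthier, wealthier patients:
counts = {"healthy_wealthy": 800, "other": 200}
shares = {"healthy_wealthy": 0.3, "other": 0.7}  # assumed true population mix
gap = representation_gap(counts, shares)
print(gap)  # healthy_wealthy over-represented by ~0.5, other under by ~0.5
```

Large positive or negative gaps flag groups for which the trained model's performance estimates cannot be trusted without further evaluation.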
Mislabeled data, where humans incorrectly tag data during the training process, leads to AI learning the wrong associations and exhibiting biased performance
An example would be a computer vision system learning to classify images of kitchens as "women" and offices as "men" based on gender biases in labeled training photos
Careful curation of training datasets for diversity and accurate labeling is crucial for mitigating bias, but is often difficult and resource-intensive
Facial recognition datasets aiming for diversity have run into issues with consent and privacy when scraping online images of underrepresented groups
Human biases in AI
Conscious and unconscious biases of developers
The humans who design AI systems and choose what data to train them on have conscious and unconscious biases that can become built into the technology
Biases around age, gender, race and other characteristics can skew how developers frame problems for AI systems to solve
For example, developers of predictive policing tools may focus on optimizing for high arrest numbers vs. community wellbeing due to biases about crime
Confirmation bias, anchoring bias, in-group bias and other cognitive biases can influence the decisions of AI developers throughout the design process
An example of confirmation bias would be a developer who believes AI is objective testing their systems in ways that confirm this belief and overlooking evidence of bias
Lack of diversity in AI teams
Lack of diversity in AI teams means that developers' blind spots around bias are more likely to go unnoticed and unaddressed
The AI field is currently dominated by white and Asian men, especially in leadership roles at top companies
A 2019 study found that only 18% of authors at major AI conferences are women, and more than 80% of AI professors are men
Homogeneous teams are less likely to recognize and question biased assumptions, leading to biased AI products
With more diverse voices involved in the development process, issues of bias are more likely to be anticipated and avoided
Increasing diversity in the AI field is a crucial step for identifying and mitigating algorithmic bias, but significant barriers remain
Hostile workplace cultures, unequal access to resources and mentoring, and biased hiring and promotion practices all hinder diversity efforts
Feedback loops and bias amplification
How AI outputs become future inputs
Feedback loops occur when the outputs of an AI system are used as inputs, directly or indirectly, in the future. This causes the system's predictions to influence its later behavior
Bias in an AI system can be amplified over time through feedback loops, as the model's skewed outputs are fed back into it and used to re-train the system
For example, if an AI resume screening tool favors male applicants, more men will be hired, and their performance data will be fed back into the AI, amplifying its bias over time
Feedback loops can be subtle and hard to identify, as there are often several steps between an AI system's outputs and the way those outputs make their way back into the system as future inputs
In a social media newsfeed, biased click and share data influences what content is boosted, which influences future user behavior in hard-to-trace ways
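The amplification dynamic can be sketched with a deliberately stylized update rule. The squaring rule below is an illustrative assumption, not how any real screening system is trained: it stands in for a model that scores each group by its prevalence among past hires and is then re-fit on the resulting hire pool.

```python
# Stylized feedback loop: each round, hires are allocated in proportion to
# the model's group scores, and the scores are re-fit to the new hire pool,
# which squares the imbalance on every iteration.
share = 0.55            # majority group's share of the initial hire pool
history = [share]
for _ in range(6):
    share = share**2 / (share**2 + (1 - share) ** 2)
    history.append(share)

print([round(s, 3) for s in history])
# a modest 55/45 split drifts toward near-total exclusion within a few rounds
```

The point is qualitative: under self-reinforcing retraining, a small initial skew does not stay small.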
Examples of bias-amplifying feedback loops
Predictive policing is an example of a feedback loop that can amplify bias. Crime predictions lead to more policing in certain areas, which leads to more crime detected there, which feeds back into the crime prediction algorithms
This can create a "runaway feedback loop" where policing is increasingly concentrated in overpoliced communities, regardless of true crime rates
Recommendation engines, as used by online platforms, can create filter bubbles that trap users in feedback loops of being shown only content that matches their existing preferences
This can reinforce and amplify biases, as with YouTube's algorithm being more likely to suggest progressively more extreme political content to users who start out watching partisan videos
Feedback loops in AI systems that make real-world decisions (e.g. loan approval, medical diagnosis) are especially dangerous as outputs directly influence people's lives
If a medical AI incorporates biased assumptions that certain patients are less likely to comply with treatment, those patients may be undertreated, leading to worse outcomes that feed back into and amplify the AI's bias