AI Bias & Fairness
Created 2 May 2025
Tags: safety, ethics, bias, fairness, discrimination
What is it?
AI bias refers to systematic errors in AI systems that result in unfair outcomes — typically reflecting or amplifying existing societal prejudices around race, gender, age, disability, or other protected characteristics.
How Bias Enters AI Systems
1. Training Data Bias
- Historical bias: Data reflects past discrimination (e.g., hiring data from a biased company)
- Representation bias: Some groups underrepresented in training data
- Measurement bias: Features or labels are imperfect proxies for what you actually want to measure, and the error differs across groups (e.g., arrests as a proxy for crime)
- Selection bias: Data collected from non-representative sample
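Representation bias is the easiest of these to check directly. The sketch below compares each group's share of the training data with a reference share; the function name `representation_gap`, the group labels and the 50/50 reference split are all made up for illustration.

```python
from collections import Counter


def representation_gap(groups, reference_shares):
    """Return observed share minus reference share for each group.

    Negative values mean the group is underrepresented in the data.
    """
    counts = Counter(groups)
    total = len(groups)
    return {
        g: counts.get(g, 0) / total - share
        for g, share in reference_shares.items()
    }


# Hypothetical data: group label of each of 10 training examples.
sample = ["a"] * 8 + ["b"] * 2
gaps = representation_gap(sample, {"a": 0.5, "b": 0.5})
print(gaps)  # group "b" is about 30 percentage points short
```

A check like this only catches representation bias. Historical and measurement bias can be present even when every group is represented in proportion.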
2. Algorithm Design Bias
- Objective function: Optimising only for overall accuracy ignores how errors are distributed across groups
- Feature selection: Including features that are proxies for protected characteristics (e.g., postcode → race)
- Aggregation bias: One model for all groups may serve majority well but minorities poorly
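One rough way to spot a proxy feature is to ask how well the feature alone predicts the protected attribute. The sketch below is an illustrative heuristic rather than a standard test; `proxy_strength`, the postcodes and the group labels are hypothetical.

```python
from collections import Counter, defaultdict


def proxy_strength(feature_values, protected_values):
    """Accuracy of guessing the protected attribute from the feature alone,
    alongside the accuracy of always guessing the most common group."""
    by_value = defaultdict(Counter)
    for f, p in zip(feature_values, protected_values):
        by_value[f][p] += 1
    n = len(protected_values)
    # Best guess per feature value is the most common group seen with it.
    hits = sum(c.most_common(1)[0][1] for c in by_value.values())
    baseline = Counter(protected_values).most_common(1)[0][1]
    return hits / n, baseline / n


# Hypothetical data: postcode and group for six people.
postcodes = ["N1", "N1", "N1", "S2", "S2", "S2"]
groups = ["x", "x", "y", "y", "y", "x"]
with_feature, baseline = proxy_strength(postcodes, groups)
print(with_feature, baseline)
```

If the first number is well above the second, the feature carries information about the protected attribute, so dropping the attribute itself does not stop the model from using it.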
3. Deployment Bias
- Usage context: System used in ways not anticipated by designers
- Feedback loops: Biased predictions influence future data (e.g., predictive policing → over-policing → more arrest data)
- Evaluation bias: Testing only on majority groups
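The feedback-loop mechanism can be illustrated with a toy simulation. It assumes a deliberately crude allocation rule (all patrols go to the district with the most recorded arrests) and two districts with the same true crime rate; every number here is invented.

```python
def simulate_patrols(recorded, true_rate, patrols=100, rounds=5):
    """Each round, send all patrols to the district with the most recorded
    arrests. New arrests are only recorded where patrols are sent."""
    recorded = list(recorded)
    history = [tuple(recorded)]
    for _ in range(rounds):
        target = max(range(len(recorded)), key=lambda i: recorded[i])
        recorded[target] += int(patrols * true_rate[target])
        history.append(tuple(recorded))
    return history


# Both districts have the same true rate; only the starting records differ.
history = simulate_patrols(recorded=[60, 40], true_rate=[0.5, 0.5])
for step in history:
    print(step)
```

The second district never gets patrolled, so its record never grows, and the data ends up "confirming" a difference that does not exist in the underlying rates.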
Famous Cases
| Case | What happened |
|---|---|
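| Amazon hiring tool (2018) | CV screening model penalised CVs containing words like “women’s” |
| COMPAS recidivism | ProPublica's 2016 analysis found a higher false positive rate for Black defendants |
| Healthcare algorithm (Obermeyer et al., 2019) | Used past healthcare costs as a proxy for need, so it systematically underestimated the needs of Black patients |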
| Image generation | Reinforced stereotypes (nurses as women, CEOs as men) |
| Facial recognition | Much higher error rates for darker-skinned women |
Definitions of Fairness (They Conflict!)
| Definition | Meaning |
|---|---|
| Demographic parity | Equal positive rates across groups |
| Equalised odds | Equal true/false positive rates across groups |
| Individual fairness | Similar people get similar outcomes |
| Counterfactual fairness | Outcome wouldn’t change if protected attribute changed |
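The first two definitions in the table can be measured directly from predictions. A minimal sketch, assuming binary labels and predictions and that every group contains at least one positive and one negative example; `group_rates` and the data are hypothetical.

```python
def group_rates(y_true, y_pred, groups):
    """Selection rate, true positive rate and false positive rate per group."""
    stats = {}
    for g in sorted(set(groups)):
        rows = [(t, p) for t, p, gg in zip(y_true, y_pred, groups) if gg == g]
        pos = [p for t, p in rows if t == 1]  # predictions for true positives
        neg = [p for t, p in rows if t == 0]  # predictions for true negatives
        stats[g] = {
            "selection_rate": sum(p for _, p in rows) / len(rows),
            "tpr": sum(pos) / len(pos),
            "fpr": sum(neg) / len(neg),
        }
    return stats


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
# Demographic parity compares selection rates; equalised odds compares
# both tpr and fpr.
dp_gap = rates["a"]["selection_rate"] - rates["b"]["selection_rate"]
print(rates, dp_gap)
```

In this toy data group "a" is selected three times as often as group "b" and also has higher true and false positive rates, so both definitions are violated.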
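Critical insight: These definitions cannot all be satisfied at once. When base rates differ between groups, an imperfect classifier cannot both have equal predictive value across groups and equal false positive and false negative rates across groups (Chouldechova, 2017; Kleinberg et al., 2016). You must choose which trade-offs to make.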
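The conflict can be seen with a little arithmetic. For any binary classifier, the false positive rate is tied to prevalence p, positive predictive value (PPV) and false negative rate (FNR) by FPR = p / (1 - p) × (1 - PPV) / PPV × (1 - FNR), which follows from the definitions of those rates. The sketch below plugs in two made-up prevalences while holding PPV and FNR equal.

```python
def implied_fpr(prevalence, ppv, fnr):
    """False positive rate forced by a given prevalence, PPV and FNR.

    Derived from the confusion-matrix definitions:
    FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR)
    """
    return prevalence / (1 - prevalence) * (1 - ppv) / ppv * (1 - fnr)


# Hypothetical groups: same PPV and FNR, different base rates.
fpr_a = implied_fpr(prevalence=0.5, ppv=0.8, fnr=0.2)
fpr_b = implied_fpr(prevalence=0.2, ppv=0.8, fnr=0.2)
print(round(fpr_a, 3), round(fpr_b, 3))
```

With equal PPV and FNR, the group with the higher base rate necessarily gets the higher false positive rate (about 0.2 versus 0.05 here), so equalising one quantity unbalances another.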
Mitigation Approaches
Pre-processing
- Rebalance training data
- Remove or transform biased features
- Synthetic data generation for underrepresented groups
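One concrete way to rebalance is reweighing, a technique described by Kamiran and Calders: each example gets the weight P(group) × P(label) / P(group, label), so that group and label look statistically independent in the weighted data. The sketch below is a minimal version with invented data, not a reference implementation.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight each example by P(group) * P(label) / P(group, label)."""
    n = len(labels)
    g_count = Counter(groups)
    y_count = Counter(labels)
    gy_count = Counter(zip(groups, labels))
    return [
        (g_count[g] * y_count[y]) / (n * gy_count[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(group, groups, labels, weights):
    """Weighted share of positive labels within one group."""
    rows = [(y, w) for g, y, w in zip(groups, labels, weights) if g == group]
    return sum(w for y, w in rows if y == 1) / sum(w for _, w in rows)


# Hypothetical data: group "a" has 3 of 4 positive labels, group "b" 1 of 4.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate("a", groups, labels, weights)
rate_b = weighted_positive_rate("b", groups, labels, weights)
print(weights, rate_a, rate_b)
```

After reweighing, both groups have the same weighted positive rate. The weights would then be passed to a learner that accepts per-sample weights.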
In-processing
- Adversarial debiasing (train the model so that an adversary cannot predict the protected characteristic from its outputs)
- Fairness constraints during optimisation
- Multi-objective training
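A fairness constraint is often implemented as a penalty term added to the usual loss. The sketch below only evaluates such an objective; it does not train anything. It assumes exactly two groups and uses the squared gap in mean predicted score as the penalty, which is one of many possible choices. `penalised_loss` and `lam` are illustrative names.

```python
import math


def penalised_loss(y_true, scores, groups, lam=1.0):
    """Mean log loss plus lam times the squared gap in mean score
    between the two groups (assumes exactly two distinct groups)."""
    eps = 1e-12  # avoids log(0)
    log_loss = -sum(
        y * math.log(s + eps) + (1 - y) * math.log(1 - s + eps)
        for y, s in zip(y_true, scores)
    ) / len(y_true)
    first, second = sorted(set(groups))
    mean = lambda name: (
        sum(s for s, g in zip(scores, groups) if g == name) / groups.count(name)
    )
    gap = mean(first) - mean(second)
    return log_loss + lam * gap ** 2


y_true = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.1]
groups = ["a", "a", "b", "b"]
without_penalty = penalised_loss(y_true, scores, groups, lam=0.0)
with_penalty = penalised_loss(y_true, scores, groups, lam=1.0)
print(without_penalty, with_penalty)
```

Raising `lam` pushes an optimiser towards equal average scores across groups at some cost in accuracy, which is the multi-objective trade-off in miniature.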
Post-processing
- Calibrate thresholds per group
- Reject option classification (for predictions close to the decision boundary, give the favourable outcome to the disadvantaged group or defer to human review)
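Per-group thresholds can be sketched as follows. This version picks, for each group, the score cut-off that selects roughly the same fraction of that group, which targets demographic parity; other targets (such as equal true positive rates) need labels as well. The function names and scores are hypothetical.

```python
def threshold_for_rate(scores, target_rate):
    """Score cut-off that selects roughly target_rate of the given scores
    (ties at the cut-off can select slightly more)."""
    ranked = sorted(scores, reverse=True)
    k = max(1, round(target_rate * len(ranked)))
    return ranked[k - 1]


def per_group_thresholds(scores, groups, target_rate):
    """One threshold per group, each chosen to hit the same selection rate."""
    return {
        g: threshold_for_rate(
            [s for s, gg in zip(scores, groups) if gg == g], target_rate
        )
        for g in sorted(set(groups))
    }


# Hypothetical scores: the model scores group "b" lower across the board.
scores = [0.9, 0.8, 0.7, 0.6, 0.6, 0.5, 0.4, 0.3]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
thresholds = per_group_thresholds(scores, groups, target_rate=0.5)
selected = [s >= thresholds[g] for s, g in zip(scores, groups)]
print(thresholds, selected)
```

A single global cut-off of 0.65 would select three people from group "a" and nobody from group "b"; the per-group cut-offs select two from each.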
Why It Matters
- Legal risk: The EU AI Act, US civil rights law, and the UK Equality Act all impose obligations
- Reputation: Biased AI = PR disaster and loss of trust
- Harm: Real people are denied jobs, loans, healthcare, freedom based on biased systems
- Obligation: If you deploy AI, you’re responsible for its impacts
Resources
- “Gender Shades” (Buolamwini & Gebru, 2018) — Facial recognition bias study
- “Fairness and Machine Learning” (Barocas, Hardt, Narayanan) — Free textbook
- AI Fairness 360 (IBM) — Open-source bias detection toolkit
- Google’s PAIR initiative — Responsible AI resources
- EU AI Act — Legal requirements around bias