GDPR & AI
The GDPR became applicable in May 2018, before the AI boom. It wasn’t written with large language models in mind, but its principles apply directly, and the tensions between data protection and AI development are among the most contested legal questions in Europe right now.
If you’re deploying AI that touches personal data of EU residents — and most AI does — GDPR is your first legal constraint. The EU AI Act adds a second layer. Understanding how they interact is essential.
The Core Tensions
1. Data Minimisation vs AI’s Hunger for Data
GDPR says: Only collect and process data that’s necessary for a specific purpose (Art. 5(1)(c)).
AI needs: Massive datasets. More data = better models. Often scraping broadly and filtering later.
The tension: How do you train on “only what’s necessary” when you don’t know what’s necessary until after training?
Current thinking: Legitimate interest (Art. 6(1)(f)) may justify training if proportionality is demonstrated. But this is untested at scale in court.
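One common engineering response to the minimisation principle is to strip every field a stated purpose does not need before data reaches the training pipeline. The sketch below is illustrative only: the `ALLOWED_FIELDS` allow-list, the field names, and the `minimise` helper are assumptions made for this example, not a prescribed method, and filtering fields does not by itself establish a lawful basis.

```python
from __future__ import annotations

from typing import Any

# Hypothetical allow-list: the fields an (assumed) training purpose actually needs.
ALLOWED_FIELDS = frozenset({"text", "language", "timestamp"})


def minimise(record: dict[str, Any], allowed: frozenset[str] = ALLOWED_FIELDS) -> dict[str, Any]:
    """Return a copy of the record that keeps only allow-listed fields."""
    return {key: value for key, value in record.items() if key in allowed}


# Example record with direct identifiers that the purpose does not require.
raw = {
    "text": "How do I reset my router?",
    "language": "en",
    "email": "a@example.com",
    "ip": "203.0.113.7",
}
clean = minimise(raw)
print(clean)
```

An allow-list (rather than a block-list) means any new field is dropped by default until someone justifies keeping it, which is closer in spirit to “only what’s necessary”.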
2. Purpose Limitation vs General-Purpose Models
GDPR says: Data collected for one purpose can’t be further processed in a way that is incompatible with that purpose (Art. 5(1)(b)). Reuse for a new purpose generally requires fresh consent or a compatibility assessment (Art. 6(4)).
AI reality: Foundation models are general-purpose by design. Training data collected for one purpose ends up enabling capabilities nobody anticipated.
The tension: If you scrape data for “improving search” and then use it to train a model that writes code, generates images, and answers medical questions — have you violated purpose limitation?
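One way to make purpose limitation operational is to tag each dataset with the purpose it was collected for and refuse any use that has not been documented as compatible. This is a minimal sketch under stated assumptions: the `Dataset` class, the `may_use` function, and the purpose strings are all hypothetical, and deciding whether a purpose is actually compatible remains a legal judgment that code cannot make.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    collected_for: str
    # Purposes the organisation has documented as compatible (assumed to come
    # from a prior legal assessment, not from this code).
    compatible_purposes: frozenset[str] = frozenset()


def may_use(dataset: Dataset, purpose: str) -> bool:
    """Allow a use only for the original purpose or a documented compatible one."""
    return purpose == dataset.collected_for or purpose in dataset.compatible_purposes


search_logs = Dataset(
    name="search_logs",
    collected_for="improving search",
    compatible_purposes=frozenset({"search quality evaluation"}),
)

print(may_use(search_logs, "improving search"))
print(may_use(search_logs, "training a code model"))
```

The design choice here is default-deny: a new use fails the check until someone records a justification for it.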
3. Right to Explanation vs Black Box Models
GDPR says: Data subjects have the right to “meaningful information about the logic involved” in automated decisions (Art. 13(2)(f), 14(2)(g), 15(1)(h)). Solely automated decisions with legal or similarly significant effects face special restrictions (Art. 22).
AI reality: Neural networks are not easily explainable. A model typically can’t tell you why it made a specific decision in human-interpretable terms.
The tension: What counts as “meaningful information”? The model architecture? The training data? A post-hoc explanation that may not reflect the actual reasoning?
4. Right to Erasure vs Model Training
GDPR says: Individuals can request their data be deleted (Art. 17).
AI reality: Once data is used to train a model, it’s encoded in the weights. You can’t “delete” a specific person’s contribution from a trained model without retraining.
The tension: If someone requests erasure of data that was used in training, do you need to retrain the entire model? Current guidance says “reasonable measures” — but what’s reasonable when retraining costs millions?
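Short of retraining, one partial measure is a suppression list: record each erasure request and exclude the matching records from every future training or fine-tuning run. The sketch below assumes each record carries a `subject_id` field; the `SuppressionList` class and `subject_key` helper are hypothetical names for illustration. It does not remove anything from weights that are already trained, and a hashed identifier is pseudonymised rather than anonymised.

```python
from __future__ import annotations

import hashlib


def subject_key(identifier: str) -> str:
    """Hash a normalised identifier so the list does not store it in clear text."""
    return hashlib.sha256(identifier.strip().lower().encode("utf-8")).hexdigest()


class SuppressionList:
    """Tracks erasure requests and filters matching records out of a corpus."""

    def __init__(self) -> None:
        self._keys: set[str] = set()

    def add(self, identifier: str) -> None:
        self._keys.add(subject_key(identifier))

    def filter(self, records: list[dict], id_field: str = "subject_id") -> list[dict]:
        """Return only the records whose subject has not requested erasure."""
        return [r for r in records if subject_key(r[id_field]) not in self._keys]


corpus = [
    {"subject_id": "alice@example.com", "text": "first post"},
    {"subject_id": "bob@example.com", "text": "second post"},
    {"subject_id": "Alice@Example.com ", "text": "third post"},
]

erased = SuppressionList()
erased.add("alice@example.com")
remaining = erased.filter(corpus)
print([r["subject_id"] for r in remaining])
```

Normalising the identifier before hashing means differently cased or padded variants of the same address are suppressed together.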
Key GDPR Articles for AI
| Article | What it says | AI relevance |
|---|---|---|
| Art. 5 | Data processing principles | Minimisation, purpose limitation, accuracy — all challenged by AI |
| Art. 6 | Lawful basis | Consent or legitimate interest for training? |
| Art. 9 | Special categories | Health, biometric, political data: extra restrictions, relevant to many AI use cases |
| Art. 13-14 | Transparency | Must inform people about AI processing |
| Art. 15 | Right of access | Can someone ask what data about them was used? |
| Art. 17 | Right to erasure | “Machine unlearning” problem |
| Art. 22 | Automated decision-making | Restrictions on fully automated decisions with legal effects |
| Art. 35 | Data Protection Impact Assessment (DPIA) | Likely required for most AI systems |
Enforcement So Far
| Date | Action | Significance |
|---|---|---|
| March 2023 | Italy (Garante) temporarily restricts ChatGPT | First high-profile regulatory action against an LLM service. Alleged lack of legal basis and no age verification. Service restored in April 2023 after OpenAI added transparency measures. |
| 2023-24 | Multiple DPAs investigate training data | CNIL (France), AP (Netherlands), AEPD (Spain) all opened investigations into LLM training data lawfulness. |
| December 2024 | EDPB opinion on AI models & GDPR | European Data Protection Board issued Opinion 28/2024 on applying GDPR to AI models. Key point: legitimate interest is possible but requires a thorough balancing test. |
Practical Guidance
If you’re deploying AI in the EU:
- Conduct a DPIA (Art. 35): Almost certainly required for any AI processing personal data at scale.
- Identify your lawful basis (Art. 6): Consent is hard at scale. Legitimate interest requires a documented balancing test.
- Be transparent (Art. 13-14): Tell users that AI is processing their data, for what purpose, and what rights they have.
- Handle Art. 22 carefully: If your AI makes solely automated decisions with legal or similarly significant effects, you need a valid exception (Art. 22(2)) and safeguards: human intervention on request, a way for people to express their view, and a way to contest the decision (Art. 22(3)).
- Have an erasure strategy: Even if you can’t retrain, document your approach to handling deletion requests.
- Watch the DPAs: Guidance is evolving rapidly. Ireland (DPC), France (CNIL), and Italy (Garante) are among the most active.
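The Art. 22 point in the list above can be sketched as a routing gate: any decision flagged as having legal or similarly significant effect goes to a human reviewer instead of being finalised automatically. The `Decision` class, the `significant_effect` flag, and the `route` function are hypothetical names for this example; which decision types count as significant is assumed to be settled by legal review, not by this code.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO = "auto"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class Decision:
    subject_id: str
    outcome: str
    # Assumed to be set per decision type after legal review.
    significant_effect: bool


def route(decision: Decision) -> Route:
    """Route significant decisions to a human; let the rest proceed automatically."""
    return Route.HUMAN_REVIEW if decision.significant_effect else Route.AUTO


loan = Decision(subject_id="u-101", outcome="reject", significant_effect=True)
playlist = Decision(subject_id="u-102", outcome="recommend", significant_effect=False)

print(route(loan).value)
print(route(playlist).value)
```

A gate like this only helps if the human reviewer has real authority to change the outcome; a rubber-stamp review is unlikely to take a decision outside Art. 22.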
How GDPR and the AI Act Interact
They’re complementary but distinct:
- GDPR protects personal data, regardless of the technology used
- EU AI Act regulates AI systems, regardless of whether personal data is involved
Both can apply simultaneously. A facial recognition system in a public space? That’s at least high-risk under the AI Act (some real-time law-enforcement uses are prohibited outright) AND requires GDPR compliance. Double enforcement. Double fines (potentially).
The designated AI Act authorities will need to coordinate with DPAs. In some countries, it may be the same body.
Go Deeper
- EU AI Act — The AI-specific regulation
- EU Country Codes & Authorities — Who enforces what, country by country
- AI Bias & Fairness — Bias is also a GDPR issue (inaccuracy, discrimination)
- Training & Fine-Tuning — The process that creates the data protection tensions