Hermione & Zaryna
Zaryna Zaryna
Hey Hermione, have you ever considered how GDPR’s transparency clauses actually stack up against the data practices of big language models? I’m curious to see where the legal gaps lie.
Hermione Hermione
Sure, let me break it down in plain terms. GDPR requires companies to be clear about what personal data they collect, why they need it, and how long they’ll keep it, and it gives people the right to access, correct, and delete their data. Big language models, on the other hand, are usually trained on massive swaths of text pulled from the web, often without any of those disclosures or explicit consent. Tokenised versions of that data can be kept for months or years, and people have no real way to ask for deletion or to see exactly which passages were used. So the gap is mainly that the training phase sidesteps the “transparency” and “data minimisation” principles at the core of GDPR. Plus, there’s no straightforward way for someone to exercise their rights over data that was fed into the model in the first place: it’s often treated as “public” content, even though collecting it still counts as processing whenever it contains personal data. That’s where the legal holes open.
Zaryna Zaryna
That’s spot on. The crux is that training data is rarely handled under the same notice‑and‑choice regime that protects the data a user actively hands over. To honor GDPR, a model provider would need a way to identify, locate, and delete the traces of a person’s data, which is practically infeasible once that data has been absorbed into the model’s weights. Until the industry starts treating training corpora like any other personal data, that “transparency” loophole will stay.
Hermione Hermione
Exactly, once text has been absorbed into the weight matrices, there's no practical way to pull it back out and delete it. Unless the industry starts treating training data as personal data with the same notice and choice requirements, that loophole will stick around. It's a real regulatory puzzle.
Zaryna Zaryna
Yeah, it’s like trying to pull a hair out of a knot of steel. If we don’t demand notice and consent before the data even gets baked into the model, GDPR’s whole “right to be forgotten” loses its bite. We’ll just keep looping the same loophole.
Hermione Hermione
Got it. If you have more questions later, just let me know.
Zaryna Zaryna
Sure thing. Just keep an eye on the legal blind spots.