[R] "It's not just memorizing the training data" they said: Scalable Extraction of Training Data from (Production) Language Models

wojcech@alien.top · 10 months ago

gwern@alien.top · 10 months ago

No, I still think it’s not that surprising even taking it as a whole. Humans memorize things all the time after a single look. (Consider, for example, image recognition memory.) If a NN can memorize entire datasets after a few epochs using ‘a single small noisy step of gradient descent over 1-4 million tokens’ on each datapoint once per epoch, why is saying that some of this memorization happens in the first epoch so surprising? (If it’s good enough to memorize given a few steps, then you’re just haggling over the price, and 1 step is well within reason.) And there is usually not that much intrinsic information in any of these samples, so if an LLM has done a good job of learning generalizable representations of things like names or phone numbers, it doesn’t take up much ‘space’ inside the LLM to encode yet another slight variation on a human name. (If the representation is good, a ‘small’ step covers a huge amount of data.)

Plus, you are overegging the description: it’s not like it’s memorizing 100% of the data on sight, nor is the memorization permanent. (The estimates from earlier papers are more like 1% getting memorized at the first epoch, and OP estimates they could extract 1GB of text from GPT-3/4, which sounds roughly consistent.) So it’s more like, ‘once in a great while, particularly if a datapoint was very recently seen or simple or stereotypical, the model can mostly recall having seen it before’.
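
To make the single-gradient-step argument a bit more concrete, here is a toy sketch. It is not from the paper and it uses a linear model rather than an LLM, so treat it as an illustration of the mechanics only. The function names, the dimension, and the learning rate are all made up for the example. It shows that one SGD step on one datapoint already moves the model measurably toward reproducing that datapoint.

```python
import numpy as np

def loss(w, x, y):
    # Squared error of a linear model on a single datapoint.
    return 0.5 * (x @ w - y) ** 2

def sgd_step(w, x, y, lr):
    # One plain gradient step on that single datapoint.
    grad = (x @ w - y) * x
    return w - lr * grad

# Hypothetical setup: a 1000-parameter model seeing one new example once.
rng = np.random.default_rng(0)
dim = 1000
w = np.zeros(dim)
x = rng.normal(size=dim) / np.sqrt(dim)  # input with squared norm near 1
y = 1.0
lr = 0.5  # assumed learning rate, chosen only for illustration

before = loss(w, x, y)
w_new = sgd_step(w, x, y, lr)
after = loss(w_new, x, y)

# The residual on this datapoint shrinks by a factor of (1 - lr * ||x||^2).
shrink = 1.0 - lr * (x @ x)
print(f"loss before: {before:.4f}, after one step: {after:.4f}")
print(f"residual shrink factor: {shrink:.4f}")
```

With these assumed values, a single step removes a large share of the error on that one example. How much of this carries over to a real LLM trained on batches of millions of tokens is exactly what is being argued about here.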

zalperst@alien.top · 10 months ago

I appreciate your position, but I don’t think your intuition holds here. For instance, biological neural nets very likely use a learning algorithm that is qualitatively different from backpropagation.

zalperst@alien.top · 10 months ago

I appreciate that it’s possible to find a not-illogical explanation (logical would entail a real proof), but it remains surprising to me.

ThirdMover@alien.top · 10 months ago

“Humans memorize things all the time after a single look.”

I think what’s going on in humans there is a lot more complex than something like a single SGD step updating some weights. Generally, if you do memorize something, you consciously replay it in your head several times.