[R] "It's not just memorizing the training data" they said: Scalable Extraction of Training Data from (Production) Language Models (arxiv.org)
wojcech@alien.top to Machine Learning@academy.garden · English · 10 months ago · 30 comments
cross-posted to: hackernews@lemmy.smeargle.fans, hackernews@derp.foo, machinelearning@lemmit.online
cegras@alien.top · 10 months ago
What is the size of ChatGPT or the biggest LLMs compared to the dataset? (Not being rhetorical, genuinely curious)
zalperst@alien.top · 10 months ago
Trillions of tokens, billions of parameters
StartledWatermelon@alien.top · 10 months ago
GPT-4: 1.76 trillion parameters, about 6.5 trillion tokens in the dataset. The token count could be roughly twice that; the leaks weren't crystal clear, but the above number is more likely.
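A quick back-of-the-envelope sketch in Python, taking the leaked 1.76T-parameter / 6.5T-token figures above at face value (they're unconfirmed), shows the dataset works out to only a few tokens per parameter:

```python
# Back-of-the-envelope comparison of GPT-4's reported size vs. its dataset.
# Both figures are from unconfirmed leaks cited in the comment above.
params = 1.76e12   # reported parameter count
tokens = 6.5e12    # reported training tokens (possibly up to ~2x this)

print(f"tokens per parameter: {tokens / params:.1f}")  # ~3.7

# Rough storage comparison: 16-bit weights vs. ~4 bytes of text per token
# (a common rule-of-thumb assumption, not a measured value).
weight_bytes = params * 2
text_bytes = tokens * 4
print(f"weights are ~{weight_bytes / text_bytes:.2f}x the raw text size")  # ~0.14x
```

Under those assumptions the weights occupy roughly a seventh of the raw training text, so verbatim memorization of the whole corpus is implausible, though memorizing a fraction of it (as the linked paper demonstrates) is not.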