#Memorization

N-gated Hacker News (@ngate)
2025-12-24

Ah, the age-old quest for the perfect hack! 🤔 Well, here comes the article, promising to revolutionize your brain with a sprinkle of this and that. ✨ Just remember, if this method were truly foolproof, its author would be running Apple by now, not blogging about it. 😂
gwern.net/spaced-repetition

2025-12-04

As always: #OpenData persistently available at:
Du, K. (2025). Reconstructing Shuffled Text (Derived Text Formats) [Data set]. Zenodo. doi.org/10.5281/zenodo.17198425
#CLS #CCLS25 #DTF #LiteraryComputing #LLM #Memorization

Arie van Deursen 🇪🇺🇳🇱 (@avandeursen@mastodon.acm.org)
2025-11-11

In our own work, we studied memorization in language models for code, and ways of getting them to regurgitate their training data:

> From the training data that was identified to be potentially extractable we were able to extract 47% from a CodeGen-Mono-16B code completion model.

> We also observe that models memorise more, as their parameter count grows, and that their pre-training data are also vulnerable to attack

dl.acm.org/doi/abs/10.1145/359
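
The core probe behind such extraction attacks is simple to sketch: prompt the model with the prefix of a suspected training sample and check whether greedy decoding reproduces the true suffix verbatim. A minimal illustrative version follows; the small CodeGen checkpoint, the 50/50 prefix/suffix split, and the exact-match criterion are illustrative choices, not the paper's exact protocol.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small sibling of the CodeGen-Mono family, for a runnable demo;
# the paper's experiments target CodeGen-Mono-16B.
model_name = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def is_memorized(sample: str, prefix_frac: float = 0.5) -> bool:
    """True if greedy decoding of the prefix reproduces the suffix verbatim."""
    ids = tokenizer(sample, return_tensors="pt").input_ids[0]
    split = int(len(ids) * prefix_frac)
    prefix, suffix = ids[:split], ids[split:]
    generated = model.generate(
        prefix.unsqueeze(0),
        max_new_tokens=len(suffix),
        do_sample=False,  # greedy: the model's single most likely continuation
    )[0][split:]
    return torch.equal(generated, suffix)
```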

#memorization #atemlos

Arie van Deursen 🇪🇺🇳🇱 (@avandeursen@mastodon.acm.org)
2025-11-11

Ruling in GEMA v. OpenAI:

> Both the memorization in the language models and the reproduction of the song lyrics in the chatbot's outputs constitute infringements of the exploitation rights under copyright law.

justiz.bayern.de/gerichte-und-

#atemlos #openai #copyright #memorization #gema #chatgpt

Hacker News (@h4ckernews)
2025-11-07
N-gated Hacker News (@ngate)
2025-06-13

The New York Times thinks a turtle poem will "win your heart" 🐢💔—because nothing screams "captivating" like slow-moving reptiles and deep dives into poetic gravity. 🎼✨ Meanwhile, they offer a tool to help memorize it, as if anyone is clamoring to recite turtle verses at parties. 🎉📜
nytimes.com/interactive/2025/0

Erik Jonker (@ErikJonker)
2025-06-07

Interesting: "GPT-style models have a fixed memorization capacity of approximately 3.6 bits per parameter."
venturebeat.com/ai/how-much-in

2025-06-06

How much information do LLMs really memorize? Now we know, thanks to Meta, Google, Nvidia and Cornell https://venturebeat.com/ai/how-much-information-do-llms-really-memorize-now-we-know-thanks-to-meta-google-nvidia-and-cornell/ #AI #memorization #copyright

Text Shot: Jack Morris, the lead author, explained via the social network X that “training on more data will force models to memorize less per-sample.”

These findings may help ease concerns around large models memorizing copyrighted or sensitive content.

If memorization is limited and diluted across many examples, the likelihood of reproducing any one specific training example decreases. In essence, more training data leads to safer generalization behavior, not increased risk.
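
For a sense of scale, a back-of-the-envelope calculation with the reported ~3.6 bits/parameter figure; the model and dataset sizes below are illustrative, and 16 bits of information per token is a rough assumption:

```python
# Back-of-the-envelope use of the ~3.6 bits/parameter figure.
params = 8e9            # an 8B-parameter model (illustrative)
tokens = 15e12          # trained on 15T tokens (illustrative)
bits_per_token = 16     # rough information content per token; an assumption

capacity_bits = 3.6 * params
data_bits = tokens * bits_per_token

print(f"memorization budget:  {capacity_bits / 8 / 1e9:.1f} GB")   # ~3.6 GB
print(f"training data:        {data_bits / 8 / 1e12:.1f} TB")      # ~30.0 TB
print(f"memorizable fraction: {capacity_bits / data_bits:.1e}")    # ~1.2e-04
```

With a fixed budget spread over ever more data, the share any single sample can claim shrinks — the "memorize less per-sample" effect quoted above.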

WIST Quotations Has Moved! (@wist@my-place.social)
2025-04-16

A quotation from Montaigne

I gladly return to the subject of the ineptitude of our education. Its goal has been to make us not good or wise, but learned; it has attained this goal. It has not taught us to follow and embrace virtue and wisdom, but has imprinted in us their derivation and etymology. We know how to decline virtue, if we cannot love it. If we do not know what wisdom is by practice and experience, we know it by jargon and by rote.
 
[Je retombe volontiers sur ce discours de l’ineptie de nostre institution : Elle a eu pour sa fin, de nous faire, non bons & sages, mais sçavans : elle y est arrivée. Elle ne nous a pas appris de suyvre & embrasser la vertu & la prudence : mais elle nous en a imprimé la derivation & l’etymologie. Nous sçavons decliner vertu, si nous ne sçavons l’aymer. Si nous ne sçavons que c’est que prudence par effect, & par experience, nous le sçavons par jargon & par cœur.]

Michel de Montaigne (1533-1592) French essayist
Essay (1578), “Of Presumption [De la Presomption],” Essays, Book 2, ch. 17 (2.17) (1595) [tr. Frame (1943)]

Sourcing, notes, alternate translations: wist.info/montaigne-michel-de/…

#quote #quotes #quotation #qotd #montaigne #education #learning #meaning #memorization #morality #rote #school #understanding #virtue #wisdom

WIST Quotations Has Moved! (@wist@my-place.social)
2025-03-12

A quotation from Montaigne

We readily inquire, “Does he know Greek or Latin?” “Can he write poetry and prose?” But what matters most is what we put last: “Has he become better and wiser?” We ought to find out not who understands most but who understands best. We work merely to fill the memory, leaving the understanding and the sense of right and wrong empty.
 
[Nous enquerons volontiers, Sçait-il du Grec ou du Latin ? escrit-il en vers ou en prose ? mais, s’il est devenu meilleur ou plus advisé, c’estoit le principal, & c’est ce qui demeure derriere. Il falloit s’enquerir qui est mieux sçavant, non qui est plus sçavant. Nous ne travaillons qu’à remplir la memoire, & laissons l’entendement & la conscience vuide.]

Michel de Montaigne (1533-1592) French essayist
Essay (1572-1578), “Of Pedantry [Du pedantisme],” Essays, Book 1, ch. 24 (1.24) (1595) [tr. Screech (1987), ch. 25]

Sourcing, notes, alternate translations: wist.info/montaigne-michel-de/…

#quote #quotes #quotation #Montaigne #comprehension #education #evaluation #improvement #learning #memorization #rubric #school #student #teaching #understanding #wisdom

WIST Quotations Has Moved! (@wist@my-place.social)
2025-03-11

A quotation from William Feather

An education isn’t how much you have committed to memory, or even how much you know. It’s being able to differentiate between what you do know and what you don’t. It’s knowing where to go to find out what you need to know, and it’s knowing how to use the information once you get it.

William Feather (1889-1981) American publisher, author
(Attributed)

Sourcing, notes: wist.info/feather-william/1479…

#quote #quotes #quotation #application #competence #education #ignorance #knowledge #memorization #research

Andrew Shields (@AndrewShields@mas.to)
2025-03-07

Counting to high numbers and reciting poems to suppress evil thoughts in Charles Dickens’s “Hard Times” (1854). #111Words #CharlesDickens #HardTimes #Poetry #Counting #Recitation #Memorization andrewjshields.blogspot.com/20

Rohingya Charity Organization (@abdulwajed)
2025-02-25

Help us repair our Center for children. We have only 4 days left to rebuild it, so the students can pray Tarawih and start memorizing Quran there, insha’Allah.

Please give whatever you can.

Jazakallahu khairan.

Donate:
launchgood.com/v4/campaign/spo

Miguel Afonso Caetano (@remixtures@tldr.nettime.org)
2025-01-15

"To prevent AI models from memorizing their input, we know exactly one robust method: differential privacy (DP). But crucially, DP requires you to precisely define what you want to protect. For example, to protect individual people, you must know which piece of data comes from which person in your dataset. If you have a dataset with identifiers, that's easy. If you want to use a humongous pile of data crawled from the open Web, that's not just hard: that's fundamentally impossible.

In practice, this means that for massive AI models, you can't really protect the massive pile of training data. This probably doesn't matter to you: chances are, you can't afford to train one from scratch anyway. But you may want to use sensitive data to fine-tune them, so they can perform better on some task. There, you may be able to use DP to mitigate the memorization risks on your sensitive data.

This still requires you to be OK with the inherent risk of the off-the-shelf LLMs, whose privacy and compliance story boils down to "everyone else is doing it, so it's probably fine?".

To avoid this last problem, and get robust protection, and probably get better results… Why not train a reasonably-sized model entirely on data that you fully understand instead?"
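
As a concrete illustration of the DP fine-tuning route the post mentions, here is a minimal DP-SGD sketch using the Opacus library; the toy model, data, and hyperparameters are placeholders, and the blog post does not prescribe this particular stack:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy stand-ins for a real fine-tuning setup.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
dataset = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
loader = DataLoader(dataset, batch_size=32)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

privacy_engine = PrivacyEngine()
# Wraps model/optimizer/loader so each step clips per-sample gradients and
# adds calibrated Gaussian noise -- the core DP-SGD mechanism.
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,   # more noise -> stronger privacy, worse utility
    max_grad_norm=1.0,      # per-sample gradient clipping bound
)

for epoch in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

# Privacy accounting: epsilon spent at a fixed delta for the steps taken.
print("epsilon:", privacy_engine.get_epsilon(delta=1e-5))
```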

desfontain.es/blog/privacy-in-

#AI #GenerativeAI #LLMs #SLMs #Privacy #DifferentialPrivacy #Memorization

2025-01-12

So, I got a card from a bank to be able to use the money I have stored there. So far, so good.

However, the card comes with a pin.

They tell you to memorize the pin and to destroy the piece of paper on which it came.

What bloody world are they living in??? One where Santa, Orcs, and Wizards are real?

How the fuck am I going to memorize this pin, on top of all the *other* pins I'm supposed to memorize?

I ask how I can change it.

They tell me it cannot be changed.

:headache: :why:

Okay, okay. I'm just going to fucking ignore the bank's stupid security recommendations. No, I'm not writing it down. I have my system, but good grief. Wouldn't it be nice if they joined us, you know, in reality instead of living in a fantasy world?

:holdthepain:

#banking #pin #memorization #security

2024-12-14

'Memorization With Neural Nets: Going Beyond the Worst Case', by Sjoerd Dirksen, Patrick Finke, Martin Genzel.

jmlr.org/papers/v25/23-1376.ht

#memorization #interpolation #interpolating
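
To see what "memorization" means in this interpolation literature, a minimal sketch: an over-parameterized two-layer net trained to fit random labels exactly. The sizes and optimizer settings below are illustrative choices, not taken from the paper.

```python
import torch
from torch import nn

torch.manual_seed(0)
n, d = 32, 10
X = torch.randn(n, d)   # random inputs
y = torch.randn(n, 1)   # random targets: nothing to generalize from

# Width 512 >> n: enough capacity to interpolate all n points.
net = nn.Sequential(nn.Linear(d, 512), nn.ReLU(), nn.Linear(512, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    opt.zero_grad()
    loss = ((net(X) - y) ** 2).mean()
    loss.backward()
    opt.step()

print(f"final training MSE: {loss.item():.2e}")  # ~0: all n points memorized
```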
