#generalization

2025-06-17

'Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis', by Hongru Yang, Yingbin Liang, Xiaojie Guo, Lingfei Wu, Zhangyang Wang.

jmlr.org/papers/v26/23-0832.ht

#pruning #pruned #generalization

2025-06-11

Humans can apply solutions of past problems to new problems. @gershbrain @nicoschuck &co reveal the neural correlates of #generalization and show that humans apply past policies in a reward-sensitive manner that leads to high performance @PLOSBiology plos.io/3SJPMof

Experimental design. Participants completed a gem collector game while their brain activity was measured with functional magnetic resonance imaging. On each trial in the experiment, gems with distinct shapes could be resold for either a gain or a loss. Participants made a choice between four cities from around the world, each leading to a distinct collection of gems. To maximize profit overall, participants needed to choose the city best suited to the selling prices shown on each trial. Each block consisted of 32 training trials that included feedback (top), followed by a mixture of 16 training trials with feedback and 20 test trials without feedback (middle). Bottom: Brain regions of interest. The four regions include occipitotemporal cortex (OTC), the medial temporal lobe (MTL), orbitofrontal cortex (OFC) and dorsolateral prefrontal cortex (DLPFC). Regions were defined using FreeSurfer.
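
The choice rule in that design (pick the city whose gem collection is worth the most at the current selling prices) is just an expected-value comparison. A minimal Python sketch; the city names, gem counts and prices below are made-up illustrative values, not the actual task parameters:

```python
# Toy sketch of the gem collector choice rule described above.
# Gem counts per city and the selling prices are illustrative numbers,
# not the parameters used in the actual experiment.
gem_counts = {
    "Tokyo":  {"square": 3, "triangle": 1, "circle": 0},
    "Paris":  {"square": 0, "triangle": 2, "circle": 2},
    "Sydney": {"square": 1, "triangle": 0, "circle": 3},
    "Cairo":  {"square": 2, "triangle": 2, "circle": 1},
}
selling_prices = {"square": 4, "triangle": -2, "circle": 1}  # gain or loss per gem

def expected_profit(city):
    """Profit from reselling the chosen city's gem collection at today's prices."""
    return sum(n * selling_prices[shape] for shape, n in gem_counts[city].items())

print({city: expected_profit(city) for city in gem_counts})
print("Best choice today:", max(gem_counts, key=expected_profit))
```
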
trndgtr.com (@trndgtr)
2025-05-27

AI Learns by Watching - Sholto & Trenton on Dwarkesh

Hacker News (@h4ckernews)
2025-04-22
Games at Work dot biz (@gamesatwork_biz)
2025-04-14

e509 — Maverick and Marbles

e509 with Michael and Michael - stories and discussion all around #AI, #LLMs, #llamas, generated #Quake, #grokking, #generalization and much more.

gamesatwork.biz/2025/04/14/e50

2025-01-30

People value us for the value (they believe) we (might) add to them.

Generalizing of course, but it's all transactional. There's no (longer) valuing people for just who they are.

#society #people #life #generalization

Victoria Stuart 🇨🇦 🏳️‍⚧️ (@persagen)
2025-01-17

Grokking at the Edge of Numerical Stability
https://arxiv.org/abs/2501.04697
https://old.reddit.com/r/MachineLearning/comments/1i34keg/grokking_at_the_edge_of_numerical_stability
https://en.wikipedia.org/wiki/Grokking_(machine_learning)

* sudden generalization after prolonged overfitting
* massively overtrained NNs can acquire "emergent"/supra performance: eerie/unexpected capabilities
* an unexpected/accidental finding
* mechanisms starting to be understood (toy training sketch below)

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
https://arxiv.org/abs/2405.15071
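
For intuition about the notes above, a minimal grokking-style training sketch (not the papers' code): a small PyTorch network on modular addition, deliberately trained far past the point of fitting the training set. With strong weight decay, test accuracy typically jumps long after train accuracy saturates. All hyperparameters here are illustrative assumptions.

```python
# Minimal grokking-style sketch (illustrative, not the papers' code):
# a small embedding + MLP on (a + b) mod P, massively overtrained.
import torch
import torch.nn as nn

P = 97
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))   # all (a, b)
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
train_idx, test_idx = perm[: len(pairs) // 2], perm[len(pairs) // 2 :]

embed = nn.Embedding(P, 64)
mlp = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, P))
opt = torch.optim.AdamW(
    list(embed.parameters()) + list(mlp.parameters()),
    lr=1e-3, weight_decay=1.0,            # weight decay is key to the effect
)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        x = embed(pairs[idx]).flatten(1)  # concatenate the two token embeddings
        return (mlp(x).argmax(1) == labels[idx]).float().mean().item()

for step in range(50_000):                # deliberately overtrained
    x = embed(pairs[train_idx]).flatten(1)
    loss = loss_fn(mlp(x), labels[train_idx])
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 1_000 == 0:
        print(step, "train", round(accuracy(train_idx), 3), "test", round(accuracy(test_idx), 3))
```
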

#LLM #ML #grokking #NN #emergence #generalization
2024-12-20

A post from August 2024 by @grimalkina, boosted by someone on another instance, about why to report demographics in research even when you're not studying those groups. This seems like a great primer for people who have little background in basic #sampling and #generalization (for some reason I can't link/boost from here, so):

mastodon.social/@grimalkina/11

My 2 cents (already at least partially covered by Dr. Hicks):

1. Your study is never just about your study. Good science is #open and reusable. e.g., maybe your study on tech-enabled healthcare access isn't specifically about LGBTQ+ or Hispanic people, but what are you doing to help a researcher who comes along in 10 years? That information will change what they find and report.

2. Marginalized groups are often minorities, meaning representative probability samples (or --uncomfortable gesture-- convenience samples) for bread-and-butter research frequently have subpopulations too small for reasonable power in correlations, group differences, etc. That's just reality. It's also a big problem for our understanding of #marginalized + #minority groups. Oversampling or targeted studies of those groups are important. It's also important to have a large number of less-targeted studies with relevant information that can be synthesized later (see #1): one study with 1.3% trans participants doesn't tell us much about the trans population, but 20 studies, each of which has 1.3% trans participants, could tell us meaningful things. (A rough arithmetic sketch follows point 3.)

3. Representation is important. My belief is that #marginalized+minoritized people need their identities and existence public and constant. In #science, both they and other people consuming the research will benefit from being reminded that they are there, almost always, in our #research.
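
Rough arithmetic behind point 2, with assumed (hypothetical) numbers: a per-study sample of 400 and a 1.3% subgroup.

```python
# Back-of-the-envelope sketch for point 2 (all numbers are assumptions):
# a ~1.3% subgroup is tiny in any single study, but pooling many studies
# that all report it yields a usable combined subsample.
import math

prevalence = 0.013      # ~1.3% of participants
n_per_study = 400       # hypothetical sample size per study
n_studies = 20

one_study = prevalence * n_per_study
pooled = n_studies * one_study

# Standard errors shrink roughly with 1/sqrt(n), so the precision gain is:
gain = math.sqrt(pooled / one_study)
print(f"expected subgroup n in one study: {one_study:.0f}")
print(f"expected subgroup n across {n_studies} studies: {pooled:.0f}")
print(f"approximate precision gain: {gain:.1f}x")
```
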

2024-12-06

'Generalization on the Unseen, Logic Reasoning and Degree Curriculum', by Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Kevin Rizk.

jmlr.org/papers/v25/24-0220.ht

#sparse #learns #generalization

2024-12-03

'Mentored Learning: Improving Generalization and Convergence of Student Learner', by Xiaofeng Cao, Yaming Guo, Heng Tao Shen, Ivor W. Tsang, James T. Kwok.

jmlr.org/papers/v25/23-1213.ht

#learners #learner #generalization

2024-12-01

'Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK', by Hongru Yang, Ziyu Jiang, Ruizhe Zhang, Yingbin Liang, Zhangyang Wang.

jmlr.org/papers/v25/23-0831.ht

#sparse #gradient #generalization

Jan Vlug (@janvlug)
2024-10-24

@schizanon @strebski @fossdd I think and are important factors for war and killing. I try to treat living beings as .

2024-09-01

Can transformers "think"?

Recent research shows that transformer models can solve tasks requiring several logical steps almost without error. For example, deriving B from statement A and then reasoning all the way to C. Surprisingly, this is achieved without Chain-of-Thought or special prompts, using only classic GPT-2. Let's look at how transformers "think" when solving reasoning tasks, and write code for it with the Hugging Face library.

habr.com/ru/articles/840136/

#GPT #grokking #AI_memory #reasoning_tasks #artificial_general_intelligence #generalization #transformer #transformer_memory
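
A hedged sketch of the kind of probe the article describes (not its actual code): load plain GPT-2 through Hugging Face Transformers and ask a two-hop question with greedy decoding and no Chain-of-Thought prompting. The prompt is an invented toy example.

```python
# Probe a vanilla GPT-2 on a toy two-hop pattern (A -> B -> C), with no
# Chain-of-Thought prompting. Sketch only; the prompt is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # the classic GPT-2 mentioned in the post
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = (
    "Alice is the mother of Bob. Bob is the father of Carol.\n"
    "Therefore, Alice is Carol's"
)
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=5,
    do_sample=False,                      # greedy decoding, no special prompting
    pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad-token warning
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
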

2024-07-29

#8
The benefits of #Multitask studies are huge!

Most importantly, they allow testing the prevalent assumption of #generalization, yielding results with a high chance of generalizing beyond the lab. What's more, they even enable the discovery of *new concepts*!

2024-07-14

'Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance', by Lisha Chen, Heshan Fernando, Yiming Ying, Tianyi Chen.

jmlr.org/papers/v25/23-1287.ht

#objectives #objective #generalization

Ralph Straumann (@rastrau@swiss.social)
2024-07-10

@markstos Impressive work. Connectivity, to me, implies network / topological metrics. I’ve experimented a bit with betweenness centrality (en.wikipedia.org/wiki/Betweenn) in Python and found it promising (also, e.g., for #network #generalization). However, it’s computationally expensive. #gis
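
A small sketch of that kind of experiment, with a random geometric graph standing in for real street-network data; NetworkX's sampling parameter k gives an approximate betweenness that keeps the cost manageable:

```python
# Approximate betweenness centrality with NetworkX (illustrative graph,
# not the original analysis). Exact Brandes betweenness is O(V*E), so we
# estimate it from k pivot nodes instead.
import networkx as nx

G = nx.random_geometric_graph(500, radius=0.08, seed=42)  # stand-in network
bc = nx.betweenness_centrality(G, k=100, normalized=True, seed=42)

# Nodes that carry the most shortest paths, e.g. candidates to keep
# when generalizing the network cartographically:
top = sorted(bc, key=bc.get, reverse=True)[:20]
print("top-20 nodes by approximate betweenness:", top)
```
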
