#InContextLearning

2025-12-15

zai-org introduces **SCAIL**, an AI model that produces professional-quality character animation via in-context learning. SCAIL combines a 3D-consistent pose representation with a "full-context pose injection" technique to learn detailed, accurate motion. The approach helps optimize spatial-temporal relationships in video. Explore it on the blog, Hugging Face, and GitHub.

#AIAnimation #Technology #AIModel #CharacterAnimation #3DAnimation #InContextLearning #ArtificialIntelligence

reddit.com/r/LocalLLaMA/commen

Tero Keski-Valkama (@tero@rukii.net)
2025-07-31

In-context learning has been consistently shown to exceed hand-crafted neural learning algorithms across the board.

But it's limited by the length of the context. Even neural architectures that allow the context to grow without bound come with high costs and scaling problems.

Is there a way to incorporate new knowledge learned in-context back into neural network weights?

Of course there is!

Let's imagine we have a lot of data, sequences of instructions and outputs where in-context learning happens.

From this data we can produce a synthetic dataset that presents the newly learned knowledge, and continually train the model on it.

Of course this is super slow and inconvenient. But as a result we get a dataset that pairs contexts where in-context learning happened with the old model weights and the new model weights.
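Here is a minimal, hypothetical sketch of that heavy pipeline, using a toy PyTorch language model as a stand-in for the real one: for each logged in-context-learning interaction, synthesize training examples from it, fine-tune a copy of the model, and record the resulting weight delta. The names (`make_toy_lm`, `synthesize_examples`, `finetune_copy`) and all shapes are illustrative assumptions, not an existing API; real synthetic-data generation would involve prompting the model rather than reusing the raw context.

```python
# Sketch of the "heavy pipeline": log ICL interactions -> synthesize data ->
# continually train a copy -> record (context, weight delta). Toy scale only.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM = 1000, 64

def make_toy_lm() -> nn.Module:
    # Stand-in for the large model; a real pipeline would load an actual LLM.
    return nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, VOCAB))

def synthesize_examples(context_tokens: torch.Tensor) -> torch.Tensor:
    # Placeholder for "produce a synthetic dataset that presents the newly
    # learned knowledge"; here it simply reuses the context tokens.
    return context_tokens

def finetune_copy(model: nn.Module, examples: torch.Tensor, steps: int = 10) -> nn.Module:
    # Continually train a copy of the model on the synthetic examples.
    new_model = copy.deepcopy(model)
    opt = torch.optim.SGD(new_model.parameters(), lr=1e-2)
    inputs, targets = examples[:, :-1], examples[:, 1:]
    for _ in range(steps):
        logits = new_model[1](new_model[0](inputs))            # next-token logits
        loss = F.cross_entropy(logits.reshape(-1, VOCAB), targets.reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()
    return new_model

def flat_params(model: nn.Module) -> torch.Tensor:
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])

base = make_toy_lm()
dataset = []  # (context, weight_delta) pairs for the neural programmer below
for _ in range(3):  # pretend we logged three in-context-learning interactions
    context = torch.randint(0, VOCAB, (1, 32))
    updated = finetune_copy(base, synthesize_examples(context))
    dataset.append((context, flat_params(updated) - flat_params(base)))
```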

We can use this data to train a neural programmer model directly!

That model would take the context as input and, if in-context learning happened in those interactions, predict the changes to the neural network weights that would have resulted from running the long, heavy synthetic-data pipeline.

Instead of running the heavy pipeline, we can just use the neural programmer model to directly update the large model's weights based on the in-context learning it experienced, crystallizing the learnings into its long-term memory, not unlike what the hippocampus does in the human brain.
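A hedged sketch of what such a neural programmer could look like, continuing the toy setup above: a small hypernetwork encodes the context and regresses the flattened weight delta recorded by the heavy pipeline, and `apply_delta` then writes the predicted delta into the base model. Predicting a full flat weight vector only works at toy scale; a realistic version would presumably emit structured updates, e.g. low-rank deltas for selected layers. `NeuralProgrammer`, `apply_delta`, and the GRU encoder are assumptions made for illustration.

```python
# Sketch of a "neural programmer": context -> predicted weight delta.
import torch
import torch.nn as nn

VOCAB, DIM = 1000, 64

base = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, VOCAB))  # same toy LM as above
n_params = sum(p.numel() for p in base.parameters())
# (context, weight_delta) pairs as produced by the heavy pipeline above;
# random stand-ins here so this sketch runs on its own.
dataset = [(torch.randint(0, VOCAB, (1, 32)), torch.randn(n_params)) for _ in range(3)]

class NeuralProgrammer(nn.Module):
    """Reads a context and predicts the weight delta the heavy pipeline would produce."""
    def __init__(self, n_target_params: int):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.encoder = nn.GRU(DIM, DIM, batch_first=True)   # summarize the context
        self.head = nn.Linear(DIM, n_target_params)         # emit a flattened delta

    def forward(self, context_tokens: torch.Tensor) -> torch.Tensor:
        _, h = self.encoder(self.embed(context_tokens))
        return self.head(h[-1])                              # (batch, n_target_params)

def apply_delta(model: nn.Module, delta: torch.Tensor) -> None:
    # Crystallize the in-context learning by writing the predicted delta into the weights.
    offset = 0
    with torch.no_grad():
        for p in model.parameters():
            n = p.numel()
            p.add_(delta[offset:offset + n].view_as(p))
            offset += n

programmer = NeuralProgrammer(n_params)
opt = torch.optim.Adam(programmer.parameters(), lr=1e-3)
for _ in range(50):                                          # supervised training on recorded deltas
    for context, delta in dataset:
        loss = nn.functional.mse_loss(programmer(context).squeeze(0), delta)
        opt.zero_grad(); loss.backward(); opt.step()

# At use time the heavy pipeline is skipped entirely:
new_context = torch.randint(0, VOCAB, (1, 32))
apply_delta(base, programmer(new_context).squeeze(0))
```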

#AI #LLMs #NeuralNetworks #InContextLearning

Tero Keski-Valkama (@tero@rukii.net)
2023-03-06

Through scaling #DeepNeuralNetworks we have found in two different domains, #ReinforcementLearning and #LanguageModels, that these models learn to learn (#MetaLearning).

They spontaneously learn internal models with memory and learning capability which are able to exhibit #InContextLearning much faster and much more effectively than any of our standard #backpropagation based deep neural networks can.

These rather alien #LearningModels embedded inside the deep learning models are emulated by #neuron layers, but aren't necessarily deep learning models themselves.

I believe it is possible to extract these internal models which have learned to learn, out of the scaled up #DeepLearning #substrate they run on, and run them natively and directly on #hardware.

This allows those much more efficient learning models to be used either as #LearningAgents themselves, or as a substrate for further meta-learning.
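One hedged, toy illustration of what "extracting the learner out of the meta-learner" could mean in practice: treat a frozen model that already exhibits in-context learning as a teacher, and distill its behaviour on few-shot episodes into a much smaller student that can run standalone. The toy teacher, the episode format, and every name below are assumptions for illustration, not a description of any existing implementation.

```python
# Behavioural distillation of an in-context learner into a smaller model.
import torch
import torch.nn as nn

DIM, N_CLASSES = 16, 5

class ToyInContextLearner(nn.Module):
    # Stand-in for a model whose internal "learned learner" we want to extract.
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.rnn = nn.GRU(dim + N_CLASSES, hidden, batch_first=True)
        self.head = nn.Linear(hidden, N_CLASSES)
    def forward(self, episode: torch.Tensor) -> torch.Tensor:
        out, _ = self.rnn(episode)
        return self.head(out[:, -1])                      # prediction for the query item

def make_episode(batch: int = 32, shots: int = 8) -> torch.Tensor:
    # Few-shot episode: (x, one-hot y) pairs followed by a query x with a blank label slot.
    xs = torch.randn(batch, shots + 1, DIM)
    ys = nn.functional.one_hot(torch.randint(0, N_CLASSES, (batch, shots + 1)), N_CLASSES).float()
    ys[:, -1] = 0.0                                       # hide the query label
    return torch.cat([xs, ys], dim=-1)

teacher = ToyInContextLearner(DIM, hidden=256).eval()     # pretend it's pretrained and frozen
student = ToyInContextLearner(DIM, hidden=32)             # small enough to "run on hardware"
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for _ in range(200):
    episode = make_episode()
    with torch.no_grad():
        target = teacher(episode).softmax(-1)             # teacher's in-context prediction
    loss = nn.functional.kl_div(student(episode).log_softmax(-1), target, reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()
```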

I have on-going #embodiment #research with a related goal, focused specifically on extracting (or distilling) the models out of the meta-models, here:
github.com/keskival/embodied-e

It is of course an open research problem how to do this, but I have a lot of ideas!

If you're inspired by this, or if you think the same, let's chat!
