[2504.00927] Multi-Token Attention
https://arxiv.org/abs/2504.00927
https://news.ycombinator.com/item?id=43562384
#ML #TransformerArchitecture #NeuralNetworks #transformers #attention
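To make the paper's core idea concrete: a minimal sketch of key-query convolution over attention logits, where a small 2D convolution mixes nearby (query, key) scores before the softmax so each attention weight can condition on several tokens at once. The kernel size, the masking scheme, and the single-head simplification here are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of the key-query convolution idea from arXiv:2504.00927:
# convolve attention logits over the query/key axes before softmax.
import torch
import torch.nn.functional as F

def multi_token_attention(q, k, v, conv_kernel):
    # q, k, v: (batch, seq, dim); conv_kernel: (1, 1, 3, 3), assumed learned
    d = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d**0.5            # (batch, seq, seq)
    causal = torch.tril(torch.ones(q.size(1), q.size(1), dtype=torch.bool))
    logits = logits.masked_fill(~causal, 0.0)            # zero future logits pre-conv (assumption)
    logits = F.conv2d(logits.unsqueeze(1), conv_kernel,  # mix nearby (query, key) logits
                      padding=1).squeeze(1)
    logits = logits.masked_fill(~causal, float("-inf"))  # re-mask after convolution
    return F.softmax(logits, dim=-1) @ v

q = k = v = torch.randn(2, 16, 64)
kernel = torch.randn(1, 1, 3, 3) * 0.1                   # assumed small 3x3 kernel
out = multi_token_attention(q, k, v, kernel)             # (2, 16, 64)
```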
OpenAI o3 LLM: 87.5% High Score on ARC Prize Challenge
https://old.reddit.com/r/MachineLearning/comments/1hiq3tz/d_openai_o3_875_high_score_on_arc_prize_challenge
https://news.ycombinator.com/item?id=42473321
* GPT-3 scored 0%
* a rare benchmark where humans score high and LLMs score low
* average human performance on ARC-AGI is 85%
OpenAI o3 Breakthrough High Score on ARC-AGI-Pub
https://arcprize.org/blog/oai-o3-pub-breakthrough
https://arcprize.org/arc-agi-pub
OpenAI o3 beats 99.8% of competitive coders
https://old.reddit.com/r/MachineLearning/comments/1hiqptc/openais_o3_beats_998_competitive_coders_d
#LLM #OpenAI #OpenAI_o1 #OpenAI_o3 #GPT4o #ML #TransformerArchitecture #reasoning #COT #ChainOfThought #AGI #AI
[thread] OpenAI o1, o3 | OpenAI GPT-4o
https://en.wikipedia.org/wiki/OpenAI_o1
* generative pre-trained transformer
* formerly known within OpenAI as "Q*"
* o1 spends time "thinking" before it answers
* this makes it better at complex reasoning tasks, science, and programming than OpenAI GPT-4o
* the full version was released 2024-Dec-05
#LLM #OpenAI #OpenAI_o1 #OpenAI_o3 #GPT4o #ML #TransformerArchitecture #reasoning #COT #ChainOfThought #AGI #AI
Transformer Explainer: Interactive Learning of Text-Generative Models https://arxiv.org/abs/2408.04619v1 (neat visualization) #AI #transformerArchitecture
Generative AI exists because of the transformer (free for non-subscribers) - Financial Times https://ig.ft.com/generative-ai/ (useful visual explanation) #AI #TransformerArchitecture
A jargon-free explanation of how AI large language models work – Ars Technica https://arstechnica-com.cdn.ampproject.org/c/s/arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/amp/ (I like this one) #AI #technology #LLMs #transformerArchitecture #embeddings
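For reference alongside these explainers: a minimal sketch of the scaled dot-product attention at the heart of the transformer, the operation these visualizations walk through. This is the standard textbook formulation, not taken from any one of the linked articles.

```python
# Scaled dot-product attention: each token's output is a weighted mix of
# all value vectors, with weights softmax(Q K^T / sqrt(d)).
import torch
import torch.nn.functional as F

def attention(q, k, v):
    d = q.size(-1)
    weights = F.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 16, 64)   # (batch, seq, dim)
out = attention(q, k, v)             # (2, 16, 64)
```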