[2504.00927] Multi-Token Attention
https://arxiv.org/abs/2504.00927
https://news.ycombinator.com/item?id=43562384
#ML #TransformerArchitecture #NeuralNetworks #transformers #attention
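To make the paper's core idea concrete: a minimal sketch of key-query convolution over attention logits, where a small 2D convolution mixes nearby (query, key) scores before the softmax so each attention weight can condition on several tokens at once. The kernel size, the masking scheme, and the single-head simplification here are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of the key-query convolution idea from arXiv:2504.00927:
# convolve attention logits over the query/key axes before softmax.
import torch
import torch.nn.functional as F

def multi_token_attention(q, k, v, conv_kernel):
    # q, k, v: (batch, seq, dim); conv_kernel: (1, 1, 3, 3), assumed learned
    d = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d**0.5            # (batch, seq, seq)
    causal = torch.tril(torch.ones(q.size(1), q.size(1), dtype=torch.bool))
    logits = logits.masked_fill(~causal, 0.0)            # zero future logits pre-conv (assumption)
    logits = F.conv2d(logits.unsqueeze(1), conv_kernel,  # mix nearby (query, key) logits
                      padding=1).squeeze(1)
    logits = logits.masked_fill(~causal, float("-inf"))  # re-mask after convolution
    return F.softmax(logits, dim=-1) @ v

q = k = v = torch.randn(2, 16, 64)
kernel = torch.randn(1, 1, 3, 3) * 0.1                   # assumed small 3x3 kernel
out = multi_token_attention(q, k, v, kernel)             # (2, 16, 64)
```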
OpenAI o3 LLM: 87.5% High Score on ARC Prize Challenge
https://old.reddit.com/r/MachineLearning/comments/1hiq3tz/d_openai_o3_875_high_score_on_arc_prize_challenge
https://news.ycombinator.com/item?id=42473321
* GPT-3 scored 0%
* a rare benchmark where humans score high and LLMs score low
* average human performance on ARC-AGI is 85%
OpenAI o3 Breakthrough High Score on ARC-AGI-Pub
https://arcprize.org/blog/oai-o3-pub-breakthrough
https://arcprize.org/arc-agi-pub
OpenAI o3 beats 99.8% of competitive coders
https://old.reddit.com/r/MachineLearning/comments/1hiqptc/openais_o3_beats_998_competitive_coders_d
#LLM #OpenAI #OpenAI_o1 #OpenAI_o3 #GPT4o #ML #TransformerArchitecture #reasoning #COT #ChainOfThought #AGI #AI
[thread] OpenAI o1, o3 | OpenAI GPT-4o
https://en.wikipedia.org/wiki/OpenAI_o1
* generative pre-trained transformer
* formerly known within OpenAI as "Q*"
* o1 spends time "thinking" before it answers
* this makes it better at complex reasoning tasks, science, and programming than OpenAI GPT-4o
* the full version was released 2024-Dec-05
#LLM #OpenAI #OpenAI_o1 #OpenAI_o3 #GPT4o #ML #TransformerArchitecture #reasoning #COT #ChainOfThought #AGI #AI
Transformer Explainer: Interactive Learning of Text-Generative Models https://arxiv.org/abs/2408.04619v1 (neat visualization) #AI #transformerArchitecture
Generative AI exists because of the transformer (free for non-subscribers) - Financial Times https://ig.ft.com/generative-ai/ (useful visual explanation) #AI #TransformerArchitecture
A jargon-free explanation of how AI large language models work – Ars Technica https://arstechnica-com.cdn.ampproject.org/c/s/arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/amp/ (I like this one) #AI #technology #LLMs #transformerArchitecture #embeddings
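For reference alongside these explainers: a minimal sketch of the scaled dot-product attention at the heart of the transformer, the operation these visualizations walk through. This is the standard textbook formulation, not taken from any one of the linked articles.

```python
# Scaled dot-product attention: each token's output is a weighted mix of
# all value vectors, with weights softmax(Q K^T / sqrt(d)).
import torch
import torch.nn.functional as F

def attention(q, k, v):
    d = q.size(-1)
    weights = F.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 16, 64)   # (batch, seq, dim)
out = attention(q, k, v)             # (2, 16, 64)
```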