#JackDongarra Makes a Stand for Traditional #HPC: "US still doesn’t have a clear, long-term plan for what comes next.... U.S. risks falling behind."

Challenges to high-performance computing threaten #US #innovation

The #AI boom has led chip makers to focus on #FP16 and #FP8, not the #FP64 used by scientific research. If chip companies stop making the parts that #scientists need, then it could become harder to do important research.
https://theconversation.com/challenges-to-high-performance-computing-threaten-us-innovation-255188

Meet DeepSeek-V3 — the 671 billion parameter beast that’s making OpenAI and Anthropic nervous 👀

🧠 It’s:
✔ Faster
✔ Cheaper ($5.6M training vs $60M+)
✔ More accurate on key tasks like coding, math, and comprehension
✔ Open-source + MIT licensed
✔ Deployable across NVIDIA, AMD & Huawei

📊 Performance Highlights:
🔹 MMLU: 88.5%
🔹 HumanEval: 82.6%
🔹 DROP: 91.6
🔹 MATH-500: 90.2%
🔹 Chinese C-Eval: 86.5%

But wait... ⚠️

🚨 Your data goes to Chinese servers.
🚨 It dodges politically sensitive questions.
🚨 It’s already being banned by gov agencies for “privacy risks.”

So is it the best LLM of 2025 or a privacy nightmare?

📥 Read the full analysis report here → https://deepseekagi.org/deepseek-v3-architecture/

💬 Drop your thoughts in the comments 👇
#DeepSeekV3 #AIRevolution #GPT4 #Claude3 #OpenSourceAI #AIComparison #MoE #FP8 #FutureTech #FacebookAI #LLMBattle

Triple bird 🐦‍⬛
#birds #vsco #googlepixel #fp8 #fujipro800z

🧐 Welcome to the thrilling world of "#DeepSeek," where they unleash their groundbreaking #FP8 #GEMM #Kernels, as if these buzzwords mean anything to normal humans. 🤖✨ Now you too can revel in the #excitement of "#fine-grained #scaling," because who doesn't dream of spending their weekends scaling kernels? 🎉 #GitHub's #navigation menu is undoubtedly the real star here, stealing the show with its riveting toggle action. 🚀
https://github.com/deepseek-ai/DeepGEMM #tech #HackerNews #ngated

DeepSeek Open Sources DeepGEMM: Clean and efficient FP8 GEMM kernels — https://github.com/deepseek-ai/DeepGEMM
#HackerNews #DeepSeek #DeepGEMM #FP8 #AI #Kernels #OpenSource

FP32, FP16, BF16, and FP8: understanding the main floating-point number types

Hi, Habr! Today let's talk about how modern GPU computing has become more flexible and efficient thanks to different floating-point formats (FP64, FP32, FP16, BFLOAT16, and FP8). These formats are not just numbers: each one has a specific area of application. Some tasks need speed and others need precision, and choosing the right floating-point type helps make the best use of resources. Let's work through examples and see which tasks each of these formats suits best.

https://habr.com/ru/companies/serverflow/articles/847068/

#FP16 #fp32 #FP64 #BF16 #floating_point #плавающая_запятая #fp8 #числа_с_плавающей_запятой #формат_с_плавающей_запятой
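
To make the speed-versus-precision trade-off concrete, here is a minimal Python sketch that derives the relative precision and largest finite value of each format from its bit layout. The exponent/mantissa splits are the commonly published ones, not taken from the post itself; in particular, treating FP8 as the two OCP variants E4M3 and E5M2 is an assumption, and the `FloatFormat` class is a hypothetical helper written only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    """Hypothetical helper describing a binary floating-point layout."""
    name: str
    exp_bits: int
    man_bits: int
    # True: the top exponent code is reserved for inf/NaN (IEEE 754 style).
    # False: assumed OCP E4M3 behaviour, where only the all-ones mantissa
    # at the top exponent encodes NaN and there is no infinity.
    ieee_specials: bool = True

    @property
    def total_bits(self) -> int:
        return 1 + self.exp_bits + self.man_bits  # sign + exponent + mantissa

    @property
    def bias(self) -> int:
        return 2 ** (self.exp_bits - 1) - 1

    @property
    def epsilon(self) -> float:
        # Spacing between 1.0 and the next representable value.
        return 2.0 ** -self.man_bits

    @property
    def max_finite(self) -> float:
        if self.ieee_specials:
            max_exp = (2 ** self.exp_bits - 2) - self.bias
            max_frac = 2.0 - 2.0 ** -self.man_bits
        else:
            max_exp = (2 ** self.exp_bits - 1) - self.bias
            max_frac = 2.0 - 2.0 ** -(self.man_bits - 1)
        return max_frac * 2.0 ** max_exp


FORMATS = {
    f.name: f
    for f in [
        FloatFormat("FP64", 11, 52),
        FloatFormat("FP32", 8, 23),
        FloatFormat("FP16", 5, 10),
        FloatFormat("BF16", 8, 7),
        FloatFormat("FP8-E5M2", 5, 2),
        FloatFormat("FP8-E4M3", 4, 3, ieee_specials=False),
    ]
}

if __name__ == "__main__":
    for f in FORMATS.values():
        print(f"{f.name:9s} bits={f.total_bits:2d} "
              f"eps={f.epsilon:.3e} max={f.max_finite:.4e}")
```

The table this prints shows the pattern the post describes: BF16 keeps the FP32 exponent width (similar range) but far fewer mantissa bits, while FP16 and the FP8 variants give up both range and precision in exchange for smaller storage.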

Introducing Phind-405B and faster, high quality #AI answers for everyone

🚀 Phind-405B: New flagship #llm, based on Meta Llama 3.1 405B, designed for programming & technical tasks. #Phind405B

⚡ Supports up to 128K tokens of context, with a 32K context window available at launch. Scores 92% on HumanEval and is great for web app design. #Programming #AIModel

💡 Trained on 256 H100 GPUs with FP8 mixed precision, 40% memory reduction. #DeepSpeed #FP8

⚡ Phind Instant Model: Super fast, 350 tokens/sec, based on Meta Llama 3.1 8B. #PhindInstant

🚀 Runs on NVIDIA TensorRT-LLM with flash decoding, fused CUDA kernels. #NVIDIA #GPUs

🔍 Faster Search: Prefetches results, saves up to 800ms latency, better embeddings. #FastSearch

👨‍💻 Goal: Help developers experiment faster, new features coming soon! #DevTools #Innovation

https://www.phind.com/blog/introducing-phind-405b-and-better-faster-searches

Intel Gaudi: the AI accelerator race

Hi, Habr! ServerFlow here again, and we want to talk about a pressing topic: AI and neural networks, or more precisely the hardware on which neural networks are trained and later run. In recent years this industry has resembled a fight-club arena, where tech giants compete fiercely to offer the most performant and efficient machine learning solutions. It does not look as though anyone in this arena has managed to unseat the market leader, NVIDIA, but the attempts continue. Intel is among those still trying: it has introduced its Gaudi line of AI accelerators and, not long ago, the updated Gaudi 3. Intel had previously attempted its own in-house AI accelerator designs, but this time the work was done by Habana Labs, which Intel acquired in 2019 for an impressive $2 billion.

https://habr.com/ru/companies/serverflow/articles/839090/

#npu #Intel #Gaudi #nvidia #h100 #ии #нейросети #gpu #b200 #FP8

Glad to be on here! My #introduction:

I'm an AI researcher in the UK, working at Graphcore - a semiconductor company who develop the #IPU (a #GPU alternative) 💻 I joined last year, having previously been at Oxford for my MSc.

My interests are in #numerics (especially #fp8 8️⃣), #LLMs, mixture-of-expert models, and anything to do with #solitaire ♣️ ♦️

Thanks to @thegradient for making this happen 😃

#FP8