#BlueField

Marcel SIneM(S)US simsus@social.tchncs.de
2024-11-10

I had no idea that Nvidia also makes network adapters...

#Nvidia #ConnectX, #BlueField: Attackers can manipulate data | Security heise.de/news/Nvidia-ConnectX- #Patchday #NvidiaConnectX

2024-10-30

Inside the World's Largest AI Supercluster xAI Colossus

---

Summary

🚀 #Largest AI Supercomputer: The #xAI #Colossus is built with over 100,000 GPUs, massive storage, and #High Speed Networking, designed for #AI projects beyond typical #Chatbot applications.

🛠️ #Record Breaking Construction: The facility, containing over 100,000 GPUs, was constructed in just 122 days—significantly faster than traditional #Supercomputers that often take years.

💧 #Advanced Liquid Cooling System: The #Data Halls are equipped with #State Of The Art liquid cooling, using separate pipes for hot water and cold water, which efficiently manages #Heat from the #GPU Servers.

📊 #Scalable GPU Racks: Each rack includes multiple #NVIDIA #HGX H100 units, optimized with #Compact and easily serviceable designs, featuring #Cooling Manifolds and advanced #GPU configurations.

🔌 #Innovative Power Management: #Tesla #Mega Packs support the power demands of the #AI Clusters by managing microsecond power fluctuations, stabilizing #Energy Delivery to the GPU units.

🌐 #Ethernet Driven Networking: Unlike most #Supercomputers, the cluster uses #Ethernet Networking with #NVIDIA #Bluefield 3 #DPUs and #Spectrum X switches, offering robust 400 Gbps connections for efficient data flow.

youtu.be/Jf8EPSBZU7Y?si=Gi1i66

Benjamin Carr, Ph.D. 👨🏻‍💻🧬 BenjaminHCCarr@hachyderm.io
2024-08-03

Widescreen Wonder: #LasVegasSphere
54,000 m² (~3.67 acre) interior LED display (16x16K) and an exterior LED display (‘Exosphere’) consisting of 1.23 million LED ‘pucks’. Driving all these pixels are around 150 #NVidia RTX #A6000 #GPU, installed in computer systems which are networked using NVidia #BlueField data processing units (#DPU) and NVidia #ConnectX6 NICs (up to 400 Gb/s), with visual content transferred from Sphere Studios in California. All this hardware uses 45 kW.
blogs.nvidia.com/blog/sphere-l

[Image: Las Vegas Sphere lit up at night]

2024-06-24

@karppinen Mellanox/NVIDIA has been trying to shove #BlueField into any customer box they can for years. They're even mandatory in some configurations (e.g. DGX) and there's no shortage of stock.

The "Self-Hosted DPU Controller" mode mentioned in the video has been officially supported with BSP 4.5.0 since December 2023, but customers like and us got access to that long before.

Probably Netflix is actually running this right now at 100 Watts, but we have no confirmation.

2024-06-24

@karppinen According to the video stream for that talk, this referred to a prototype that wasn't ready or in use at the time he talked about it, and it consumed at least 125 watts when last measured:

m.youtube.com/watch?v=q4TZxj-D

Around 23:00

So while the idea is nothing new and it's quite possible (people have been running Yocto Linux with nginx and Offload directly on for a while), those slides do not prove "can now" do it.

2022-05-01

What better #introduction to #fosstodon than a look at what I think was my first #foss contribution. I’d like to apologize now for the unbounded memory allocation bug I introduced. Oops.

github.com/php/php-src/commit/

Since then I’ve done a bunch of #sysadmin work (mostly #Solaris) followed by OS (#solaris #illumos #smartos #zfs) development and more recently various storage stuff (#nvme #nvmeof #roce #spdk #bluefield #dpu #smb).

I love #hiking through the #wilderness and other natural areas.
