#InfiniBand

Gabriele Svelto (gabrielesvelto@mas.to)
2025-12-13

RE: mastodon.social/@h4ckernews/11

This is technically impressive. I didn't expect to see RDMA support on macOS, let alone running over Thunderbolt. They seem to provide a standard InfiniBand Verbs API, but I couldn't find the sources of their driver (rdma_en5) or libraries yet. I guess they won't release the sources.

#RDMA #InfiniBand #macOS
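For context on the Verbs API mentioned above, device discovery looks roughly like the minimal sketch below. It targets libibverbs from Linux rdma-core; whether the macOS stack exposes exactly the same calls is an assumption.

```c
/*
 * Minimal sketch: enumerating RDMA devices through libibverbs, the
 * reference implementation of the Verbs API (Linux rdma-core).
 * Whether the macOS driver mentioned above exposes the same calls
 * is an assumption.  Build with: cc verbs_probe.c -libverbs
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs) {
        perror("ibv_get_device_list");
        return 1;
    }
    for (int i = 0; i < num; i++) {
        /* GUID is returned in network byte order; printed raw here. */
        printf("%s  guid=0x%016llx\n",
               ibv_get_device_name(devs[i]),
               (unsigned long long)ibv_get_device_guid(devs[i]));
    }
    ibv_free_device_list(devs);
    return 0;
}
```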

Orhun Parmaksız 👾 (orhun@fosstodon.org)
2025-12-09

Monitoring high-speed networks… in the terminal 😍

📡 **ibtop** — Real-time TUI monitor for InfiniBand networks.

💯 htop but for ultra-fast interconnects.

🦀 Written in Rust & built with @ratatui_rs

⭐ GitHub: github.com/JannikSt/ibtop

#rustlang #ratatui #tui #networking #infiniband #linux #terminal
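For a sense of what a monitor like this samples, here is a rough C sketch that reads the per-port traffic counters Linux exposes under sysfs. The device name (mlx5_0) and port number are assumptions about a typical Mellanox setup; this is not taken from the ibtop sources.

```c
/*
 * Rough sketch of the kind of data an ibtop-style monitor samples:
 * the per-port counters the Linux kernel exposes under sysfs.
 * Device name and port number below are assumptions.
 */
#include <stdio.h>

static unsigned long long read_counter(const char *path)
{
    unsigned long long v = 0;
    FILE *f = fopen(path, "r");
    if (f) {
        if (fscanf(f, "%llu", &v) != 1)
            v = 0;
        fclose(f);
    }
    return v;
}

int main(void)
{
    /* port_rcv_data / port_xmit_data count 32-bit words, so multiply
     * by 4 to get bytes; sample twice and diff to get a rate. */
    const char *rx = "/sys/class/infiniband/mlx5_0/ports/1/counters/port_rcv_data";
    const char *tx = "/sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_data";
    printf("rx %llu bytes, tx %llu bytes (cumulative)\n",
           read_counter(rx) * 4ULL, read_counter(tx) * 4ULL);
    return 0;
}
```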

2025-09-13

For today’s personal reminder that ‘sucking at something is the first step towards being sorta good at something’, I’ve spent a few hours hammering away at the keyboard trying to make two Mellanox Connect-IB cards talk. I can safely say I know more than I did. I even installed Debian 10 just so I could flash one with the stock firmware. Every bit of this feels exotic and complicated. But $30 in, and a few hours more, and I may have IP over #Infiniband at close to 40 Gb/s.

2025-08-07

#HPC #supercomputing #Infiniband While troubleshooting a performance issue on our NDR fabric, where nodes would randomly report high latency and lower-than-expected bandwidth (up to 50% less), I discovered a setting in opensm.conf that configured routing to be randomized rather than distributed/round-robin. Once I changed that setting (scatter_ports) back to the _DEFAULT_, I saw immediate and consistent performance improvements. See the before and after images. So, FYI, if your users are reporting random latency and bandwidth issues, double-check your opensm.conf routing. Also, I was using NVIDIA/Mellanox's clusterkit tool for the measurements.
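For reference, a sketch of the relevant opensm.conf entry; the default shown (0, i.e. deterministic port assignment) reflects stock opensm as I understand it, so double-check the man page for your installed version.

```
# opensm.conf (excerpt)
# scatter_ports: a non-zero value is used as a random seed and ports
# are picked in randomized order; 0 (the default) keeps the
# deterministic, evenly distributed assignment.
scatter_ports 0
```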

Five years after Intel spun off its #Omni-Path #interconnect tech into Cornelis Networks, its 400Gbps CN5000 line of switches and NICs is finally ready to do battle with its long-time rival, Nvidia's #InfiniBand www.theregister.com/2025/06/09/o... #HPC #AI via @theregister.com


2025-06-06

🚀 RoCE vs. InfiniBand: Asterfusion 800G Switch in AI Networking
In the realm of AI data centers, the debate between RoCE and InfiniBand is intensifying. Recent evaluations of Asterfusion's 800G RoCE switches reveal:

Lower P90 Inference Latency: Enhanced performance in AI inference tasks.

Higher Token Generation Rate: Improved throughput for large-scale AI models.

Cost Efficiency: Comparable performance to InfiniBand at a reduced cost.

These findings suggest that RoCE, especially when implemented on Asterfusion's 800G switches, offers a compelling alternative for AI networking infrastructure.

📖 Dive deeper into the analysis:
🔗 RoCE Beats InfiniBand? Asterfusion 800G Switch for AI Networking

#RoCE #InfiniBand #AIInfrastructure #800GSwitch #Asterfusion #DataCenterNetworking #HighPerformanceComputing #CloudSwith

2025-05-27

What does a #Mellanox aka #Nvidia #Infiniband card do when no device is connected to it?
Why does it run hotter than the sun?

2025-03-31

@lucas3d I have thought about this as an alternative: keep 2.5 Gb Ethernet as the main link, but use InfiniBand between the nodes as an internode backhaul/side channel from my main machine, for uploading ISOs and other large-scale data moves to the cluster.

My Proxmox cluster is Lenovo Tinys, so I'd also need to upgrade to ones that support PCIe to use a card in them, but that's already on the roadmap. It'd be nice if they made a Tiny with InfiniBand built in so we could use them as poor man's blades or something.

So maybe the answer is to keep three of the dual-port InfiniBand cards and sell the rest off to fund a migration to a 10 Gb Ethernet network. The three cards I keep would be for a future migration of the nodes.

Does InfiniBand support more than three nodes in a round robin? All the articles I've read used three nodes, and I don't know if that's a coincidence or an InfiniBand limitation. I don't know enough about InfiniBand to say.

#infiniband
#homelab
#networking

2025-03-31

My hardwired network is currently all 1 Gb Ethernet connections. I have a few machines that are capable of 2.5 Gb Ethernet and am debating going the cheap route and just getting a 2.5 Gb switch.

I also have two Sun InfiniBand 40 Gb switches and a ton of InfiniBand PCIe cards, but holy heck is rack-mount network gear loud AF.

I'd love to move to 40 Gb but can't seem to find decent desktop switches for it. Is it only data-centre-grade equipment? Should I just sell it off and go 10 Gb Ethernet?

I know InfiniBand offers a lot more than just basic networking, but only one or two of my machines would support those goodies.

#infiniband
#homelab
#networking

Gabriele Svelto [moved] (gabrielesvelto@fosstodon.org)
2025-03-19

Nvidia has been doing a lot of useless stuff lately, but this is actually a big deal. I wonder what the latency looks like on these switches. Direct-attach copper has traditionally been the preferred choice for low-latency applications, with optics used for longer connections where latency matters less. I'm curious whether this is going to change that.

techpowerup.com/334337/nvidia-

#Infiniband #Ethernet

2025-02-05

🚀 RoCE vs. InfiniBand: The Game-Changing Data Center Switch Test Results Revealed! ⚡

In AI and HPC networks, RoCE (RDMA over Converged Ethernet) and InfiniBand (IB) are often the go-to choices. Both offer low-latency, lossless transmission, but they come with key differences.

🔍 InfiniBand: A mature, low-latency protocol with specialized hardware, but higher TCO due to single-vendor reliance.

🔍 RoCEv2: More cost-effective, interoperable, and ideal for large-scale deployments like xAI’s AI cluster in Memphis!

Which one fits your needs? See the full comparison! 🔥

#RoCE #InfiniBand #AI #HPC #DataCenter #Networking #TechComparison #Ethernet #NetworkOptimization
cloudswit.ch/blogs/how-will-de

2024-12-06

The incredible power of the NVIDIA GB200 NVL72: inside an AI computing giant

Hi, Habr! If you've always wondered how truly high-performance systems are built, you've come to the right place. In today's article we explain how Nvidia combined 72 B200 accelerators into a single CUDA processor, the GB200 NVL72. We'll see how NVLink, Ethernet and InfiniBand are used to build an efficient interconnect. A hands-on discussion of the hardware awaits you behind the "Read more" button.

habr.com/ru/companies/serverfl

#сервер_флоу #GB200_NVL72 #nvlink #blackwell #nvidia_grace #Nvidia_Superchip #NVLink_Spine #infiniband #llm #SeverFlow

Mike P (FenTiger)
2024-09-06

Apparently if you push a wookie, you can expect to get a cookie in response.

I'm not sure I'll be trying this one myself.

Benjamin Carr, Ph.D. 👨🏻‍💻🧬 (BenjaminHCCarr@hachyderm.io)
2024-07-31

What If #OmniPath Morphs Into The Best #UltraEthernet?
Many #HPC centers in the #US – importantly #Sandia and #LawrenceLivermore, as well as the Texas Advanced Computing Center (#TACC) – wanted an alternative to #InfiniBand or proprietary interconnects like #HPE/#Cray's Slingshot, and they have been funding the redevelopment of Omni-Path. Now #CornelisNetworks is going to intersect its roadmap of Omni-Path switches and adapters with the #UEC roadmap.
nextplatform.com/2024/06/26/wh

aijobs.net => foorilla.com (aijobs@mstdn.social)
2024-07-14

HIRING: Principal GPU Capacity and Resource Management Engineer / US, CA, Santa Clara
💰 USD 272K+

👉 ai-jobs.net/J326446/

#Ansible #ComputerScience #CUDA #DeepLearning #Engineering #GPU #HPC #InfiniBand #Kubernetes #Linux

2024-03-03

InfiniBand on Windows is easy

I was prompted to write this short guide by an article on Habr, "A fast network in the home lab, or how I got involved with InfiniBand". I was very intrigued by the topic, but to my surprise I could find almost no information about running InfiniBand on Windows at home, for example in a home lab or a small office. There was information, of course: descriptions of how people used InfiniBand, what hardware they used, and what network performance they got, exactly like the author of the article mentioned above. But there was nothing about how to bring up a home network on IB, how to configure it, or where to start at all. After spending some time online, I came to the conclusion that most users, even those familiar with networks and their configuration, are simply afraid of the word InfiniBand. To them it is something complicated, used by megacorporations to build super-networks for supercomputers. The phrase "InfiniBand at home" horrifies them. And if the switch is unmanaged on top of that... well, you get the idea. From the little information I did manage to find, I distilled a simple beginner's guide to InfiniBand at home: what hardware you need, how to install the drivers, and how to set up an IB network between several PCs and an unmanaged switch. So, let's get started!

habr.com/ru/articles/797759/

#infiniband #40g #mellanox #коммутатор #сетевая_карта #высокая_производительность

Kevin Karhan :verified: (kkarhan@infosec.space)
2024-02-17

@melissabeartrix then consider #iSCSI or #FCoE (#FibreChannel over #Ethernet) over #OM5-fiber-based #100GBASE-SR1.2 as per 802.3bm-2015 Ethernet.

Just make sure your devices have #QSFP28 ports to plug in the LC-Duplex connectors of the fibers and support 64k jumbo frames...

Cuz unlike #InfiniBand that stuff is at least long-term useful and salvageable...

en.wikipedia.org/wiki/100_Giga
en.wikipedia.org/wiki/ISCSI
en.wikipedia.org/wiki/Fibre_Ch
en.wikipedia.org/wiki/Fibre_Ch
en.wikipedia.org/wiki/Fibre_Ch
en.wikipedia.org/wiki/ISCSI_Ex
en.wikipedia.org/wiki/InfiniBa

HPC Guru (HPC_Guru)
2023-10-10

Nvidia data center roadmap: a refresh in 2024, a transition to a new architecture later in 2024, and another architecture in 2025

Nvidia now breaks out its Arm-based products and its x86-based products on the roadmap

On the networking side, both InfiniBand and Ethernet are going to progress from 400Gbps to 800Gbps in 2024 and then to 1.6Tbps in 2025

Something missing from the roadmap is the NVSwitch/NVLink roadmap

servethehome.com/nvidia-data-c
