#h100

Kevin Karhan :verified: kkarhan@infosec.space
2025-04-24
An H100 purse with an Nvidia H11 GPU inside, offered for USD 65,536
eicker.news ᳇ tech news technews@eicker.news
2025-04-22

»#Huawei readies new #AIchip for mass shipment: It achieves performance comparable to #Nvidia's #H100 chip by combining two #910B processors through advanced integration techniques.« reuters.com/world/china/huawei #tech #media #news

Benjamin Carr, Ph.D. 👨🏻‍💻🧬 BenjaminHCCarr@hachyderm.io
2025-02-26

Sizing up #MI300A’s #GPU
It’s well ahead of #Nvidia’s #H100 PCIe in just about every major category of 32- or 64-bit operations. MI300A can achieve 113.2 TFLOPS of #FP32 throughput, with each FMA counting as two floating-point operations. For comparison, the H100 PCIe achieved 49.3 TFLOPS in the same test.
#AMD cut down #MI300X’s GPU to create the MI300A. 24 #Zen4 cores are a lot of #CPU power, and occupy one quadrant of the MI300 chip. But MI300’s main attraction is still the GPU.
chipsandcheese.com/p/sizing-up
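The FMA accounting above (each fused multiply-add counted as two floating-point operations) is also how theoretical peaks are derived. A back-of-envelope sketch in Python; the SM count and boost clock below are published H100 PCIe specs, used here as assumptions for illustration:

```python
# Peak FP32 = FP32 lanes x 2 ops per FMA x clock.

def peak_fp32_tflops(fp32_lanes: int, boost_ghz: float) -> float:
    """Theoretical FP32 peak, counting each FMA as two FLOPs."""
    return fp32_lanes * 2 * boost_ghz / 1000.0  # GFLOPS -> TFLOPS

# H100 PCIe: 114 SMs x 128 FP32 lanes, ~1.755 GHz boost clock
h100 = peak_fp32_tflops(114 * 128, 1.755)
print(f"H100 PCIe peak FP32: {h100:.1f} TFLOPS")  # ~51.2 theoretical
```

The 49.3 TFLOPS measured above is, plausibly, that theoretical peak minus real-world clock and occupancy losses.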

2025-02-23

The 4x #Nvidia #H100 SXM5 server in the new Festus cluster at Uni Bayreuth is the fastest system I've ever tested in #FluidX3D #CFD, achieving 78 GLUPs/s #LBM performance at ~1650W #GPU power draw. 🖖😋🖥️🔥
github.com/ProjectPhysX/FluidX
hpc.uni-bayreuth.de/clusters/f
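GLUPs/s (giga lattice-cell updates per second) is just grid cells times timesteps divided by runtime. A minimal sketch of that accounting; the grid size and runtime here are made-up illustrative numbers, not the Festus benchmark configuration:

```python
def glups(cells: int, timesteps: int, seconds: float) -> float:
    """LBM lattice-cell updates per second, in units of 1e9 (GLUPs/s)."""
    return cells * timesteps / seconds / 1e9

# Hypothetical run: a 512^3 grid advanced 10,000 LBM timesteps in 17.2 s
rate = glups(512**3, 10_000, 17.2)
print(f"{rate:.0f} GLUPs/s")  # ~78, matching the 4x H100 figure above
# Energy efficiency at the quoted ~1650 W GPU power draw:
print(f"{rate / 1650 * 1000:.0f} MLUPs/J")
```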

[Images: FluidX3D multi-GPU benchmark bar chart; FluidX3D benchmark running on 4x H100 SXM5 GPUs; GPU load during FluidX3D benchmark shown in nvidia-smi; CPU/GPU load during FluidX3D benchmark shown in my own monitoring application]
st1nger :unverified: 🏴‍☠️ :linux: :freebsd: st1nger@infosec.exchange
2025-02-05

#Huawei #HiSilicon #Ascend 910C is a version of the company's Ascend 910 processor for #AI training, introduced in 2019. By now, the performance of the Ascend 910 is barely sufficient for cost-efficient training of large AI models. Still, when it comes to inference, it delivers 60% of #Nvidia #H100 performance, according to researchers from #DeepSeek. While the Ascend 910C is not a performance champion, it can succeed in reducing China's reliance on Nvidia #GPUs. tomshardware.com/tech-industry

Okay, losing my mind here a bit. I just tested #OpenGL rendering under Linux on an #NVIDIA #H100 GPU, through #VirtualGL's #EGL backend.

And it worked... Renderer "NVIDIA H100/PCIe/SSE2", driver 555.42.06

I always understood the H100s to be incapable of OpenGL. But it seems I missed a crucial part in the H100 architecture doc (resources.nvidia.com/en-us-ten), shown in the image.

Except, I'm sure I tested OpenGL at some point under X, but it didn't work. So, did anything change (e.g. driver)?
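For reference, querying the renderer through VirtualGL's EGL back end looks roughly like this; the DRI device path is an assumption and will vary per system:

```shell
# Run glxinfo on the GPU via VirtualGL's EGL backend -- no X server
# attached to the GPU is needed. /dev/dri/card0 is an assumed device path.
vglrun -d /dev/dri/card0 glxinfo | grep "OpenGL renderer"
# Expected something like: OpenGL renderer string: NVIDIA H100/PCIe/SSE2
```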

DeepSeek tests: Huawei Ascend 910C reaches 60% of H100 performance, hoped to reduce reliance on NVIDIA
Tests by DeepSeek's research team show that Huawei's latest AI processor, the Ascend 910C, in inference […]
The post "DeepSeek tests: Huawei Ascend 910C reaches 60% of H100 performance, hoped to reduce reliance on NVIDIA" appeared first on unwire.hk (Hong Kong).
#ArtificialIntelligence #TechNews #AI #H100
unwire.hk/2025/02/05/huawei-91

Frankie ✅ Some_Emo_Chick
2025-02-01
eicker.news ᳇ tech news technews@eicker.news
2025-01-31

»#DeepSeek Debates: Chinese Leadership On #Cost, True #TrainingCost, Closed Model Margin Impacts #H100 Pricing Soaring, Subsidized Inference Pricing, #ExportControls, MLA.« semianalysis.com/2025/01/31/de #tech #media

2025-01-28

@PWS_1
At the moment, nobody can say what is behind the so-called cost efficiency of #DeepSeek.
Better #algorithms? More suitable, Chinese-framed #TrainingData? Far more training capacity via more Chinese slave laborers (cf. reverse lookup on analog phone and address data via transcription slaves)? Or does #China actually have access to enough power-hungry #H100 resources after all?
Yes, the benchmark results for DeepSeek are remarkable. #AI
m.youtube.com/watch?v=FJvSFTMN

BuySellRam.com jimbsr
2024-12-28

Global AI Giants GPU Resources Revealed: Over 12.4 Million H100 Equivalents Projected by 2025!

lesswrong.com/posts/bdQhzQsHjN

#AI #GPU #graphicscard #H100 #Tech

卡拉今天看了什麼 ai_workspace@social.mikala.one
2024-12-24

MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive – SemiAnalysis

Link
📌 Summary: This article offers an in-depth comparison of AMD's MI300X against Nvidia's H100 and H200 in training performance, user experience, and total cost of ownership. Although the MI300X appears superior on paper, its real-world performance falls short of expectations, mainly because AMD's public software stack ships with numerous bugs that sour the initial user experience. AMD must improve software quality and testing, and deliver a better out-of-the-box experience, in order to compete effectively. The article also gives AMD concrete recommendations to help it become a stronger competitor for AI training workloads.

🎯 Key Points:
- Performance comparison: the MI300X generally trails the H100/H200 in matrix-multiplication (GEMM) performance.
- User experience: MI300X public stable releases ship with many bugs, hurting the out-of-the-box experience.
- Total cost of ownership (TCO): although the MI300X's TCO is lower, its training performance on public stable releases is poor.
- Suggested improvements: AMD should add software development resources, improve its internal development processes, and strengthen automated testing to raise product quality.
- Software support: AMD should submit MLPerf training results to improve its competitiveness and transparency.

🔖 Keywords: #MI300X #H100 #H200 #TrainingPerformance #UserExperience
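The GEMM comparisons above reduce to one measurement: achieved FLOPS for a matrix multiply, counting 2·M·N·K floating-point operations per GEMM. A minimal CPU sketch with NumPy; the same accounting applies to GPU benchmark harnesses:

```python
import time
import numpy as np

def gemm_gflops(m: int, n: int, k: int, iters: int = 5) -> float:
    """Time C = A @ B and report achieved GFLOPS (2*M*N*K flops per GEMM)."""
    a = np.random.rand(m, k).astype(np.float32)
    b = np.random.rand(k, n).astype(np.float32)
    a @ b  # warm-up, so one-time setup cost is excluded
    t0 = time.perf_counter()
    for _ in range(iters):
        a @ b
    dt = (time.perf_counter() - t0) / iters
    return 2 * m * n * k / dt / 1e9

print(f"{gemm_gflops(1024, 1024, 1024):.1f} GFLOPS")
```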

2024-11-26

Breakthrough in AI-Powered #Audio Generation and Transformation 🎵

🎹 #Fugatto, developed by #NVIDIA researchers, introduces universal sound manipulation through text prompts, handling music, voice & sound effects simultaneously

🎯 Advanced capabilities include accent modification, emotion control, and creation of never-before-heard sounds using #AI technology

🔧 Technical specs: 2.5B parameters, trained on #DGX systems with 32 #H100 GPUs, featuring ComposableART for instruction combination

🎨 Applications span #music production, game development, advertising & language learning - enables real-time audio asset generation & modification

💡 Developed by international team from India, Brazil, China, Jordan & South Korea, enhancing multi-accent & multilingual capabilities

#ai #ML

blogs.nvidia.com/blog/fugatto-

📺 youtu.be/qj1Sp8He6e4

2024-11-19

How To Install One Click, Pre-configured Hugging Face (HUGS) AI Models On DigitalOcean GPU Droplets youtu.be/-jwA9FrDLgc #Websplaining #HuggingFace #Hugs #HF #DigitalOcean #Droplet #GpuDroplets #GPU #OneClickAiModels #AI #AiModels #ML #LLM #LLMs #NVIDIA #TGI #Inference #H100

2024-11-19

How To Create A NVIDIA H100 GPU Cloud Server To Run And Train AI, ML, And LLMs Apps On DigitalOcean youtu.be/aDPUOzk443E #Websplaining #GPU #NVIDIA #DigitalOcean #GpuDroplet #Droplet #AI #ML #LLM #H100 #NvidiaH100 #H100GPU #CloudServer #VPS #Server #GpuServer #Ubuntu #Linux

HPC GuruHPC_Guru
2024-10-30

The converged memory architecture can boost TTFT (time to first token) in multi-turn user interactions by up to 2x on Llama 3 70B, compared to x86-based servers

developer.nvidia.com/blog/nvid
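TTFT is simply the latency from issuing a request until the first streamed token arrives. A minimal measurement sketch against any streaming generator; the dummy generator here is purely illustrative:

```python
import time
from typing import Iterator

def dummy_stream() -> Iterator[str]:
    """Stand-in for a streaming LLM response: slow first token, fast rest."""
    time.sleep(0.05)   # simulated prefill before the first token
    yield "Hello"
    for tok in [",", " world", "!"]:
        time.sleep(0.005)  # simulated decode steps
        yield tok

def time_to_first_token(stream: Iterator[str]) -> float:
    """Seconds from request start until the first token arrives."""
    start = time.perf_counter()
    next(stream)  # generators are lazy: this triggers the prefill work
    return time.perf_counter() - start

ttft = time_to_first_token(dummy_stream())
print(f"TTFT: {ttft * 1000:.1f} ms")  # ~50 ms for the dummy stream
```

Multi-turn TTFT gains like the 2x figure above come largely from how fast the prefill phase (the `sleep(0.05)` stand-in here) can re-ingest conversation history.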
