#ParallelComputing

2025-12-11

GPUs are central to language-model training thanks to parallel processing and fast matrix computation. The article analyzes GPU architecture, contrasts GPUs with CPUs, covers the role of CUDA/Tensor Cores, and discusses VRAM management. GPU performance is measured in FLOPS, which determines training speed. #AI #ML #GPU #MôHìnhNgônNgữ #CôngNghệ #ParallelComputing #DeepLearning #CUDA #VRAM #FLOPS #HiểuGPU #MachineLearningVietNam
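As a rough illustration of how FLOPS bound training speed (all numbers below are illustrative assumptions, not from the article): transformer training compute is often approximated as ≈ 6 × parameters × tokens, and dividing by sustained FLOPS gives a lower-bound wall-clock time.

```python
# Back-of-the-envelope training-time estimate (illustrative numbers only).
# Rule of thumb: total FLOPs ~= 6 * N_params * N_tokens for transformer training.
params = 7e9           # assumed 7B-parameter model
tokens = 1e12          # assumed 1T training tokens
peak_flops = 312e12    # e.g. A100 BF16 dense peak, ~312 TFLOPS
utilization = 0.4      # assumed sustained fraction of peak
n_gpus = 64            # assumed cluster size

total_flops = 6 * params * tokens
seconds = total_flops / (peak_flops * utilization * n_gpus)
days = seconds / 86400   # roughly two months at these assumptions
```

Doubling sustained FLOPS (better kernels, Tensor Core utilization) halves this bound, which is why FLOPS is the headline metric.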

reddit.com/r/LocalLLaMA/commen

2025-12-07

Unlock GPU acceleration with NVIDIA's cuTile, revolutionizing parallel kernel development

NVIDIA's cuTile is a groundbreaking programming model designed to simplify the development of parallel kernels for NVIDIA GPUs, enabling developers to harness the full potential of GPU acceleration. By leveraging cuTile, developers can create high-performance applications that efficiently utilize the massively...

Muhammed Shafin Phejhdiss
2025-11-23

Introducing Qeltrix (.qltx) – a PoC for content-derived, parallel, streaming encryption & obfuscation.

✅ Content-derived keys (full file or first N bytes)
✅ Parallel LZ4 compression + multi-core
✅ Deterministic byte permutation + ChaCha20-Poly1305 AEAD
✅ Memory-efficient, streaming read/write

Open-source & community-driven:
Dev.to: dev.to/hejhdiss/introducing-qe
GitHub: github.com/hejhdiss/qeltrix
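The actual Qeltrix format is defined in the linked repo; as a toy illustration of just one ingredient above, a content-derived key driving a deterministic, invertible byte permutation can be sketched with the Python standard library (this is NOT the Qeltrix implementation, and a real design would layer ChaCha20-Poly1305 AEAD on top):

```python
# Toy sketch: content-derived key + deterministic byte permutation.
# Illustration of the idea only, not the Qeltrix format.
import hashlib
import random

def scramble(data: bytes):
    # derive a key from the content itself (here: BLAKE2b over the full input)
    key = hashlib.blake2b(data, digest_size=32).digest()
    order = list(range(len(data)))
    random.Random(key).shuffle(order)      # deterministic, keyed permutation
    return key, bytes(data[i] for i in order)

def unscramble(key: bytes, scrambled: bytes) -> bytes:
    # re-derive the same permutation from the key and invert it
    order = list(range(len(scrambled)))
    random.Random(key).shuffle(order)
    out = bytearray(len(scrambled))
    for pos, i in enumerate(order):
        out[i] = scrambled[pos]
    return bytes(out)

msg = b"hello qeltrix-style permutation"
key, scrambled = scramble(msg)
```

Note that a bare permutation only obfuscates; authenticity and confidentiality come from the AEAD layer in the real scheme.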

Alessandra Bilardi (bilardi)
2025-11-08

This week I did the two-day event in Bologna 🚀🚀

Two intense days of inspiring, beautiful … and the people made it all truly unforgettable. ✨

I know what goes on behind the scenes of the organization, and the @grusp folks made everything perfect, light, and carefree .. even though it was far from easy 💪

It's always a pleasure to be welcomed at their events ❤️

Until next time!

Alessandra Bilardi starts her talk in front of an audience of ~170 people
Serena reads a question from the audience
A selfie with all the speakers
All the loot .. in particular, the Grusp t-shirt ..
2025-10-30

Efficient GPU algorithm converts Bézier paths into renderable geometry, enabling real-time, cross-platform vector graphics rendering. hackernoon.com/implementing-da #parallelcomputing
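The linked article covers a real GPU algorithm with adaptive subdivision; as a minimal CPU-side illustration of the basic step (turning a cubic Bézier into renderable line segments), here is uniform-parameter flattening via de Casteljau evaluation (function names and the fixed segment count are illustrative choices):

```python
# Minimal cubic Bézier flattening by uniform parameter sampling.
# A CPU-side sketch; a GPU renderer would subdivide adaptively in parallel.
def cubic_bezier_point(p0, p1, p2, p3, t):
    # de Casteljau evaluation: repeated linear interpolation
    lerp = lambda a, b, t: (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    return lerp(d, e, t)

def flatten(p0, p1, p2, p3, segments=16):
    # polyline approximation: segments+1 points along the curve
    return [cubic_bezier_point(p0, p1, p2, p3, i / segments)
            for i in range(segments + 1)]

pts = flatten((0, 0), (0, 1), (1, 1), (1, 0))
```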

2025-10-30

Efficiently convert cubic Bézier curves to Euler spirals for smoother GPU rendering and accurate parallel curve computations. hackernoon.com/how-to-convert- #parallelcomputing

2025-10-29

Today I introduced a much-needed feature to #GPUSPH.

Our code supports multi-GPU and even multi-node, so in general if you have a large simulation you'll want to distribute it over all your GPUs using our internal support for it.

However, in some cases, you need to run a battery of simulations and your problem size isn't large enough to justify the use of more than a couple of GPUs for each simulation.

In this case, rather than running the simulations in your set serially (one after the other) using all GPUs for each, you'll want to run them in parallel, potentially even each on a single GPU.

The idea is to find the next available (set of) GPU(s) and launch a simulation on them while there are still available sets, then wait until a “slot” frees up and start the next one(s) as slots get freed.

Until now, we've been doing this manually by partitioning the set of simulations to do and starting them in different shells.

There is actually a very powerful tool to achieve this on the command line: GNU Parallel. As with all powerful tools, however, it is somewhat cumbersome to configure to get the intended result. And after Doing It Right™ one must remember the invocation magic …

So today I found some time to write a wrapper around GNU Parallel that basically (1) enumerates the available GPUs and (2) appends the appropriate --device command-line option to the invocation of GPUSPH, based on the slot number.
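The slot idea described above (GNU Parallel exposes the job slot as the `{%}` replacement string, which the wrapper maps to a device index) can also be sketched stand-alone in Python; the names `GPUS`, `run_simulation`, and the case labels below are illustrative, not GPUSPH's actual interface:

```python
# Hypothetical sketch of slot-based GPU scheduling: run a batch of
# simulations, each pinned to the next free GPU from a shared pool.
import queue
from concurrent.futures import ThreadPoolExecutor

GPUS = [0, 1, 2, 3]            # device indices, e.g. enumerated from nvidia-smi
free_gpus = queue.Queue()
for g in GPUS:
    free_gpus.put(g)

def run_simulation(case):
    gpu = free_gpus.get()       # block until a GPU "slot" frees up
    try:
        # a real wrapper would launch the solver here, e.g.
        # subprocess.run([solver, f"--device={gpu}", case], check=True)
        return (case, gpu)
    finally:
        free_gpus.put(gpu)      # return the slot to the pool

with ThreadPoolExecutor(max_workers=len(GPUS)) as pool:
    results = list(pool.map(run_simulation, [f"case{i}" for i in range(8)]))
```

With at most `len(GPUS)` workers, no two simulations ever share a device, and new cases start as soon as a slot is returned to the queue.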

#GPGPU #ParallelComputing #DistributedComputing #GNUParallel

2025-10-28

We are excited to return to Supercomputing! Join us on Sunday, November 16th for the OpenMP tutorial, Mastering OpenMP Tasking. This tutorial will provide performance and scalability recipes to improve the performance of OpenMP tasking applications.

Learn more about all of OpenMP's activities at #SC25 at: openmp.org/events/sc25/
#OpenMP #Tasking #parallelcomputing #hpc #multiprocessor

Join us at Supercomputing for the tutorial, Mastering OpenMP Tasking
2025-10-17

Join us at Supercomputing 2025 in St. Louis!

We have a packed agenda at this year's show with BOFs and tutorials, and be sure to join us in booth #911 to meet with OpenMP experts to ask your toughest questions, enter the daily Book Drawing, get your free OpenMP API 6.0 reference guide, and have an afternoon beverage.

Learn more: openmp.org/events/sc25/
#SC25 #OpenMP #parallelcomputing #hpc #gpu #pyomp

Join us at SC25! We will be in booth 911

We are pleased to once again offer exciting online MATLAB courses at the GWDG Academy this year, taught by MathWorks staff:

💠 Parallel Computing with MATLAB
Date: 17.11.2025, 10:00 – 13:00
💠 Demo Session: Scaling up MATLAB to the GWDG Scientific Compute Cluster
Date: 19.11.2025, 15:00 – 16:30
💠 Introduction to Research Software Development with MATLAB
Date: 20.11.2025, 09:00 – 12:00
💠 Connecting MATLAB with Python and other Open Source Tools
Date: 20.11.2025, 14:00 – 17:00

The course dates are complemented by an online office hour on 21.11.2025, 14:00 – 15:00, during which questions on the topics covered in the courses can be discussed in depth, fostering an exchange between participants and instructors.

🔗 s.gwdg.de/NRjJYK

#gwdg #academy #gwdgacademy #kurs #matlab #parallelcomputing #göttingen #unigöttingen #mathworks

2025-08-04

📢 OpenMP Newsletter – July 2025 Edition

Highlights:

🗓️ IWOMP 2025 preliminary program
👥 3 new members join the OpenMP Architecture Review Board
🛠️ OpenMP support in:

* GCC 15.1

* Intel oneAPI HPC Toolkit 2025.2

* NumPy 2.3

Full newsletter: mailchi.mp/e82391a1d7b0/thanks

🔗 openmp.org

#OpenMP #HPC #IWOMP2025 #ParallelComputing #NumPy #GCC #InteloneAPI

Rowan the Selfsame (rosylf@c.im)
2025-06-21

Link: mediatum.ub.tum.de/?id=601795 (It took digging to find this from the Wikipedia article [1] and the unsecured HTTP homepage for "BMDFM".)

```bibtex
@phdthesis{dissertation,
author = {Pochayevets, Oleksandr},
title = {BMDFM: A Hybrid Dataflow Runtime Parallelization Environment for Shared Memory Multiprocessors},
year = {2006},
school = {Technische Universität München},
pages = {170},
language = {en},
abstract = {To complement existing compiler-optimization methods we propose a programming model and a runtime system called BMDFM (Binary Modular DataFlow Machine), a novel hybrid parallel environment for SMP (Shared Memory Symmetric Multiprocessors), that creates a data-dependence graph and exploits parallelism of user application programs at run time. This thesis describes the design and provides a detailed analysis of BMDFM, which uses a dataflow runtime engine instead of a plain fork-join runtime library, thus providing transparent dataflow semantics on the top virtual machine level. Our hybrid approach eliminates disadvantages of the parallelization at compile-time, the directive based paradigm and the dataflow computational model. BMDFM is portable and is already implemented on a set of available SMP platforms. The transparent dataflow paradigm does not require parallelization and synchronization directives. The BMDFM runtime system shields the end-users from these details.},
keywords = {Parallel computing;Shared memory multiprocessors;Dataflow;Automatic Parallelization},
note = {},
url = {mediatum.ub.tum.de/601795},
}
```

[1]: en.wikipedia.org/wiki/Binary_M
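The hybrid dataflow idea the abstract describes — build a data-dependence graph and fire each operation as soon as its inputs are ready, instead of plain fork-join — can be loosely sketched as follows (a toy illustration, not BMDFM itself; tasks must be listed in dependency order here):

```python
# Loose sketch of dataflow-style scheduling: each task runs as soon as
# the results of its data dependencies become available.
from concurrent.futures import ThreadPoolExecutor

def dataflow_run(tasks):
    """tasks: {name: (fn, [dep names])}, listed in dependency order."""
    futures = {}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        for name, (fn, deps) in tasks.items():
            # each submitted job blocks on its inputs' futures, then fires
            futures[name] = pool.submit(
                lambda fn=fn, deps=deps: fn(*[futures[d].result() for d in deps]))
    return {name: f.result() for name, f in futures.items()}

results = dataflow_run({
    "a":    (lambda: 2, []),
    "b":    (lambda: 3, []),
    "sum":  (lambda x, y: x + y, ["a", "b"]),   # sum and prod run in parallel
    "prod": (lambda x, y: x * y, ["a", "b"]),
    "out":  (lambda s, p: s + p, ["sum", "prod"]),
})
```

Note the contrast with fork-join: "sum" and "prod" are never explicitly forked together; they simply both become runnable once "a" and "b" have produced values.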

#SMP #Parallelization #Multithreading #DependenceGraph #RunTime #DataFlow #VirtualMachine #VM #ParallelComputing #SharedMemoryMultiprocessors #AutomaticParallelization #CrossPlatform #Virtualization #Configware #Transputer

2025-06-11

📸 Full house at the OpenMP BOF at #ISC25 — over 140 attendees joined us in Hamburg! 🎉

Our session "What to Expect from OpenMP API Version 6.0" covered:

✅ A dive into key features of OpenMP 6.0
✅ A preview of 6.1 and 7.0
✅ Updates from toolchain developers
✅ Lively Q&A to help shape future OpenMP directions

Thanks to everyone who contributed — your feedback is powering the future of parallel programming! 💡

#OpenMP #HPC #ISC2025 #OpenMP6 #ParallelComputing #Supercomputing

2025-06-10

We’re excited to welcome NextSilicon to the OpenMP Architecture Review Board! 🎉

Their Intelligent Compute Architecture blends adaptive computing with self-optimizing hardware/software and open frameworks like OpenMP. Together, we’re shaping a future of performant, portable, shared-memory parallelism. 💻🌐

Read the press release:
tinyurl.com/yksfbrah

#OpenMP #NextSilicon #HPC #OpenStandards #ParallelComputing

2025-06-06

Join us at #ISC25 for the tutorial “Advanced OpenMP: Performance and 6.0 Features” on Friday, June 13, 9:00–13:00 CEST in Hall Y12, 2nd Floor, Hamburg Congress Center.

Learn how to boost OpenMP code performance on NUMA systems and accelerators, and get hands-on insights into vectorization, data locality, and the latest features in OpenMP 6.0.

Ideal for developers who want to go beyond the basics!

#HPC #OpenMP #ISC2025 #ParallelComputing

2025-05-27

Just published the post "Parallel and distributed computing in GNU Health." :gnu: 🏥
meanmicio.org/2025/05/27/paral
#ParallelComputing #GNUHealth #Tryton #OpenScience #GNU
