
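🌖 Building Burstable Virtual Machines: CPU Slicing with cgroups
➤ Using Linux cgroups v2 for flexible, cost-effective cloud computing
✤ https://www.ubicloud.com/blog/building-burstables-cpu-slicing-with-cgroups
To offer more flexible virtual machine options, Ubicloud has launched "burstable virtual machines". These VMs share CPU resources and can burst to higher CPU usage during demand peaks. Ubicloud implements this with the Linux kernel's Control Groups v2 (cgroups v2) feature: by controlling CPU and memory resources, it manages resource allocation between VMs at a fine grain and ensures resource isolation. The article covers the cgroups architecture, how cgroups are configured (through the virtual file system and through systemd), and the use of the cpuset and cpu controllers, showing how these tools can be used to build an efficient and flexible cloud computing environment.
+ The article explains the concepts and applications of cgroups clearly and is very helpful for anyone who wants to understand the technology underlying cloud computing.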
#雲端運算 #虛擬化 #Linux #cgroups

🎉 Oh, rejoice! Another riveting deep dive into the thrilling world of #CPU #slicing with #cgroups 🤖, because obviously, your life was missing the soaring excitement of "Building Burstables". Meanwhile, #NewEuroGPT Enterprise promises to keep your data private, as if your browsing history is more interesting than the latest cat meme 🐱.
https://www.ubicloud.com/blog/building-burstables-cpu-slicing-with-cgroups #DataPrivacy #TechTrends #Burstables #HackerNews #ngated

Building Burstables: CPU slicing with cgroups

https://www.ubicloud.com/blog/building-burstables-cpu-slicing-with-cgroups

#HackerNews #BuildingBurstables #CPU #Slicing #cgroups #CloudComputing #TechInnovation

What if I told you, #Linux doesn’t just support multiple users — it supports multiple isolated worlds via #namespaces, #cgroups, and #containers.

@ShadowJonathan if it's only about simulating those conditions for a single process (and its sub-processes), cgroups might be worth a look.

If you don't want to fiddle around with the low-level details of them, wrap the execution of your process in "systemd-run --user --wait --pty ..." and use "--slice" to assign the process to the corresponding slice with the desired resource constraints.

https://www.freedesktop.org/software/systemd/man/latest/systemd-run.html

https://www.freedesktop.org/software/systemd/man/latest/systemd.resource-control.html
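
A minimal sketch of the slice approach described in this post. The slice name `limited.slice`, the limits, and `./my-program` are hypothetical; the helper only prints the text of a slice unit, and the commands that would install and use it are shown as comments. CPU limits in a user session also assume that the cpu controller is delegated to the user manager, which varies by distribution.

```shell
#!/bin/sh
# slice_unit_text CPU_QUOTA MEMORY_MAX
# Print the text of a systemd slice unit carrying the given limits.
# CPUQuota= and MemoryMax= are described in systemd.resource-control(5).
slice_unit_text() {
    printf '[Slice]\nCPUQuota=%s\nMemoryMax=%s\n' "$1" "$2"
}

# Installing and using it (hypothetical names, not executed here):
#   mkdir -p ~/.config/systemd/user
#   slice_unit_text 50% 512M > ~/.config/systemd/user/limited.slice
#   systemctl --user daemon-reload
#   systemd-run --user --wait --pty --slice=limited.slice ./my-program
```

Every process started through that `systemd-run` call, including its children, would then share the limits of the slice.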

#systemd #cgroups

[Translation] How to build a Linux container from scratch, without Docker

We have translated for you an article on how to create, from scratch, a Linux container similar to one you could run with Docker, but without using Docker or any other containerization tools.

https://habr.com/ru/companies/flant/articles/880354/

#контейнеризация #контейнеры #containers #cgroups #namespaces #linux #linuxконтейнеры #docker #root #overlayfs

Eliminating the noisy-neighbor effect in PostgreSQL with cgroups

If you have ever run several instances of PostgreSQL or other software on one machine (virtual or physical), you have surely run into the noisy-neighbor effect, where the instances got in each other's way. So how do you reconcile the "neighbors"? We have an effective way.

https://habr.com/ru/companies/postgrespro/articles/878844/

#cgroups_v2 #cgroups #postgresql #linux #администрирование_linuxсистем #нагрузка

My first container without Docker

Containerization technologies have lodged themselves firmly in my head, as they probably have for most of us. You might think: just write a Dockerfile and don't show off. But one always wants to learn something new and dig deeper into topics already mastered. That is why I decided to work out how containers are implemented in operating systems based on the Linux kernel and then create my own "container" from the command line.

https://habr.com/ru/articles/881428/

#docker #контейнер #контейнеризация #linux #namespace #cgroups #cgroup_v2

Today, the typical downwards spiral, from "just start up the devel VM":

$ vagrant up

Followed by: Why doesn't it start? 😕

Why won't the libvirtd service restart? 😳

Finally ending, after a long search, at: Why is /sys/fs/cgroup/ empty??? 😱

Should I try to repopulate it, or just reboot? 🤔

Oh already that late? Reboot then.

#cgroups, #libvirt

@mwl @kta At some point the Linux community needs to accept cgroups is an utter dumpster fire.

Also that article seems to have utterly misunderstood what docker/cgroups/kubernetes got fucking wrong.

> 1. First off, we ditch the shared-kernel approach entirely. We need to build a micro-hypervisor model, where each container runs its own minimal kernel. This ensures that every container is genuinely isolated, similar to a lightweight VM but without the bloat. By employing a microkernel architecture, you’re essentially granting each container its own mini-OS that only loads essential components, drastically reducing the attack surface. This step eliminates the primary flaw of Docker’s shared-kernel model.

You mean proper OS level virtualization. This means accepting cgroups sucks and frankly ditching the not-invented-here attitude that permeates Linux dev stuff.

> 2. Next, leverage hardware-assisted virtualisation like Intel VT-x or AMD-V to handle isolation efficiently. This is where we’ll differentiate ourselves from Docker’s reliance on namespaces. With hardware support, each container will get near-native performance while maintaining strict separation. For example, instead of binding everything to a Linux kernel, containers will interact directly with hardware-level isolation, meaning exploits won’t have the chance to jump from one container to another.

You are now wanting a stripped-down VM. Nothing new is needed to do this. Libvirt+QEMU, Bhyve, VMware, etc. will already do this quite happily.

> 3. We can’t ignore orchestration. Rather than bolting on security later, build an orchestration layer that enforces strict security policies from the get-go. This orchestration tool, think Kubernetes but with security baked in, will enforce seccomp, AppArmor, and SELinux profiles automatically based on container configurations. For instance, before launching a container, the orchestration layer could analyse its dependencies and generate a security profile dynamically, ensuring that each container only has access to the resources it needs.

For fuck sake... please don't. This is the issue with docker/kubernetes... it is an utter dumpster fire thanks to being an overly complex nightmare that is a PITA when it comes to orchestration.

Also this sounds like they have never dealt much with seccomp or AppArmor. That shit is broken as fuck by default.

Better idea. Make it easy to control, like it is with Jails on FreeBSD, and start with actually sane defaults.

> 4. Let’s go beyond the crude root vs non-root distinction Docker offers. Implement a permission system that assigns containers fine-grained capabilities, like capabilities management in modern OSes. You’ll create an RBAC model that defines precisely what a container can or cannot access such as network resources, storage, specific hardware, etc. Imagine having a declarative YAML file that specifies, down to the syscalls, which capabilities each container is granted, ensuring it only gets what it genuinely needs to function.

Again you are wanting OS level virtualization.

You are again trying to re-invent the wheel instead of looking at working examples of something like this.

> 5. Containers shouldn’t be changing their state once they’re up and running. We must enforce immutable infrastructure, meaning containers are rebuilt from scratch for every update rather than being patched live. This prevents attackers from persisting inside a compromised container. Think of this as Docker’s “build once, deploy everywhere” mantra. It never truly worked for Java (also a technology that Linus absolutely hates), but it might just work for a containeriser. Changes require redeployment, not modification, thus ensuring that every running instance is identical to the tested version.

Drek like this should be an end user choice. There are lots of reasons for both. Again the base tooling for this should give zero fucks.

Want to make something like this awesome? Start with a sane base design such as Jails, which provides a nice system to do whatever you want with and allows you to keep an existing one with live patching or rebuild it... both just as easily.

> 6. Build in real-time vulnerability scanning and automated patching. Containers should be scanned continuously, not just at build time. If a vulnerability is found, the system will either patch it in the background or alert you to rebuild the affected containers. This means integrating tools like Clair or Trivy directly into the platform, ensuring that no container runs outdated or vulnerable code.

This is something totally unrelated to what is being talked about here. The scope of complexity here means it by no means should be part of the same tooling.

#FreeBSD #jails #linux #cgroups #Docker #Kubernetes

I'm using #ansible to automate #helm which automates #kubectl which automates #crio which automates #cgroups and #namespaces. #kubernetes is #turtlesallthewaydown

@boudah @robpumphrey

That says that it moved to https://github.com/opsengine/cpulimit 12 years ago.

That said, if portability is not a concern, adjusting the max bandwidth of the cpu controllers of a dedicated control group for Chrome processes is the better approach.
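
As a rough sketch of that approach on cgroup v2: a process is moved into a control group by writing its PID to the group's `cgroup.procs` file, one PID per write. The group path `/sys/fs/cgroup/chrome` and the process name `chrome` are assumptions for illustration, and the group is assumed to exist already with its `cpu.max` set.

```shell
#!/bin/sh
# move_pids CGROUP_DIR PID...
# Write each PID to CGROUP_DIR/cgroup.procs, one write per PID.
# The body runs in a subshell so its variables stay local.
move_pids() (
    procs=$1/cgroup.procs
    shift
    for pid in "$@"; do
        echo "$pid" > "$procs" || echo "could not move $pid" >&2
    done
)

# Usage as root (hypothetical group and process name, not executed here):
#   move_pids /sys/fs/cgroup/chrome $(pgrep -x chrome)
```

Processes forked after the move inherit the group, so only the processes that already exist need to be moved.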

#cpulimit #linux #cgroups

If you need to limit some program's appetite for #CPU or memory, then on #linux #cgroups are a great fit. On #systemd-based systems, though, there is no point in:

creating groups by hand in /sys/fs/cgroup/
using #libcgroup (cgcreate/cgset/cgexec).

In some distributions it is outright deprecated, namely RHEL and SLES.

Quick and easy
The whole operation fits into a single launch via:
systemd-run --scope -p CPUQuota=15% /usr/bin/binaryname
Or, slightly more advanced:
systemd-run -u username -p CPUQuota=50% -p MemoryMax=100M /usr/bin/binaryname
——————————————————————————————————

And by hand?
That looks like this:

# mkdir /sys/fs/cgroup/lalala
# echo "50000 100000" > /sys/fs/cgroup/lalala/cpu.max
# echo "100M" > /sys/fs/cgroup/lalala/memory.max
# echo $$ >> /sys/fs/cgroup/lalala/cgroup.procs
# /usr/bin/binaryname
# echo $$ >> /sys/fs/cgroup/cgroup.procs
# rmdir /sys/fs/cgroup/lalala

Also keep in mind that on a #CPU with two cores the full budget may be "200000 100000", so echo "50000 100000" yields 25% of the machine rather than the 50% you would get on a single-core computer.
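
The arithmetic can be sketched as a small helper. `cpu_max_line` is a hypothetical name, and the default period of 100000 microseconds matches the examples in this post. The quota is the desired share of the whole machine, times the number of CPUs, times the period.

```shell
#!/bin/sh
# cpu_max_line PERCENT_OF_MACHINE NCPUS [PERIOD_USEC]
# Print a "quota period" pair suitable for a cgroup v2 cpu.max file.
# The body runs in a subshell so its variables stay local.
cpu_max_line() (
    pct=$1
    ncpus=$2
    period=${3:-100000}
    echo "$(( pct * ncpus * period / 100 )) $period"
)

# cpu_max_line 50 1   prints: 50000 100000   (half of a one-core machine)
# cpu_max_line 50 2   prints: 100000 100000  (half of a two-core machine)
# cpu_max_line 25 2   prints: 50000 100000   (a quarter of a two-core machine)
```

With the real core count, this could be called as `cpu_max_line 50 "$(nproc)"`.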
——————————————————————————————————

And with #libcgroup?
That looks like this:

# cgexec -g cpu,memory:tatata /usr/bin/binaryname
# cgdelete -g cpu,memory:/tatata

Or the long, tedious way, almost in manual mode:

# cgcreate -g cpu,memory:/tatata
# cgset -r cpu.max="50000 100000" tatata
# cgset -r memory.max="100M" tatata

Next there are two ways to proceed, same as in the manual approach:

move the current shell into tatata and run binaryname from it:
```
# echo $$ >> /sys/fs/cgroup/tatata/cgroup.procs
# /usr/bin/binaryname
```
start the binaryname process and then move it into tatata by PID:
```
# BIN_PID=$(pgrep -xo binaryname)
# cgclassify -g cpu,memory:tatata ${BIN_PID}
```

Either way, don't forget to delete the group:
# cgdelete -g cpu,memory:/tatata
——————————————————————————————————

Against this background, using systemd-run not only respects the principle of having a single manager of the #cgroupfs file system on the machine, but is also noticeably simpler for the user.

By the way, games are easier to limit through a "frame limit", that is, a cap on frames per second. For example, set 60fps, 90fps, or 120fps, depending on the monitor's refresh rate. They then stop burning CPU computing several thousand fps.

#lang_ru

The performance impact of MD checking, and ways to reduce its effect on system operation

MD (Multiple Device) is a Linux technology that combines several physical disks into one logical drive using various RAID (Redundant Array of Independent Disks) schemes. mdXXX (hereafter "md disk") is one of the devices created with this technology. To determine how checking the state of an md disk array affects system performance, several aspects need to be considered.

https://habr.com/ru/articles/831718/

#mdraid #ionice #cgroups #checking

I'm just going to say it, and we can agree to disagree if you do in fact disagree...

systemd has categorically made Linux better in basically every way imaginable

It's earnestly cool if you don't agree but it's really really good

🤷

#systemd #linux #cgroups #init

Running hot? cgroups!

https://chrichri.ween.de/articles/a2f68e1/running-hot-cgroups

I have 2 #asciicast videos demonstrating the new pid and memory #sandboxing of #SydBox: https://asciinema.org/a/625170 and https://asciinema.org/a/625243 This is a simple alternative to #cgroups when you don't have the rights or support to use them. This is fully implemented with #seccomp and requires no escalated rights! #exherbo #debian #freesoftware #opensource #sandbox #security #hacking

New cool example for ebpf_exporter: CFS delay histogram. In addition to knowing the overall CFS throttling delay from cgroups in cpu.stat, you can now have a histogram of individual throttling durations in Prometheus.

* https://github.com/cloudflare/ebpf_exporter/pull/311

As a bonus, you get a bpftrace command to observe these.
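
For reference, the aggregate counters that the histogram complements can be read directly. This is a minimal sketch that assumes a cgroup v2 `cpu.stat` file with the `nr_periods`, `nr_throttled`, and `throttled_usec` fields (present when the cpu controller is enabled for the group). `throttle_summary` is a hypothetical helper name, and the path in the usage comment is only an example.

```shell
#!/bin/sh
# throttle_summary FILE
# Summarize CFS throttling from a cgroup v2 cpu.stat file.
throttle_summary() {
    awk '
        $1 == "nr_periods"     { periods = $2 }
        $1 == "nr_throttled"   { throttled = $2 }
        $1 == "throttled_usec" { usec = $2 }
        END {
            # Share of enforcement periods in which the group was throttled.
            pct = (periods > 0) ? 100 * throttled / periods : 0
            printf "throttled %d of %d periods (%.1f%%), %d usec total\n",
                   throttled, periods, pct, usec
        }
    ' "$1"
}

# Usage (example path, not executed here):
#   throttle_summary /sys/fs/cgroup/system.slice/cpu.stat
```

These totals only say how much throttling happened overall, not how long each individual stall lasted.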

#ebpf #ebpf_exporter #cgroups #cfs #bpftrace #prometheus

The main problem I was facing wasn't actually related to #riscv, although I experimented several times to get the right U-Boot, U-Boot environment, kernel and dtb combo. It took me a while to figure out how to get #libvirt #LXC working. I was suspecting something missing in the kernel config, but in the end it was #cgroups. I had to disable the unified hierarchy to get libvirt-lxc working. Weird, but it did the trick. I also spent quite some time debugging why my minions couldn't connect to my #SaltStack master, only to realize after a few weeks that I had changed the #IPv6 subnet on the LAN. Since my #DNS records are updated via Salt and the minions couldn't connect, the records were still pointing to the old IP. It was easily fixed once I got my minions connected again. A little downside of IPv6 is that you easily miss that one of those many numbers changed.

Note to future self's sanity: #Podman's #quadlet support depends on #cgroups v2.

Using v1 leads to errors like this:

Error: mkdir /sys/fs/cgroup/pids/user.slice/user-1000.slice/user@1000.service/runtime: permission denied
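
A quick way to tell which hierarchy a machine uses is the filesystem type mounted at /sys/fs/cgroup. The sketch below only interprets the string that `stat` reports; the helper name `cgroup_mode` is hypothetical.

```shell
#!/bin/sh
# cgroup_mode FSTYPE
# Interpret the filesystem type reported for /sys/fs/cgroup.
cgroup_mode() {
    case $1 in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 (legacy or hybrid)" ;;
        *)         echo "unknown" ;;
    esac
}

# On a live system with GNU stat (not executed here):
#   cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"
```

A path such as /sys/fs/cgroup/pids/... in an error message, as above, is itself a hint that the v1 layout is in use.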

#cgroups
