#reliabilityengineering

craque sprung 🏳️‍🌈dtauvdiodr@c.im
2025-05-06

Here's a new blog post from me! It's a review of a small "book" that is actually a workbook with some essays at the beginning.

The short book is Maj. John Schmitt's exercise book on Tactical Decision Games (TDGs) for the Marines (and likely other branches), and I noticed how much the philosophy behind them is shared with the sorts of Practice of Practice games we play to understand the system and prepare ourselves for incidents.

sounding.com/2025/05/02/schmit

#SRE #TDG #PracticeOfPractice #TabletopExercises #OperationalReadiness #IncidentResponse #TacticalDecisionGames #Resilience #ResilienceEngineering #ReliabilityEngineering

2025-03-26

My second report from #SREcon: Some of the same lessons -- and unsolved problems -- from supporting #machinelearning apps in production carry over to #generativeAI apps, but not all. Attendees discussed the similarities and important differences. #reliabilityengineering #ML #GenAI #LLM #AI #MLOps #LLMOps techtarget.com/searchitoperati

PPC Landppcland
2024-12-24

Google SREs reveal how search handled record World Cup traffic spike: Google's Search Reliability team discusses challenges, strategies, and successes in maintaining service during peak events. ppc.land/google-sres-reveal-ho

craque sprung 🏳️‍🌈dtauvdiodr@c.im
2024-11-04

Perhaps you need a distraction. Why not read about the importance of making Documentation part of your workflow?

sounding.com/2024/11/01/to-doc

#SRE #ReliabilityEngineering #DevOps #Observability #ProductionReadiness

craque sprung 🏳️‍🌈dtauvdiodr@c.im
2024-10-03

Hey y'all! Any SREs out there?

Any *aspirational* SREs out there?

Maybe there's a team at work who they call "SRE" and you're really not sure what they do if it's not infra/platform/deploy?

A coworker is interested in switching careers to SRE and asked me for reading recommendations. So I put together the most solid top-five for me, from my experience and perspective of doing Ops for 30 years and SRE for 12:

sounding.com/2024/10/03/five-r

#SRE #ContinuousLearning #ResilienceEngineering #ReliabilityEngineering #DevOps #TechOps #HumanError #ComplexSystems #SystemsThinking

Chris GeoghooliganVTDARKSIM@toot.community
2024-07-15

Anyone looking for a (manufacturing-related) #Reliability #Engineer?

• BS #MechanicalEngineering @ #VirginiaTech
• MS #MarineEngineering @ US #MerchantMarineAcademy #USMMA
• Starting an MS #DataAnalytics @ #GeorgiaTech

• 8y in Marine Engineering
• 10y in #ReliabilityEngineering
• Looking to leverage #RE experience on new data analytics path
• Love #PowerBI

I’m in Northern VA, USA #NoVa w/ few local options; #remotework ✅, some travel ✅.

(Note: My RE is NOT site-RE or DevOps)
#GetFediHired

2024-05-19

news.ycombinator.com/item?id=4

Clear-eyed explanation of disastrous fragility in current technical systems, yet it’s also very funny. Plus discussion. Well worth your time, and good inspiration for systems builders/sysadmins.

#sysadmin #antifragile #reliabilityengineering #softwaredevelopment

Thomas Strömberg ∴ KD4UHPthomrstrom@triangletoot.party
2024-05-17

👋 My last #introduction was in 2022, so here's an update:

- Security Squad Lead at #Chainguard
- Keenly interested in #InfoSec and #ReliabilityEngineering
- 30 years of experience messing with the Internet & UNIX
- I build bamboo bicycle frames & spend more time tinkering than riding
- Spend my idle time playing #guitar and wandering on 2-wheel EVs
- Live in #Carrboro NC with my wife & kids
- Contributed to 150+ #OpenSource projects including 50+ I've created - #malcontent is my latest.

2024-02-26

If you’re making an app that doesn’t even need cloud connectivity in the first place, but forces you to log in and then your remote auth server is down and you can’t use the app, I have three pieces of advice for you:

#cloud #cloudcomputing #bluetooth #iot #reliability #reliabilityengineering #softwaredesign #software

Cameron's pseudocode guide to comprehensive error handling

int main()
{
    while(true)
    {
        try
        {
            Application.Run()
        }
        catch (Exception)
        {
            MessageBox("Whatever you just did, don't")
        }
    }
}

#Programming
#SoftwareDevelopment
#ReliabilityEngineering
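
For anyone who wants to actually run the joke, here is a rough Python rendering of the same catch-everything loop. It is only a sketch: `run_application` is a hypothetical stand-in for `Application.Run()`, and the `max_restarts` cap is an addition (not in the pseudocode above) so the example terminates.

```python
import logging


def run_application():
    """Hypothetical stand-in for Application.Run(); it always fails here."""
    raise RuntimeError("simulated crash")


def main(max_restarts=3):
    """Rerun the app after any exception, up to max_restarts failures.

    Returns the number of failures seen. The cap is an assumption added
    for this sketch; the original loop retries forever.
    """
    failures = 0
    while failures < max_restarts:
        try:
            run_application()
            return failures  # clean exit, stop looping
        except Exception:
            failures += 1
            # Log the traceback instead of popping up a message box.
            logging.exception("Whatever you just did, don't")
    return failures


if __name__ == "__main__":
    main()
```

Swallowing every exception like this hides the cause of the failure, which is presumably the point of the joke; the log line at least keeps the traceback.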

2023-02-02

The bathtub curve is such a great heuristic for predicting device failures, and this latest data set on hard drive failures from Backblaze bears that out.
#reliability #reliabilityengineering
arstechnica.com/gadgets/2023/0
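
To make the shape concrete, here is a minimal sketch of a bathtub-shaped hazard rate, built as the sum of a decreasing Weibull hazard (early failures), a constant rate (random failures), and an increasing Weibull hazard (wear-out). Every parameter value is an illustrative assumption, not fitted to the Backblaze data or to any other data set.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three failure modes; all parameter values are made up for illustration."""
    infant = weibull_hazard(t, shape=0.5, scale=2.0)    # shape < 1: rate falls with age
    random_rate = 0.01                                   # constant background rate
    wear_out = weibull_hazard(t, shape=5.0, scale=6.0)  # shape > 1: rate rises with age
    return infant + random_rate + wear_out


if __name__ == "__main__":
    for years in (0.1, 1, 3, 5, 7):
        print(f"{years:>4} years: hazard rate {bathtub_hazard(years):.3f} per year")
```

With these made-up parameters the rate is high for very young devices, lowest in mid-life, and climbs again as wear-out takes over, which is the bathtub shape the heuristic describes.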

2023-01-30

As a connoisseur of technology-gone-wrong stories, this is why the words "it's just a configuration change" are a big red flag for me.

It often means that the side effects of said change have not been fully thought through.

#changemanagement #reliability #reliabilityengineering
zdnet.com/home-and-office/work

Adriana Villela 🇧🇷🇨🇦adrianamvillela@hachyderm.io
2022-12-16

Happy Friday!! In case you missed yesterday’s webinar on What 2022 Taught Us About SRE’s Future, you can catch the recording here: ⬇️⬇️⬇️

youtu.be/o0Pgo2fWIUc

The panel includes @anamedina, @austinlparker, KC Tessarek, Chad Beaudin, Mitch Ashley, and me.

#SiteReliabilityEngineering #observability #reliability #ReliabilityEngineering
