SREcon24 Americas

20 Years of SRE: Highs and Lows
Scam or Savings? A Cloud vs. On-Prem Economic Slapfight
Is It Already Time To Version Observability? (Signs Point To Yes.)
Capacity Constraints Unveiled: Navigating Cloud Scaling Realities
Sharding: Growing Systems from Node-scale to Planet-scale
Product Reliability for Google Maps
Build vs. Buy in the Midst of Armageddon
Compliance & Regulatory Standards Are NOT Incompatible with Modern Development...
The Ticking Time Bomb of Observability Expectations
Synthesizing Sanity with, and in Spite of, Synthetic Monitoring
Migrating a Large Scale Search Dataset in Production in a Highly Available...
OIDC and CICD: Why Your CI Pipeline Is Your Greatest Security Threat
When Your Open Source Turns to the Dark Side
The Sins of High Cardinality
Optimizing Resilience and Availability by Migrating from JupyterHub to the...
99.99% of Your Traces Are (Probably) Trash
Meeting the Challenge of Burnout
What We Want Is 90% the Same: Using Your Relationship with Security for Fun...
Thawing the Great Code Slush
Resilience in Action
Navigating the Kubernetes Odyssey: Lessons from Early Adoption and Sustained...
"Logs Told Us It Was Kernel – It Wasn't"
What Is Incident Severity, but a Lie Agreed Upon?
Hard Choices, Tight Timelines: A Closer Look at Skip-level Tradeoff Decisions...
Triage with Mental Models
Defence at the Boundary of Acceptable Performance
Lightning Talks
System Performance and Queuing Theory - Concepts and Application
It Is OK to Be Metastable
The Art of SRE: Building People Networks to Amplify Impact
Teaching SRE
Cross-System Interaction Failures: Don't Fail through the Cracks
Gray Failure: The Achilles’ Heel of Cloud-Scale Systems
The Invisible Door: Reliability Gaps in the Front End
Automating Disaster Recovery: The Ultimate Reliability Challenge
From Chaos to Clarity: Deciphering Cache Inconsistencies in a Distributed...
Patching Your Way to Compliance with a Small Team and a Pile of Technical Debt
Strengthening Apache Pinot's Query Processing Engine with Adaptive Server...
Taming the Linux Distribution Sprawl: A Journey to Standardization and...
Frontend Design in SRE
Measuring Reliability Culture to Optimize Tradeoffs: Perspectives from an...
Storytelling as an Incident Management Skill
Real Talk: What We Think We Know — That Just Ain’t So
What Can You See from Here?

SREcon24 Europe/Middle East/Africa

Dude, You Forgot the Feedback: How Your Open Loop Control...
You Depend on Time, This Is How It Works and You Won’t...
SRE Saga: The Song of Heroes and Villains
The Frontiers of Reliability Engineering
I Can OIDC You Clearly Now: How We Made Static Credentials a...
OMG WTF SSO: A Beginner’s Guide to Single Sign-On...
Sailing the Database Seas: Applying SRE Principles at Scale
Survivor: MySQL Island – Outwit, Outplay, Outlast Metadata...
Fixing Your Noisy Pager in 500 Easy Steps
Achieving Excellence: SLO Thresholds That Transform Service...
Selective Reliability Engineering: There Is No Single Source...
Why You’re (Probably) Doing Service Catalogs Wrong
Exploring the Unintended Consequences of Automation in Software
Rock around the Clock (Synchronization): Improve Performance...
Mnemonic Rules for Eponymous Laws or: There’s a Law for That!
SRE Stakeholders: A Spotter’s Guide
Panel Discussion: Is Reliability a Luxury Good?
Enhancing Elasticsearch Performance: Innovative Reindexing...
Lessons from Unix History
Treat Your Code as a Crime Scene
Finding the Capacity to Grieve Once More
Incident Groundhog Day
Anomaly Detection in Time Series from Scratch Using...
Generative AI: Beyond (Just) Hype
From PIDs to Pods: The Life Cycle of an eBPF-Autoinstrumented...
Scheduling at Scale: eBPF Schedulers with Sched_ext
When Your SaaS Provider Goes out of Business – Lessons from...
Configuration Languages Are the Bane of Our Existence
Just Buy the Printer: Resilience in Action
Noisy Neighbors, through Networking
Taming Noisy Benchmark Results Using Change Point Detection
Enabling Product Scalability through Load Testing
NVMe/TCP Makes iSCSI Look like Fortran
The Silent Performance Killers: BIOS and Firmware Updates
How a Single API Endpoint Saved Us 3000 CPU
Managing the Risk of Software Supply Chain Attacks
When SRE and Security Teams Meet to Face a Crisis
How to Host a (Very) Popular Website for 30 Altairian...
How Snowflake Migrated All Alerts and Dashboards to a...
What If We Ask Linux to Do Cryptography for Us?
Synthetic Monitoring and E2E Testing: 2 Sides of the Same Coin
Lightning Talks
Monitoring Systems as a Service – Walking the Line between...
An Exploration in Storing Telemetry in Cloud Object Storage
Opening the Box: Diagnosing Operating-System Task-Scheduler...
Embrace Fleet Reboots and Make Them Boring
A Brief History of Release Engineering
Red Tide Revert
Riot Games: Evolution of Observability at the Gaming Company
A Powerful Logs Management Solution We All Have and Use but...
Blast Radius Reduction for Large-Scale Distributed Systems
AppStack: An Open Source Cloud Native Platform for Running...
Science Reliability Engineering for High Performance Computing
Get Your Non-SREs Oncall Ready!
Transforming Production Readiness
Energy Consumption of Datacenters
Are We Really Engineers?

and1truong/2024 - SRE conferences.md
