
November 4, 2024


The Unofficial SRE Track for KubeCon NA '24

KubeCon doesn’t have an SRE track, so we’ve gone through the 300+ talks so you don’t have to. We picked the ones we find most inspiring for reliability folks.

Written by JJ Tang

KubeCon North America is just around the corner, featuring over three hundred talks across four days. Deciding which sessions to attend in Salt Lake City requires careful planning to make the most of your time. To help with that, we’ve curated a list of talks particularly relevant for SREs. From case studies of reliability at scale to the relationship between AI and SRE, we hope you find interesting talks to add to your KubeCon schedule.

Rootly will also have a big presence at KubeCon, with a talk during Platform Engineering Day, a booth in the Solutions Showroom, a co-hosted lunch with Spotify on Wednesday, and a happy hour on Thursday. You’ll find all the details at the end of this article.

Case Studies on Scaling Reliability

Unfortunately, both case studies on scaling a company’s reliability are scheduled at the same time, so you’ll need to choose which scale is more interesting to you. Will you explore how a titan like Global Payments ensures system reliability, or learn how a rapidly scaling fintech like Cash App develops new strategies to prevent outages?

Global Payments: Setting New Standards for Reliability in Cloud-Native Multi-Region Applications

How do you ensure over 32 billion card transactions go through securely every time? Trey Caliva, Principal Cloud Architect at Global Payments, will introduce us to the architecture behind their multi-region setup on GCP with Kubernetes and CockroachDB.

When: Wednesday, November 13, 2024, 3:25 pm - 4:00 pm MST

Where: Salt Palace | Level 1 | 155 B

Add Global Payments' talk to your schedule

Cash App's Journey into a Multi-Cluster Ecosystem

Rachel Sheikh, Software Engineer at Cash App, will showcase how the company scaled up its reliability strategy. The Cash App team introduced a new paradigm for their Kubernetes clusters that allows services to transition in and out while providing guardrails against common outages.

When: Wednesday, November 13, 2024, 3:25 pm - 4:00 pm MST

Where: Salt Palace | Level 1 | Grand Ballroom H

Add Cash App’s talk to your schedule

The Bleeding Edge

The evolution of AI is reshaping how SRE teams operate, potentially unlocking new possibilities in tracing. However, AI is not a panacea; it is yet another system that SREs must manage.

Cognitive and Self-Adaptive System for Effective Distributed Tracing in Applications

Traditional tracing solutions often prioritize common traces, making rare traces invisible. Yet, these rare traces are precisely what can be crucial for diagnosing API failures. Mitul Tandon and Akash Gusain will discuss an AI-based tracing solution that treats all traces equally, leading to improved MTTR through more effective diagnoses.

When: Thursday, November 14, 2024, 11:55 am - 12:30 pm MST

Where: Salt Palace | Level 1 | Grand Ballroom B

Add this AI-based tracing talk to your schedule

Optimizing LLM Efficiency One Trace at a Time on Kubernetes

LLM deployments are vast and complex, presenting new challenges for SREs. How do you identify which part of the system is draining resources or causing performance issues? Seema Saharan, SRE at Autodesk, and Aditya Soni, DevOps Engineer at Forrester, will dive into what it means to improve the efficiency of an LLM deployed with Kubernetes.

When: Tuesday, November 12, 2024, 12:55 pm - 1:20 pm MST

Where: Salt Palace | Level 2 | 255 B

Add this talk on making LLMs more reliable to your schedule

Actionable Insights for SREs

Many KubeCon talks inspire you to challenge assumptions and spark creative ideas. But once you’re back at work, it’s also valuable to have actionable knowledge you can apply immediately.

Tutorial: OpenTelemetry Hands-on - Automatic and Manual Instrumentation for Java and Python Apps

OpenTelemetry, also known as OTel, has become the standard observability framework in recent years. In this tutorial, you’ll learn how to instrument Python and Java applications with OpenTelemetry.

When: Friday, November 15, 2024, 11:00 am - 12:30 pm MST

Where: Salt Palace | Level 1 | Grand Ballroom G

Add OTel tutorial to your schedule

Kubernetes Upgrades: Less Pain, More Gain (and Maybe a Little Swearing)

Upgrading Kubernetes is a recurring pain point for DevOps and SREs. In this talk, Jago Macleod, Engineering Director at Google, will discuss strategies to simplify the process and achieve more reliable rollouts.

When: Friday, November 15, 2024, 11:55 am - 12:30 pm MST

Where: Salt Palace | Level 1 | Grand Ballroom H

Add this talk on Kubernetes upgrades to your schedule

Rootly in Salt Lake City

Rootly’s KubeCon Booth

Meet with our reliability experts in the Expo Showroom. Our booth is located in the Startups Pavilion—look for Booth Q47 on the venue map.

Rootly at the Platform Engineering Day

Our very own Jorge Lainfiesta will be giving a talk with Abby Bangser, CNCF Platforms WG Chair, on how to make platforms and portals easier to maintain, scale, and use in the long run. Add the talk to your schedule.

Rootly x Spotify Lunch

We’re partnering with Spotify to enhance your lunch experience at KubeCon. Join us for an exclusive engineering leadership lunch on Wednesday, November 13. RSVP now—spots are limited.

Rootly’s Happy Hour

Alongside Snowflake, Panther, and Infisical, we’re hosting a KubeCon Engineering Leaders Happy Hour on Thursday, November 14. RSVP now. Spots are limited.