November 19, 2024
5 mins
A curated list of sessions to make the most out of re:Invent as an SRE
Las Vegas is about to be taken over by AWS’s ecosystem from December 2nd through the 6th. Featuring over 2,500 sessions, re:Invent is one of the most ambitious events in the industry. To help attendees navigate the vast schedule, the organizers provide attendee guides designed to highlight talks relevant to different interests.
The official attendee guides categorize sessions by industry vertical (e.g., Automotive, Finance, Telecom), by topic (e.g., Data, DevOps, GenAI), or by attendee role (e.g., Cloud Admins, Product Managers, Executives). While there are 30+ guides available, none specifically cater to SREs.
Our team of reliability experts reviewed this year’s re:Invent session catalog to find the most relevant talks for SREs working with AWS. We’ve organized our guide in the same format AWS uses for its guides to make it easier to follow. Remember to bookmark your favorite talks and reserve a spot to make sure you get in.
At re:Invent, breakout sessions are the name given to more traditional presentations where industry leaders and AWS experts deliver insights on a topic. These sessions are typically recorded and made available post-event.
Learn how to tackle the increasing complexity of CI/CD pipelines in this practical session led by Gunnar Grosch, Principal Developer Advocate at AWS, Johannes Koch, Sr Engineer Technical Architecture at FICO, and Thorsten Hoeger, Cloud Automation Evangelist at Taimos GmbH.
Dive into best practices for ensuring consistent deployments, automated auditing, and robust configuration management across environments. Whether you’re scaling up delivery or addressing reliability challenges, this session equips you with actionable strategies for enhancing efficiency and reducing errors.
Wednesday, Dec 4 | 12:00 PM - 1:00 PM PST | MGM Grand | Level 1 | Grand 116
Understand how to optimize observability for emerging generative AI architectures like LLMs and RAG with guidance from Denis Batalov, AI/ML Tech Leader, and Greg Eppel, CloudOps Tech Leader, both from AWS.
This session unpacks practical tools and frameworks, including Amazon CloudWatch and LangChain, to help you gain deep visibility into your AI workloads. Learn how to design observability practices that ensure reliability, performance, and transparency at every stage of your generative AI lifecycle.
Monday, Dec 2 | 1:00 PM - 2:00 PM PST | Mandalay Bay | Level 3 South | South Seas F
Builders’ sessions are interactive and hands-on, which means you need to bring your laptop to follow along. During a Builders’ Session, AWS experts lead small groups through tools and strategies to solve specific challenges.
Discover how generative AI can transform log analysis for web applications. In this hands-on session, Jibril Touzi and Jonathan Woods, Solutions Architects at AWS, will show you how to leverage Amazon Bedrock to extract actionable insights from Amazon CloudFront logs. This session is tailored for SREs seeking advanced techniques to proactively address anomalies and streamline their operations.
Thursday, Dec 5 | 11:00 AM - 12:00 PM PST | MGM Grand | Level 3 | 350
Chalk talks are highly interactive discussions that start with a brief lecture, followed by an open-format Q&A, encouraging audience participation and problem-solving on real-world architecture challenges.
Explore how to design and implement applications that deliver unmatched reliability using the AWS resilience lifecycle framework. This session, led by Diego Dalmolin, Solutions Architect at AWS, dives into actionable techniques for minimizing downtime and recovering from failures swiftly, ensuring customer-facing applications meet the highest availability standards.
Monday, Dec 2 | 9:00 AM - 10:00 AM PST | Caesars Forum | Level 1 | Summit 221
Elevate your observability practices with techniques for integrating metrics, traces, and logs across AWS environments. This session, led by Felix Mezo Gomez and Jon Steele from AWS, offers practical guidance on optimizing performance monitoring, reducing overhead, and ensuring cost-efficiency, making it an essential talk for gaining deeper visibility into your applications' performance and health.
Monday, Dec 2 | 4:30 PM - 5:30 PM PST | Mandalay Bay | Lower Level North | South Pacific D
Lightning talks at re:Invent are 20 minutes long, which lets them deliver concise insights while covering a lot of ground.
Gain a comprehensive understanding of how to monitor and troubleshoot Amazon EKS workloads effectively. In this session, Steven David, Principal Software Architect at AWS, introduces practical observability strategies leveraging AWS tools like Amazon CloudWatch and Logs Insights to ensure the health and performance of containerized applications.
Thursday, Dec 5 | 11:30 AM - 11:50 AM PST | Venetian | Hall B | Expo | Theater 3
Amazon CloudWatch Application Signals can simplify the process of collecting and analyzing critical application signals from Amazon EKS, Amazon ECS, and Amazon EC2 environments. This session, led by Siva Guruvareddiar, Senior Specialist Architect at Amazon, demonstrates how to use those signals to enhance reliability and performance, providing SREs with the insights needed to optimize operations in real time.
Monday, Dec 2 | 6:30 PM - 6:50 PM PST | Venetian | Hall B | Expo | Theater 1
Rootly is an AWS partner trusted by LinkedIn, Dropbox, NVIDIA, Cisco, and hundreds of other customers. We offer on-call and incident response solutions that you can buy through the AWS Marketplace.