Incident management best practices, guides, and product updates from Rootly
What SREs can learn from the CircleCI security incident of January 2023.
Tips for deciding how many SREs your company should hire.
Millions of Canadians offline. For SREs, the Rogers outage is a lesson in the importance of testing updates, building redundant infrastructure and having a crisis communications plan.
SREs face multiple challenges as their platform becomes available in more locations around the globe. One step toward overcoming them is building a solid monitoring system.
Preventing every incident is not only unrealistic; in some respects, it's actually undesirable.
Best practices for “SRE pioneers” – meaning engineers who are the very first SREs hired at an organization.
A look at the Atlassian outage of April 2022, and what it stands to teach Site Reliability Engineers. A lot to unpack here.
Our co-founder JJ reflects on building the fastest-growing incident management platform and the surprising learnings.
A comparison of the two main SRE team models: embedded SREs vs. standalone SRE teams.
An overview of the similarities and differences between Site Reliability Engineering and Platform Engineering, including from a career perspective.