Tammy Butow Chats SRE, Chaos Engineering and How to Train On-Call Teams

Released Wednesday, 8th April 2020

Maintaining availability and velocity in a highly-regulated environment

Tammy talks about what it takes to maintain available, secure services in a highly-regulated environment. See how teams think about their delivery pipelines and services when applications and infrastructure need to adhere to strict Australian governmental regulations.

Tammy's path to SRE and the organizational value of SRE

Through personal experience, Tammy discovered the value of SRE in a very real way. She talks about why those experiences piqued her interest in site reliability engineering and how she made the move from full-stack engineer to SRE. She then elaborates on that journey and explains how managers and engineers can get organizational buy-in for SRE and show its value over time.

How to train on-call teams for incident response

Incident management, real-time response and on-call efficiency are important to Tammy, and should be for SREs everywhere. Tammy covers actionable tips for training on-call teams and giving on-call responders the tools and resources they need to make on-call suck less. Tammy also dives into her expertise in skateboarding and how some of the things she learned while skateboarding have made her and her teams better.

Chaos engineering and real applications of it

Tammy discusses chaos engineering: the intentional injection of failure into your systems so you can learn from it and make your systems more resilient over time. Through tabletop exercises, gamedays, on-call training and post-incident reviews, Tammy shows how teams can improve both people operations and technical operations around incident response.

Reading material mentioned in the show

Tammy's O'Reilly Book: Reducing MTTD for High-Severity Incidents: https://www.oreilly.com/library/view/reducing-mttd-for/9781492046202/
Gremlin's Chaos Engineering Slack Community: https://www.gremlin.com/slack/
Gremlin's Resources for Site Reliability Engineering: https://www.gremlin.com/site-reliability-engineering/
Tammy's Twitter Account: https://twitter.com/tammybutow
