DevOps Foundations: Site Reliability Engineering

1h 20m, Advanced, 2022-04-14

Authors

Ernest Mueller

Director of Engineering at Six Nines IT

James Wickett

Security Engineer and supporter of rugged software and DevSecOps

Course details

Site reliability engineering (SRE) is an emerging paradigm in DevOps. The biggest names in tech, including Google, Netflix, Microsoft, and LinkedIn, all use SRE. In fact, across the industry, "site reliability engineer" is replacing "DevOps engineer" in job posts. Simply put, SRE is software engineering applied to operations for the cloud-native era. This course introduces the basics of site reliability engineering, including how SRE fits into DevOps and how it can be integrated into your unique business environment. Instructors Ernest Mueller and James Wickett cover the major areas of expertise, including release engineering, change management, incident management and retrospectives, self-service automation, troubleshooting, performance, and deliberate adversity. Learn how to define reliability through SLAs and SLOs, handle crises, design distributed systems, and scale your systems and your team. Plus, explore time and project management strategies that bring humanity back to the SRE's job.

Learning objectives

Site reliability engineering basics
Release engineering
Change management
Incident management
Postmortems
Troubleshooting
Distributed design
Organization

Skills covered

DevOps Foundations
Server Administration
DevOps
Foundations
Network and System Administration

Concepts

0. Introduction

01 - Reliability engineering basics
02 - What you should know

1. SRE Basics

03 - Your job as a DevOp
04 - You aren't Google or Netflix

2. SRE Practice Areas

05 - Release engineering
06 - Change management
07 - Self-service automation
08 - SLAs and SLOs
09 - Incident management
10 - Introducing postmortems
11 - The postmortem process
12 - Troubleshooting
13 - Performance engineering
14 - Capacity and scalability
15 - Distributed design
16 - Deliberate adversity

3. SRE Organization

17 - Organizing SREs
18 - The softer side of SRE

Conclusion

19 - Next steps