A Day In The Life Of A Site Reliability Manager At Google
A Site Reliability Manager's day at Google begins early with "ongoing incidents": coordinating with global teams to resolve them and making sure the necessary resources are allocated. Much of the work involves long-term planning, "organizing what lots of people are going to do over the course of many, many months," alongside typical managerial responsibilities, with coding and documentation taking a secondary role.
International Collaboration, Incident Management, Project Management, Leadership and Mentorship, Technical Skills
Advizer Information
Name: David Fayram
Job Title: Site Reliability Manager
Company: Google
Undergrad: University of California, Santa Barbara
Grad Programs: None
Majors: Computer Science
Industries: Energy & Utilities, Technology, Advertising, Communications & Marketing
Job Functions: Cyber Security and IT
Traits: Took Out Loans, Worked 20+ Hours in School, LGBTQ
Video Highlights
1. A Site Reliability Manager (SRM) at Google works internationally, collaborating with colleagues across the globe, often starting the day early to address incidents.
2. A large part of the job involves coordinating resources (SREs and software developers) to resolve issues and analyzing the long-term implications of incidents.
3. The role combines technical problem-solving, including coding and documentation, with significant managerial responsibilities such as meetings, career development discussions with team members, and strategic planning.
Transcript
What does a day in the life of a Site Reliability Manager look like?
It's an international job, so we have to work with counterparts, mostly in Europe, but all over the world. We're up early because there are ongoing incidents.
We reconcile those, take ownership of them, and ensure the resources needed to solve the incident are directed towards it. This involves coordinating amongst the SREs and also with software developers who help us troubleshoot and solve problems.
Then, we always have to consider the long-term implications of whatever just happened and address those. Generally, I get to have breakfast after that. I have a lot of meetings; that's just how it is.
Much of the work for more experienced software engineers at big firms involves organizing what many people will do over the course of many months. Along with that, there's the one-to-one manager work you would expect: talking with individuals, working with them about their careers, making sure their work is going well, and that they have what they need to do it. These are the things I spend most of my time on as an SRM.
Every so often I do get to write code and documents, though a lot more documents than code. I wish it was more, but I get paid to think about the people first, so that's where my attention lies.
