Xero Senior Engineer - SRE (Reliability Enablement), Melbourne CBD, Melbourne

This is a Senior Engineer - SRE (Reliability Enablement) role with Xero, one of the leading companies in AU right now, with an amazing team. They are continuing to grow rapidly. This is the chance to join right as it takes off.

More About the Role at Xero

About the team

Reliability Enablement (AKA Reliability Rangers)

As a member of our Reliability Enablement team at Xero, you’ll help teams deliver a great customer experience through a better understanding of the behaviour and operation of their systems. We do this through a focus on post-incident analysis and advocating for learning from incidents, as well as engaging with teams across the organization with specialized reliability enablement and consulting, and running SRE workshops and training.

There will be a lot of variety in your work as part of Reliability Enablement, as you may be embedded within an engineering portfolio, or ‘home’ in our central reliability enablement team. Regardless of your current focus, you will be an advocate for reliability and incident learning, as well as an active member of our SRE On Call function, providing specialist incident commander capabilities for complex major and critical incidents.

How you’ll make an impact

- When you are ‘home’ in the central reliability enablement team you could be doing any combination of the following:
  - Investigating operational surprises and supporting teams in post-incident activities
  - Conducting in-depth incident analysis and maximizing post-incident learning across the organization
  - Completing short-term reliability consultancy and enablement engagements such as SLO reviews and facilitating pre-mortems
- When you are an embedded SRE you could spend several months immersed with a product engineering portfolio, working alongside teams to uplift system reliability and robustness through the following:
  - Improving on-call health, uplifting observability and addressing any operational hotspots
  - Identifying, planning and leading implementation of reliability uplift work and initiatives
  - Supporting delivery of strategic features and initiatives with reliability and distributed systems expertise
  - Observing and improving rituals and practices relating to production operations, incident response and incident learning

What you’ll bring with you

Required

- Solid experience in logging, monitoring and observability of a highly distributed system
- Leading incident management, response and troubleshooting efforts, including critical, complex and high-severity incidents
- Post-incident reviews, incident analysis and learning from incidents
- Experience working in a tech or product company with comparable scale and complexity
- Systems thinking: reasoning about how systems and components interact and how they respond to failure
- Proficiency in one or more object-oriented programming languages (C#, JavaScript, Java, Python etc.) or experience with infrastructure-as-code (e.g. Terraform, CloudFormation)

Preferred

- Experience working with cloud providers such as AWS, Azure or GCP
- Experience with designing, developing and operating distributed systems and large-scale software systems
- Strong experience delivering technical initiatives in an operational, site reliability or platform engineering capacity
- The ability to solve engineering challenges outside of your own team, including using influence rather than authority to enact change
- Demonstrated experience in reliability concepts like capacity management, autoscaling, deployment and release safety, software strategies for reliability, fault tolerance and graceful failure
- Experience implementing customer-focused Service Level Objectives (SLOs)
- Experience using software engineering to solve operational and reliability challenges
- Understanding of human factors, safety science and resilience engineering
- Experience working in environments with advanced security and networks

If you don’t think you’re a perfect fit, you should still sign up to Hatch and create a profile, and we’ll match you to other roles that suit your profile.

Hatch exists to level the playing field for people as they discover a career that’s right for them. We model this in our hiring process for our partners like Xero.

✅ Applying here is the first step in the hiring process for this role at Xero.

We do not discriminate on the basis of gender identity, sexual orientation, cultural identity, disability, age, or any other non-merit factors. To put it simply, Hatch is for everyone.

Senior Engineer - SRE (Reliability Enablement) — Melbourne CBD, Melbourne Expired
