Description:
This is a Lead & Senior Site Reliability Engineer (Technical Duty Officer) role with one of the leading companies in AU right now, Xero, with an amazing team. They are continuing to grow rapidly. This is the chance to join right as it takes off.
More About the Role at Xero
**About the team**

Xero’s Incident and Problem Management team is part of the Site Reliability Engineering (SRE) organization and is responsible for the build, delivery and ongoing maintenance of robust process and tooling around incident management. The team drives enduring reliability at Xero through robust, consistent and fast response to high severity incidents. They are responsible for building a world class process and ensuring that process matures as the demands of the business grow.

**About the roles**

We're looking to hire multiple roles at Lead Engineer & Senior Engineer level. These positions require experienced SRE professionals with a strong technical background, deep experience in SRE, a passion for building and delivering robust processes, and extensive experience of leading technical response to high severity cloud issues. They will drive best practice across the business and contribute to the ongoing transformation of the Xero SRE culture. As expert communicators, they will lead technical discussions to identify and track actions arising from incidents.

Across our SRE function, we're looking for those who are keen to deep dive into the causes of incidents and proactively examine the potential causes of future incidents, working with engineering teams to remove the risk of those failure scenarios and ultimately building playbooks and automation to ensure quick and effective responses. In addition, they will provide ongoing training across the business to ensure the process is well understood and adhered to.

These roles will form the backbone of a new team, providing a Technical Duty Officer (TDO) function within the business. TDOs are incident commanders who use SRE skillsets to drive fast mitigation and enduring resolution of impactful events.

### **What you'll do:**

- Own the incident management process, ensuring it drives enduring reliability across all products and services within Xero.
- Provide expert leadership during critical outages, coordinating multiple teams to ensure streamlined decision-making and quick resolution.
- Lead and advocate for the transformation to a world-leading SRE organization, promoting SRE principles within the Engineering Department.
- Promote a customer-focused approach by addressing and mitigating global customer environment issues, and fostering a culture of continuous learning and technical excellence within the SRE team.
- Develop and implement scalable process frameworks and observability strategies to ensure rapid problem diagnosis, response, and service reliability.
- Collaborate with product teams to thoroughly analyze failures and integrate insights to improve service reliability, scalability, and operational efficiency.

### **What you'll bring:**

- Previous career experience as a Site Reliability Engineer, in an Operations or Engineering environment
- Hands-on experience troubleshooting AWS hosted services
- Networking knowledge and the ability to troubleshoot TCP/IP, SSL/TLS, DNSSEC, IPsec, and BGP issues
- Coding experience (preferably Python) building tools, scripting, or automation
- Strong communication (oral & written) skills, including the ability to translate technical issues/concepts into agreed actions
If you don’t think you're a perfect fit, you should still sign up to Hatch and create a profile; we'll match you to other roles that suit it.
Hatch exists to level the playing field for people as they discover a career that’s right for them. We model this in our hiring process for our partners like Xero.
✅ Applying here is the first step in the hiring process for this role at Xero.
We do not discriminate on the basis of gender identity, sexual orientation, cultural identity, disability, age, or any other non-merit factors. To put it simply, Hatch is for everyone.