Observability Platform Lead
Job description
Let’s create a more sociable future together
At Endeavour, we’re totally into what we do. With a portfolio that includes Dan Murphy’s, BWS, ALH Hotels, Pinnacle Drinks and more, we love to bring people together. Together we share our passion for our products and industry; it’s what inspires us to dream big, and continue to create new experiences for our customers and teams across Australia. If you thrive on positive energy, we want to meet you!
- Bring your passion and feel the energy
- This is just the start, so dream big
- Flexible/Hybrid working
The Observability Tech Lead is responsible for driving the implementation and ongoing enhancement of observability practices within Endeavour Group (EGL). The role ensures that EGL applications function optimally and that issues are identified and mitigated quickly. The Observability Tech Lead is critical to understanding and monitoring the internal state of EGL applications from the data that observability tools generate (logs, metrics, traces, etc.). The person in this role focuses on building a robust monitoring, logging, and tracing infrastructure so that applications are transparent, reliable, and easy to debug.
Sound good? Read on.
Here is a taster of what you can expect in this role:
Strategic Vision & Planning
Define observability goals: Develop a clear vision for observability, ensuring alignment with EGL’s operational, product and service delivery goals.
Tech stack decisions: Evaluate, select, and implement observability tools and frameworks (e.g., Dynatrace, Prometheus, Grafana, Datadog, OpenTelemetry, Jaeger, ELK Stack).
Roadmap development: Plan the evolution of observability practices in line with Endeavour Group’s growth and changing infrastructure.
Architecture & Design
Design observability infrastructure: Lead the design and implementation of centralized logging, monitoring, and tracing infrastructure.
Instrumentation design: Work with EGL’s application teams to ensure proper instrumentation is in place across applications and services (ensuring consistency and completeness in metrics, traces, and logs).
Integration with CI/CD pipelines: Ensure observability features are integrated into the CI/CD pipeline to promote best practices in automated monitoring and alerting.
Cross-functional Collaboration
Mentoring teams: Guide EGL’s application teams in understanding observability best practices, tool usage, and troubleshooting methods.
Collaborate with DevOps/SREs: Work closely with DevOps, Site Reliability Engineers (SREs), and other stakeholders to create systems that are not only observable but also resilient and scalable.
Incident response: Lead efforts in incident postmortems, working with teams to understand failure scenarios and refine observability to help prevent similar issues in the future.
Metrics, Alerts, and Dashboards
Define key metrics: Determine critical metrics (such as latency, error rates, throughput, resource utilization, etc.) and set up appropriate collection, storage, and alerting for those metrics.
Alerting & incident management: Implement intelligent alerting that reduces noise and ensures timely responses to issues, while continuously refining alert thresholds.
Dashboards: Design and maintain intuitive, actionable dashboards for different stakeholders, including engineers, product teams, and executives.
Performance & Optimization
Performance tuning: Continuously monitor performance metrics and work with engineering teams to optimize code and infrastructure.
Cost optimization: Help manage and optimize costs related to observability, ensuring that tools and services used for monitoring and logging are efficient and not over-provisioned.
Continuous Improvement & Automation
Automate observability: Automate the deployment, scaling, and management of observability tools to ensure they're available when needed and reduce manual overhead.
Continuous feedback loop: Iterate on the observability system based on feedback from incident responses, performance metrics, and evolving requirements.
Security and Compliance
Data privacy: Ensure that observability data (logs, traces, and metrics) complies with legal and regulatory requirements, especially in areas like data retention and user privacy.
Security: Make sure observability tools and processes are secure and that access to sensitive operational data is restricted appropriately.
Team Leadership & Development
Leadership and mentoring: Mentor and grow a team of observability engineers or specialists, promoting best practices, innovation, and a culture of continuous learning.
Documentation & training: Maintain documentation for observability systems, procedures, and troubleshooting methods so that teams can use them effectively.
Now let’s talk about you:
Tool Expertise: Proficiency in observability tools like Dynatrace, Prometheus, Grafana, Datadog, New Relic, OpenTelemetry, ELK Stack, Jaeger, etc.
Technical Expertise: Significant experience in software engineering, system design, or DevOps, with at least 5 years focused on observability and monitoring in production environments.
Platform Proficiency: Strong understanding of distributed systems, microservices, and cloud architectures. Familiarity with container orchestration platforms (e.g., Kubernetes) and cloud platforms (AWS, GCP, Azure).
ITIL Proficiency: Solid understanding of ITIL 4 principles, particularly in incident, problem, and change management processes.
Data Analysis and Reporting: Skilled in creating reports and dashboards to support data-driven decision-making.
Automation and Scripting: Familiarity with JavaScript, Glide API, and ServiceNow automation workflows to streamline IT processes.
The benefits are good too!
- We offer flexible working in every sense
- An exclusive discount card for BWS, Dan Murphy’s, Woolworths, BIG W and other Endeavour Group brands, including our ALH pubs
- Monthly meeting-free days
- Your health and wellbeing are your most important assets, and as one of our valued team members, they are our first priority. You will have a range of free services to help you live well and support your physical, mental and financial wellbeing
- Endeavour Group is full of opportunities - use our dedicated learning and development options to grow an idea, yourself, and your career. This is just the start, so dream big.
At Endeavour, we value being a workplace where everyone’s welcome - if you meet a number of the requirements (but not all), we encourage you to apply.
We are together creators
With a portfolio that includes Dan Murphy’s, BWS, ALH Hotels, Pinnacle Drinks and more, Endeavour Group is big on sociability. Together we create the moments that bring millions of people together. And together we have more fun, create more opportunities, and score a lot more goals. We’re serious about creating a safe, inclusive and fun place to rock up to where equal opportunity is key, and flexibility is part of how we roll.
We’re all about creating a more sociable future - for our customers and each other. If this job excites you and you’re close enough on the requirements, reach out. We’d love to hear from you.
You can learn more about working with us on LinkedIn or at endeavourgroupcareers.com.au.
Our Talent Team and Hiring Leaders kindly request no unsolicited resumes or approaches from Recruitment Agencies. Endeavour Group is not responsible for any fees related to unsolicited resumes.
#WeAreTogetherCreators #ComeAsYouAre #DreamBig #FeelTheEnergy #LeaveYourMark #EndeavourGroup