Site Reliability Engineer at Lulalend

Lulalend
June 10, 2026
Full-time
On-site
OVERALL PURPOSE


We are seeking an experienced Site Reliability Engineer to join our team. The ideal candidate should have a deep understanding of Microsoft Azure, cloud computing, and distributed systems.
As a Site Reliability Engineer, you will be responsible for monitoring, maintaining and improving our Azure-based infrastructure and applications, ensuring their reliability, scalability, and security. You will act as the technical escalation point for both Junior and Intermediate Site Reliability Engineers and represent the Site Reliability team in CAB as an approver.
You'll also play a key role in guiding reliability practices, mentoring the team, and improving how we operate.


Responsibilities will include:


Monitor system health, alerts, and application behaviour, responding to incidents and contributing to root cause analysis and remediation
Triage and resolve service requests related to cloud infrastructure and applications in a timely manner
Build and continuously improve monitoring and alerting using Azure-native tooling (Azure Monitor, Log Analytics, KQL) to provide meaningful visibility into system performance
Analyse performance, reliability, and usage metrics to identify optimisation opportunities and potential risks
Partner with our internal Developers and DevOps teams to build, monitor and manage highly available, reliable, scalable and resilient architectures with high levels of visibility on Azure
Partner with Microsoft on complex remediation and improvement work in our Azure environment, as required
Identify gaps in logging, metrics, and tracing, and collaborate with developers to improve overall system visibility
Partner with our internal SecOps team to ensure the security of the Azure infrastructure and applications by implementing and enforcing security policies and best practices
Develop and maintain automation scripts and tools to streamline deployment and management of Azure services
Continuously research and evaluate new Azure features and services to optimise our infrastructure and improve our application development workflows
Participate in on-call rotation to provide 24/7 support for critical systems
Act as a monitoring resource for all Changes and Releases that take place during your on-call rotation, as required


THE COMPETENCIES WE'RE AFTER


Strong written and verbal communication skills
Ability to communicate complex technical concepts to non-technical stakeholders
Ability to work independently and as part of a team
A proactive, collaborative and detail-oriented approach to issues
A quick and hungry learner
Highly credible and trustworthy with an open and honest approach
Strong planning skills and ability to prioritise
Adaptable and flexible with resilience to change and ambiguity
Adaptable between proactive and reactive support in real time
Ability to mentor and grow others


THE SKILLS AND EXPERIENCE WE'RE LOOKING FOR


Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
3-5 years' experience in a Site Reliability Engineering, DevOps, or Software Engineering role
Strong understanding of Azure services such as Web Applications, Functions and Application Gateways
Experience working with cloud platforms (Azure preferred)
Experience with observability tooling
Experience in monitoring, logging and troubleshooting in Azure using App Insights, Azure Monitor, Log Analytics, Logic Apps and Query Performance measures in SQL Databases
Experience with automation tools such as PowerShell, Azure CLI and ARM templates
Strong troubleshooting and problem-solving skills
Excellent communication and collaboration skills to work with cross-functional teams
Familiarity with CI/CD pipelines and modern DevOps practices (e.g. GitHub Actions, Azure DevOps)


Nice to Have:


Experience with OpenTelemetry or vendor-neutral observability approaches
Experience with microservices or modular monolith architectures
Exposure to performance or load testing practices
Experience with tools such as Grafana and Prometheus or similar platforms