Site Reliability Engineer (SRE) - #1710274
Robert Walters

About the Role
Join a global team of engineers, operators, and Agile practitioners responsible for building and operating a world-class Data Loss Prevention (DLP) infrastructure, based in the Glasgow office. This role sits within the Cybersecurity organization and focuses on enhancing observability and telemetry across the DLP stack to support a cloud-first strategy while maintaining strong on-premises capabilities.
This is an exciting opportunity for engineers with strong SRE and monitoring experience, and also a great entry point for professionals looking to transition into cybersecurity.
Key Responsibilities
- Design and maintain Prometheus metrics collection and PromQL queries
- Build, review, and optimize Grafana and Splunk dashboards using observability best practices (e.g., Four Golden Signals, RED methodology)
- Refine alerting rules across tools like PagerDuty, Prometheus, and Splunk to eliminate noise and identify gaps
- Work closely with engineering squads to implement and maintain SLOs, SLIs, and error budgets
- Operate Prometheus in agent mode and troubleshoot issues
- Use telemetry data to generate actionable insights for the DLP teams
- Drive continuous improvement of monitoring and observability systems
- Participate in a 24/7 on-call support rota for DLP products
- Collaborate in a DevOps and Agile environment
Required Skills and Experience
- Strong hands-on experience with Prometheus and PromQL (3+ years)
- Expertise in Grafana dashboarding (3+ years)
- Solid experience with Splunk dashboarding and queries (3+ years)
- Deep understanding of observability and monitoring principles
- Familiarity with SRE practices, SLOs, SLIs, and error budget management
- Experience with PagerDuty or similar alerting/orchestration platforms
- Fluent in at least one programming or scripting language
- Knowledge of CI/CD tools (e.g., Jenkins, Bitbucket)
- Experience working in cloud environments (AWS or similar) or Unix/Linux systems
- Excellent collaboration, communication, and problem-solving skills
Nice to Have
Experience with:
- Cybersecurity or DLP products
- Incident, problem, and change management tools
- OpenTelemetry or telemetry pipeline tooling
- Automation and scripting for monitoring
- Working in Agile or operational environments
Why Join?
- Work on a globally distributed, high-impact security team
- Learn and grow in a DevOps-driven, cloud-first organization
- Transition into cybersecurity or expand your existing expertise
We are committed to offering an inclusive recruitment experience. If you require accommodations because of a disability or health condition, please email: gscemeaedi @ robertwalters.com. This position is being sourced through our Outsourcing service line.