This is a remote position.
Site Reliability Engineer
£55,000 - £75,000 + Benefits
(Remote UK)
Our client is an innovative fintech company revolutionising the transfer agency landscape through cutting-edge technology. They simplify complex financial processes, reduce operational risks, and provide top-tier solutions to their global client base.
Our client is seeking an experienced Site Reliability Engineer (SRE) to join their global SRE team. This role is critical in ensuring the health of production environments, maintaining Kubernetes clusters, monitoring storage systems, and identifying and resolving defects through proactive code changes.
Global SRE Team Collaboration:
Collaborate with a globally distributed SRE team to ensure the reliability and scalability of production environments.
Infrastructure Maintenance:
Maintain the infrastructure and services that support cloud-based applications, ensuring uptime and optimal performance.
Oversee Kubernetes clusters and ensure their health and reliability.
Automation and Monitoring:
Automate deployment and configuration processes using tools like Terraform and Ansible.
Develop and implement monitoring systems with tools such as CloudWatch, Grafana, and Prometheus to identify potential issues before they impact users.
Troubleshooting and Problem-Solving:
Identify, troubleshoot, and resolve defects in production systems, making necessary code changes to improve stability and performance.
Handle complex issues related to networking, security, and application performance.
On-Call Support:
Participate in an on-call rotation to provide 24/7 support for critical production systems.
Must Have:
At least 5 years of experience as a Site Reliability Engineer or in a similar role.
Strong expertise in AWS services, including EKS, S3, RDS, Lambda, EC2, and others.
Proficiency in Kubernetes management and troubleshooting.
Knowledge of, or ability to code in, Java, Python, or Go (Golang), or working knowledge of Kafka.
Hands-on experience with automation tools such as Terraform and Ansible.
Familiarity with monitoring tools like CloudWatch, ELK Stack, Grafana, and Prometheus.
Comprehensive knowledge of networking and security principles, including firewalls, VPNs, and SSL/TLS.
Excellent troubleshooting and problem-solving abilities.
Strong written and verbal communication skills.
Desirable:
Experience with code-level debugging and fixes to ensure operational stability.
Familiarity with agile and DevOps workflows.
Experience in the financial services industry.
Benefits:
Competitive salary between £55,000 and £75,000.
Company Pension
Generous annual leave allowance
Opportunity to work in a fully remote, collaborative global team.
Comprehensive health and wellness benefits.
Ongoing professional development and training opportunities.
Full Time