The candidate can be based anywhere in the EU.
Overview:
The Site Reliability Engineer plays a crucial role in ensuring the reliability, scalability, and performance of our systems and applications. This fully remote position, based within the EU, is integral to maintaining the operational excellence of our technology infrastructure.
Key Responsibilities:
- Collaborate with engineers throughout the company to assist in troubleshooting issues, deploying solutions, and increasing their productivity.
- Design and optimize both public and internal cloud infrastructure using tools like Terraform, Ansible, and Git.
- Manage cloud computing, Kubernetes, and database projects.
- Contribute to and maintain a self-healing, highly available, and scalable system.
- Continuously improve CI/CD pipelines and the monitoring stack.
- Monitor and ensure the health of all environments, setting up alerts and responding to incidents.
- Participate in on-call rotations, incident response, and postmortems.
- Learn new technologies quickly and adapt to a fast-paced environment.
Required Qualifications:
- Collaborative team player with cross-cultural experience.
- Excellent communication skills for conveying technical concepts, sharing knowledge effectively, and crafting clean documentation.
- Adaptability to evolving technologies and dynamic environments.
- Strong analytical and troubleshooting skills.
- Experience with AWS, Kubernetes, Terraform, Ansible, Git, and monitoring tools.
- Experience with large-scale distributed systems and cloud computing.
- Proficiency in infrastructure-as-code, GitOps, and CI/CD.
- Must have an interest in Fintech and cryptocurrency.
Our stack:
- Main stack: AWS, Kubernetes, PostgreSQL, Kafka
- Configuration: Terraform, Ansible
- Observability: OpenSearch, Prometheus, Grafana, OpenTelemetry
- CI/CD stack: GitOps, GitLab, Argo CD
- Networking: certificate management, DNS