Job Title: Site Reliability Engineer (SRE)
Location: Lisbon, Portugal
Work regime: Hybrid (3 days a week in the office)
You will join a global team supporting and delivering a world-class global execution platform. This role is crucial in maintaining our competitive edge by providing high-quality support and rapid solutions to our trading desks and external clients.
In this role, you will gain invaluable experience with modern financial markets, stock exchanges, organizations, and algorithmic high-frequency trading technology. You will work on innovative low-latency technology alongside top specialists in the field, developing skills related to system management and monitoring. This position offers a unique opportunity to deeply understand and influence the technologies that drive financial markets today.
Responsibilities:
- Develop and implement monitoring, alerting, and incident response strategies.
- Automate routine tasks and processes to reduce manual intervention and improve efficiency.
- Collaborate with software engineering teams to design and deploy reliable, scalable, and efficient systems.
- Deploy production changes with precision, ensuring minimal disruption to services and maintaining the integrity of the platform. Rotation-based weekend work may be required.
- Manage incidents, including detailed analysis and reporting, to maintain high service levels.
- Participate in on-call rotations to provide support for critical systems and services.
Candidate profile:
- IT engineering background (BSc/MSc degree in a related field)
- A few years of related work experience
- Preferably knowledge of financial markets and electronic trading
Essential skills:
- Strong knowledge of Unix/Linux systems and networking.
- Proficiency in programming and scripting languages such as Python, Go, Bash, or similar.
- Experience with monitoring and observability tools such as ITRS Geneos, Dynatrace, Prometheus, Grafana.
- Strong problem-solving skills and ability to troubleshoot complex systems.
Desirable skills:
- Log management: Splunk, ELK, Graylog, Loki
- Network monitoring: Corvil
- Databases: Oracle, PostgreSQL, MySQL/MariaDB, KDB/q
- Messaging: Tibco, Solace, IBM MQ, LBM, Kafka
- Experience with Infrastructure as Code (IaC) tools such as Ansible, Terraform, or similar.
- Proficiency in programming languages such as C, C++, Java
- Prior experience in a high-availability, high-traffic environment.
Language Skills: