At IFS R&D, our DevOps Engineers play a pivotal role in ensuring that our software solutions are not only functional but also reliable and efficient in production. This role goes beyond traditional DevOps by embedding Chaos Engineering into our practices to proactively identify and address system vulnerabilities, thus ensuring resiliency.
As a Senior DevOps Engineer, you will take the lead in designing, developing, deploying, and maintaining robust chaos engineering, process automation, and tooling solutions. Your work will focus on driving efficiency through automation, ensuring seamless integration of continuous integration, testing, and deployment processes, and fortifying the resilience of our operational platforms.
Duties:
- Collaborate with the Quality Assurance and Software Engineering teams to design and implement chaos experiments, fault scenarios, and resiliency probes.
- Set up and maintain a chaos engineering platform using Litmus integrated with observability tools and load generation systems.
- Execute chaos engineering experiments within CI/CD pipelines to uncover system weaknesses and enhance fault tolerance.
- Identify and implement opportunities for automation and process improvements to optimize team workflows and system reliability.
- Work cross-functionally to incorporate industry-leading DevOps and Chaos Engineering practices into team operations.
- Create and maintain comprehensive documentation, including processes, tools, and training materials, to support system reliability and automation initiatives.
- Proactively monitor infrastructure, applications, and databases to identify and resolve issues before they affect production environments.
- Promote a culture of continuous learning by sharing insights, mentoring colleagues, and contributing to market research and competitive analysis efforts.
Qualifications:
- University or equivalent qualification in Computer Science, Software Engineering, or Information Technology.
- Proven ability in a similar role with hands-on exposure to Chaos Engineering concepts and practices.
- Proficiency in working with cloud-based, multi-tenant infrastructure (Azure, MongoDB, Kafka, etc.).
- Expertise with Infrastructure as Code (IaC) tools such as Terraform or Pulumi.
- Exposure to GitOps with Argo CD.
- Proven ability with Chaos Engineering tools like Litmus, Gremlin, or Chaos Monkey.
- Expertise in containerization and orchestration (e.g., Docker, Kubernetes).
- Strong scripting skills in Bash, PowerShell, Python, Ansible, or similar languages.
- Operational expertise with Linux systems and cloud monitoring tools.
- Familiarity with Git, Bitbucket, and/or Azure Pipelines.
- Exposure to monitoring/observability tools and strategies for cloud infrastructure, applications, and databases.
- Quality assurance proficiency, including testing, automation, and test data management.
- A passion for learning new technologies, solving challenges, and enhancing process reliability.
Remote Work:
No
Employment Type:
Full-time