About the Role
We are seeking a highly motivated and experienced DevOps Engineer to join our growing team. In this role, you will be responsible for the design, implementation, and maintenance of our cloud infrastructure, ensuring high availability, scalability, and security. You will work closely with our engineering team to automate deployments, manage infrastructure as code, and troubleshoot production issues.
This is a unique opportunity to work on cutting-edge technologies and contribute to the success of a rapidly growing company. We offer a fast-paced and collaborative work environment where you will have the opportunity to learn and grow your skills.
Responsibilities
* Design, implement, and maintain our cloud infrastructure on AWS.
* Automate infrastructure provisioning, configuration management, and application deployments using tools like Terraform.
* Implement and manage monitoring and logging solutions using Prometheus, Grafana, and other relevant tools.
* Develop and maintain internal tooling and scripts to improve operational efficiency.
* Troubleshoot and resolve production issues related to infrastructure, applications, and performance.
* Collaborate with engineering teams to implement and maintain CI/CD pipelines.
* Participate in the on-call rotation to ensure 24/7 availability of critical services.
* Stay up to date on the latest technologies and trends in cloud computing and DevOps.
Qualifications
* 3 years of experience in a DevOps or SRE role, with a strong understanding of cloud infrastructure and operations.
* Extensive experience with Kubernetes, including cluster administration, deployment strategies, and troubleshooting.
* Experience with OpenStack is highly desirable but not required.
* Proficiency in infrastructure-as-code tools like Terraform or Ansible.
* Strong scripting skills in Python or similar languages.
* Strong programming skills in Golang or similar languages.
* Strong configuration management skills with Salt, Chef, or similar tools.
* Experience with observability tools like Prometheus, Cortex, Grafana, and Loki.
* Experience with CI/CD tools and best practices.
* Experience administering and debugging Linux-based operating systems.
* Excellent problem-solving and troubleshooting skills.
* Strong communication and collaboration skills.
* Strong incident management experience.
Candidates must have the necessary authorisation to work in the US.
Bonus Points
* Experience with EKS (Elastic Kubernetes Service).
* Experience with Cluster API, Cluster API Provider for AWS, or Kamaji.
* Experience with managing on-premises infrastructure.
* Familiarity with OpenTelemetry and AI-powered observability tools.
* Experience working in a fast-paced startup environment.