Design, develop, and optimize infrastructure and tooling to ensure high scalability, reliability, and sub-second performance, following industry best practices for security and DevOps
Write efficient and maintainable code and scripts to support Infrastructure as Code (IaC), configuration management, and automated incident resolution
Enhance and maintain the observability stack, ensuring effective monitoring, alerting, and logging for system performance and issue detection
Participate in on-call rotations and serve as an escalation point for critical service incidents, ensuring timely resolution
Develop and maintain comprehensive system documentation, including troubleshooting playbooks and operational manuals
Perform additional tasks and responsibilities as needed to support platform stability and efficiency
Qualifications:
Bachelor's or higher degree in computer science, computer engineering, or a relevant technical field, or equivalent practical experience
At least 4 years of experience coding in higher-level languages such as Python
Advanced experience with architectural solutions and system design
At least 4 years of administrative experience with Linux, AWS, and Kubernetes
Proficiency in analyzing and troubleshooting large-scale distributed systems
At least 4 years of experience in configuration management using CloudFormation, Terraform, Ansible, or similar tools