Responsibilities:
Create an SRE backlog: the broad set of things that must be done right to implement SRE at scale
Work with senior stakeholders to agree and drive the backlog
Liaise and manage dependencies across teams while implementing the backlog, staying on course and flagging any derailment early
Apply software engineering practices to develop and implement services that improve IT and support teams, ranging from production code changes to alerting and monitoring adjustments.
CI/CD pipeline development and optimization
Building proprietary tools from scratch to mitigate weaknesses in incident management or software delivery.
Troubleshooting and support escalation: routing escalations to the relevant teams.
On-call process optimization via automation
Documenting Knowledge
Optimizing SDLC
Skills:
Deep working knowledge of monitoring and log analytics tools such as Dynatrace, Splunk, Grafana, CloudWatch, and X-Ray
Coding: proficiency in scripting languages such as Python or Ruby for developing automation tools.
Mastery of cloud infrastructure, particularly AWS services (EC2, S3, RDS, VPC, Lambda, CloudFormation, AWS CLI)
CI/CD expertise: setting up and maintaining CI/CD pipelines to automate testing and deployment processes. Skills with tools such as Jenkins, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline are essential.
Communication: communicate with different teams to report and address incidents, explain technical concepts, negotiate reliability standards, and manage team relationships. The role involves interacting with software engineers, product teams, managers, CEOs, and CTOs.
Remote Work: No
Full Time