Role: Senior System Reliability Engineer
Location: St. Louis, MO (Onsite)
Duration: Long Term
Job Description:
Provide L2 support for production systems, including application, database, and middleware components as well as infrastructure and network components.
Manage production incidents end-to-end within defined SLAs, with a focus on resolution rather than on who caused them.
Interact with various stakeholders such as release managers, program leads, service managers, and development and test leads.
Support the DevOps team in testing the promote pipelines and suggest automation of configuration items.
Follow incident management best practices and perform root cause analysis (RCA).
Participate in disaster recovery tests and operational acceptance tests.
Analyse the technology stack that makes up the product and optimize the recovery time objective (RTO).
Work with team members spread across time zones.
Share knowledge, document improvements, and mentor junior resources.
Requirements:
Responsibility Matrix
Deployments: MTF/Prod
Maintenance items (including stop/start, disaster recovery-related activities, etc.)
Monitoring
Support TRTs
Incident creation
CR for changes in MTF/Prod
Tools:
Log monitoring tool: Splunk
Application monitoring tool: Dynatrace
Ticketing and incident/problem management tool: Remedy
Linux
SQL
DevOps basics and CI/CD basics: overview of Git, Bitbucket, SonarQube, Fortify, CI (Jenkins), ARA, SaltStack, Chef, Artifactory, and the MC DevOps toolchain