Site Reliability Engineer

Job Location

Alexander City - USA

Salary

Not Disclosed

Job Description

Role: Site Reliability Engineer

Location: Boston, MA

Duration: Long-term contract

As a Site Reliability Engineer, you will be responsible for conducting Root Cause Analysis meetings, fostering a blame-free environment to ensure comprehensive information about events and their resolutions is gathered effectively. This role requires the ability to navigate complex technical issues while promoting open and transparent discussions among team members. You will utilize trends and metrics to identify improvement opportunities within existing frameworks, tools, and processes to improve systems continuously.

Responsibilities:

  • You will be part of the SRE team, which focuses on Root Cause Analysis of critical production outages to improve resiliency.
  • Lead problem tickets and improvements to major software components, systems, and features to improve the availability, scalability, latency, and efficiency of the client system.
  • Engage in and improve the service lifecycle, from inception and design to deployment, operation, and refinement, based on lessons learned through deep dives.
  • Hands-on troubleshooting of VMware, Kubernetes, and system software functionality, performance, and configuration issues.
  • Be a trusted technical advisor who leads complex root cause analysis investigations from beginning to end, through to implementation of improvements.
  • Demonstrate sound knowledge of gathering logs and facilitating root cause analysis with cross-functional teams.
  • Assist internal teams with corrective actions and improvement tickets, and influence the completion goals.
  • Flexibility to work occasional out-of-hours shifts, including weekends, may be required depending on criticality and workload demands.

Qualifications:

  • Bachelor's degree in software engineering, information systems, computer science, or a related field.
  • 10 years of experience working with ITSM tools such as Jira, ServiceNow, etc.
  • 8 years of infrastructure engineering experience with a record demonstrating hands-on troubleshooting in large-scale solutions, on-prem distributed systems, and custom-developed software applications.
  • 8 years of experience operating production systems, including troubleshooting, testing, and automation.
  • 5 years of experience leading technical Root Cause Analysis (software focus is a plus).
  • Team player with excellent communication skills and the ability to prioritize multiple tasks.
  • Experience with executive communication, report writing, and presenting to non-technical audiences.
  • Strong technical background in container technologies such as Kubernetes; detail-driven, with excellent problem-solving abilities.
  • Experience in the advanced use of tools like Prometheus, Grafana, LogicMonitor, Elastic, and Power BI is a plus.

Employment Type

Full Time
