Site Reliability Engineer

Employer Active

This job posting is outdated and the position may be filled.

Job Location

others - USA

Monthly Salary

Not Disclosed

Job Description

Technical/Functional Skills

  • Experience in implementing SRE solutions in areas of monitoring, resiliency, incident management and automation.
  • Experience resolving issues in areas such as user applications, cloud platforms, system uptime, system recovery, and performance.
  • Strong hands-on experience developing applications using Java, NodeJS/AngularJS, etc.
  • Deep understanding of, and experience with, microservices, APIs and web services.
  • Strong scripting experience with Bash, Python, Go, etc. to automate work and reduce toil in Java/NodeJS runtime environments.
  • Exposure to monitoring tools, including setting up RUM, APM, synthetic monitoring, infrastructure monitoring, and alerting.
  • Hands-on experience setting up dashboards and providing analytics on cloud platforms such as Azure or AWS.
  • Strong hands-on experience with Real User Monitoring (RUM) tools such as Dynatrace, New Relic, AppDynamics, etc.
  • Experience with synthetic monitoring tools such as Dynatrace, Blue Triangle, Sematext, etc.
  • Dashboarding experience with log monitoring tools such as Splunk, SolarWinds, etc.
  • Experience handling infrastructure issues in cloud-native applications using Docker, Kubernetes, etc.
  • Experience supporting the healthcare domain.
  • Experience with CI/CD pipelines using Jenkins and GitHub.
  • Excellent verbal and written communication skills.
  • Certifications in any Monitoring tools and Cloud platforms are preferred.

Roles & Responsibilities

  • Responsible for toil reduction in processes involving testing, release management, change management and incident management.
  • Hands-on experience with identifying required dashboards for Java and NodeJS services.
  • Familiar with resolving JavaScript and user application issues in Adobe Experience Manager applications.
  • Hands-on experience with setting up dashboards using JSON scripting.
  • Create rules to optimize incident response through metrics, streamlined alert flows, and collaboration and communication across squads and dev teams.
  • Hands-on experience in scripting for setting up log monitoring dashboards.
  • Proactively identify the issues that might disrupt the service in production.
  • Experience leading the response to P1 production issues as an SRE Lead/Manager.
  • Hands-on experience leading Root Cause Analysis (RCA) calls with stakeholders.
  • Hands-on experience setting up monitoring threshold metrics and alerts in cloud tools (Azure or AWS). Certifications are preferred.
  • Hands-on experience with SSH monitoring of pods and containers in Kubernetes.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Balance feature development speed and reliability using well-defined service level objectives and indicators (SLOs, SLIs).
  • Define and monitor service level metrics, including incident management KPIs such as MTTD, MTTR, MTBF, MTTF, unavailability rate, incident count, etc.
  • Debug production issues across services and API endpoints.
  • Build reports using analytics tools such as Adobe Analytics or cloud analytics.
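
For candidates unsure what the incident management KPIs listed above involve in practice, the following is a minimal Python sketch of how a few of them might be computed from incident records. It is illustrative only and not part of the employer's tooling: the `Incident` record, its field names, and the `incident_kpis` function are hypothetical. It assumes that incidents do not overlap, that they fall entirely inside the reporting window, and that MTTR is measured from the start of the fault to restoration (some teams measure it from detection instead).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    started: datetime    # when the fault began
    detected: datetime   # when monitoring or a person noticed it
    resolved: datetime   # when service was restored


def incident_kpis(incidents, window_start, window_end):
    """Compute incident KPIs for one reporting window.

    Assumes non-overlapping incidents that lie fully inside the window.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    count = len(incidents)
    window = window_end - window_start
    # Total time from fault start to detection, summed over all incidents.
    detect_total = sum((i.detected - i.started for i in incidents), timedelta())
    # Total time from fault start to restoration (downtime).
    downtime = sum((i.resolved - i.started for i in incidents), timedelta())
    return {
        "incident_count": count,
        "mttd": detect_total / count,            # mean time to detect
        "mttr": downtime / count,                # mean time to restore
        "mtbf": (window - downtime) / count,     # mean uptime per failure
        "unavailability_rate": downtime / window,  # fraction of window spent down
    }
```

For example, two incidents in a 30-day window, one lasting 60 minutes and detected after 10 minutes, the other lasting 120 minutes and detected after 20 minutes, give an MTTD of 15 minutes, an MTTR of 90 minutes, and an MTBF of 358.5 hours under these definitions.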

Employment Type

Full Time

Company Industry

About Company

100 employees