SRE Manager Jobs in Entain in Any - India

SRE Manager

Entain

Posted on : 20-12-2024

1 Vacancy

Job Location

India

Monthly Salary

Not Disclosed

Job Description

As an SRE Manager, you will focus on ensuring the reliability, performance, and scalability of services and infrastructure.

Reporting to the Head of Engineering, you will be part of the Product & Technology team and will actively participate in all aspects of Site Reliability Engineering, including technical vision, telemetry and observation decisions, automation strategy, solution delivery, and platform incident and problem management. This is a leadership role with both technical and people leadership responsibilities. As such, this role participates in short- and long-term systems, team, and organizational planning. This position reports directly to the Director of Engineering.

What you will do

Provide technical and people leadership to the SRE teams by facilitating one-on-one, team, and performance review meetings.
Fulfil the role of Escalation Manager/Critical Incident Manager on critical/major incidents by facilitating quick and effective incident resolution to minimize player and business impact.
Conduct RCA and Post-Incident Reviews (PIRs) in a blameless manner to identify root causes and prevent recurrence.
Build advanced Incident Management and Problem Management support (SOPs and runbooks) to effectively identify, remediate, and resolve issues related to platform reliability, stability, and performance through careful analysis of telemetry data and system logs.
Continuously work to improve problem identification and service restoration of platforms by leading and overseeing efforts to define, enhance, and deliver automated alerting and response systems with intelligent self-healing capabilities.
Collaborate with platform engineers through implementation decisions to achieve highly reliable infrastructure, systems, and integrations (develop synthetic monitoring, health dashboards, reliable alerts, and system performance).
Promote automation (CI/CD) and infrastructure-as-code (IaC) practices, and develop tools and processes for seamless deployments, rollbacks, monitoring, and troubleshooting.
Define and ensure proper reviews are built to minimise the Mean Time to Recover/Discover (MTTR/MTTD) and improve the Mean Time to Failure (MTTF).
Work with development teams to set error budgets, SLIs/SLOs, and policies. Work with SRE teams to implement alerts and policies to minimize the impact that failures and outages have on players.

Qualifications:

Graduate or Postgraduate with a strong engineering background.
10 years of experience working in global organizations, with the ability to effectively communicate with executives, leaders, and individual contributors across the organization.
5 years of SRE experience working with telemetry, observation, self-healing solutions, and platform automation.
Proficient in analysing complex technical issues, identifying root causes, and implementing effective solutions under pressure.
Experience with monitoring, logging & telemetry tools like New Relic, Splunk, ELK, Nagios, Prometheus, AWS CloudWatch, Datadog, etc.
Experience in Disaster Recovery and Chaos Engineering with tools like Chaos Mesh and Chaos Monkey, and in periodically testing resiliency and failovers.
Hands-on experience in the monitoring of Exposure with automation and tools such as (but not limited to) GitLab CI, Jenkins, Terraform, Ansible, etc.
Expert in designing, creating, and supporting automation (PowerShell, Python, Ruby, AWK, SED, etc.) to run health checks and self-healing capabilities for the platforms.
Experience with networking, Content Delivery Networks (CDNs, e.g. Akamai, Cloudflare), streaming platform technologies like Apache Kafka, and databases (Oracle, MS SQL, etc.).
Experience with Cloud platforms, esp. Amazon Web Services (AWS).
Application Security: the practice of safeguarding applications through access control, AuthN & AuthZ, data encryption, and secure communication using TLS/SSL and mTLS.
Collaboration & Change Management tools: Jira, ServiceNow, SharePoint, etc.
Experience in managing relationships with third-party vendors and service providers contributing to the business.

Employment Type:

Full-time

About Company

Entain
