Director - DevopsSRE

Job Location

Noida - India

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Job Title: Director, SRE, DevOps, Monitoring, and Database Operations

Key Responsibilities

Leadership & Strategy: Provide technical and people leadership to the SRE, DevOps, Monitoring, and Database Operations teams. Collaborate with leadership on budgeting, planning, hiring, and managing third-party contracts. Oversee project status, assemble project teams, and define assignments with schedules and milestones.

Platform Reliability & Performance: Drive continuous improvement of the reliability, stability, and performance of digital platforms. Oversee implementation of automated telemetry, observability, and applied intelligence systems. Lead efforts to develop automated alerting, self-healing mechanisms, and intelligent response systems.

Incident & Escalation Management: Ensure 24/7 uptime of sites and services with minimal unplanned downtime. Serve as Escalation Manager/Critical Incident Manager during major incidents, leading teams in rapid service restoration. Provide on-call escalation support based on 24/7/365 schedules. Communicate timely updates and incident reports to senior leadership.

Collaboration & Integration: Partner with administrators, platform engineers, and other stakeholders to achieve highly reliable infrastructure systems and integrations. Collaborate with product, application development, QA, and technology teams to enhance service reliability and performance.

Incident Management & Automation: Provide advanced Incident and Problem Management support to effectively diagnose, remediate, and resolve platform issues. Automate critical workflows across the platform to minimize manual errors and reduce human intervention. Implement ITIL processes such as Incident, Problem, and Change Management.

Monitoring & Scalability: Design and implement effective monitoring systems with proper alerting and escalation mechanisms for critical events. Ensure timely capacity planning and infrastructure upgrades for optimal reliability. Develop and refine processes to minimize Mean Time to Recover (MTTR) and extend Mean Time to Failure (MTTF).

Documentation & Compliance: Create and maintain detailed documentation, including run books, incident response guides, postmortem reports, RCAs, and mitigation plans. Ensure all changes adhere to established procedures and documentation standards.

Business Alignment: Understand business workflows and map technology solutions to address problems effectively. Lead conversations and provide technical support to both internal and external customers.

Employment Type

Full Time
