Job Description:
The Public Cloud Support Engineering team is seeking an experienced Senior Cloud Operations Expert to provide advanced support resolution for complex issues related to Multi-Cloud (AWS/GCP/Kubernetes/OpenShift/Terraform/Networking) infrastructure and technical assistance for a team of 15-20 support engineers.
The Operations Expert will be responsible for managing and ensuring the efficient operation of our public cloud infrastructure. This role requires a deep understanding of AWS, GCP, Kubernetes, Terraform, networking, and OpenShift, as well as custom-developed automation frameworks, excellent leadership abilities, and the capacity to make strategic operational decisions.
You will work closely with the support team, developers, and operations teams to ensure the smooth operation of cloud services and resolve critical incidents.
A suitable candidate is extremely customer-focused, can multitask, and can use both written and verbal communication skills to help our diverse range of customers resolve their complex technical issues.
Responsibilities:
- Provide operations support for Multi-Cloud (AWS/GCP) infrastructure and applications, including troubleshooting and resolving complex issues.
- Be technical and analytical, with experience driving technical teams. In addition, this person will have a record of driving projects to improve support-related processes and the technical support experience, and will be passionate about the growth and success of HV.
- Collaborate with cross-functional teams, including developers and operations teams, to diagnose and resolve critical incidents.
- Monitor and maintain the performance, reliability, and security of Multi-Cloud (AWS/GCP/Kubernetes/OpenShift/Terraform/Networking) services.
- Deploy, configure, and maintain cloud-based solutions using Multi-Cloud (AWS/GCP/Kubernetes/OpenShift/Terraform/Networking) services, ensuring adherence to best practices and security guidelines.
- Conduct root cause analysis for major incidents and implement preventive measures.
- Participate in on-call rotations and respond to critical incidents promptly.
- Continuously improve and automate operational processes and workflows to enhance efficiency and scalability.
- Collaborate with Multi-Cloud (AWS/GCP/Kubernetes/OpenShift/Terraform/Networking) support and engineering teams to escalate and resolve complex issues.
- Document processes, procedures, and troubleshooting steps for future reference.
Skills and Qualifications:
- Bachelor's degree in computer science, engineering, or a related field.
- 7 years of experience working on large enterprise technical customer escalations in a high-paced operations environment, with 5 years as an Operations Expert with a track record of success.
- At least 5 years of experience in 24 x 7 AWS/GCP Cloud Production Support roles.
- At least 3 years of experience with Linux/Unix systems administration.
- At least 2 years of experience with Docker/Kubernetes and OpenShift orchestration.
- Strong experience in supporting and troubleshooting Multi-Cloud (AWS/GCP/Kubernetes/OpenShift/Terraform/Networking) infrastructure and services.
- In-depth knowledge of Multi-Cloud (AWS/GCP/Kubernetes/OpenShift/Terraform/Networking) services, including S3, VPC, IAM, Terraform, EKS, CloudWatch, MSK, OpenSearch, Direct Connect, Transit Gateway, and DynamoDB.
- Proficiency in scripting languages such as Python, Bash, or PowerShell for automation and troubleshooting.
- Knowledge of Kibana/Grafana/Prometheus alerting and monitoring tools.
- Experience with DevOps practices and tools like Git/Bitbucket, Jenkins, and Terraform.
Certifications:
- AWS Certified SysOps Administrator or AWS Certified Solutions Architect (mandatory).
terraform, powershell, networking, jenkins, dockers/kubernetes and openshift orchestration, linux/unix systems administration, kibana/grafana/prometheus alerting and monitoring tools, devops practices, bash, openshift, python, git/bitbucket, kubernetes, aws, gcp