Cloud Infra Control Plane Service Engineering Architect

Servsys Corporation

Posted on : 31-12-2024

Employer Active

1 Vacancy

Jobs by Experience

5 years

Job Location

USA

Monthly Salary

Not Disclosed

Job Description

This is a remote position.

Job Title: Cloud Infra Control Plane Service Engineering Architect

Location: Remote. Candidates in the Bay Area or Seattle will be prioritized. All candidates should expect to work 3 p.m. to 9 p.m. PST at least 2 days a week.

Duration: 6 months

The project centers on implementing an NVIDIA SuperPOD. Candidates with that experience will have a significant advantage.

Key Responsibilities:

Infrastructure Management:

Manage and monitor compute clusters, ensuring high availability and performance.
Implement and maintain automation scripts for infrastructure provisioning and management.

Design and Implementation:

Design, implement, and maintain compute services for both GPU and non-GPU environments.
Develop and optimize algorithms for high-performance computing tasks, especially in the AI/ML training and inference domain.

Performance Optimization:

Analyze and optimize the performance of compute workloads.
Implement best practices for resource utilization and efficiency.

Collaboration:

Work closely with data scientists, researchers, and other engineering teams to understand and meet their compute requirements.
Collaborate with hardware vendors to evaluate and integrate new technologies.

Security and Compliance:

Ensure that compute services comply with security policies and industry standards.
Implement and maintain security measures to protect data and compute resources.

Troubleshooting and Support:

Provide support for compute-related issues, including debugging and resolving hardware and software problems.
Develop and maintain documentation for troubleshooting procedures and best practices.

Continuous Improvement:

Stay updated with the latest advancements in compute technologies and integrate them into the infrastructure.
Continuously improve the reliability, scalability, and performance of compute services.

Qualifications:

Education:

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
NVIDIA and AI certification.

Experience:

Years of experience managing on-premises GPU or non-GPU systems.
Proven experience in managing and optimizing GPU and non-GPU compute environments.
AI infrastructure engineering skills, covering both building and operating.
Experience with high-performance computing (HPC) and parallel processing, including bare-metal and large-scale virtual environments.
Ability to implement virtualization architectures, leveraging expertise with Kubernetes distributions like OpenShift or Rancher and cloud technologies on bare-metal environments.
Proficiency in hardware technologies such as SR-IOV, DPU, and GPU, with proven experience implementing these technologies in virtualized and containerized environments.

Technical Skills:

Proficiency in programming languages such as Python, C++, or similar.
Experience with infrastructure as code (IaC) tools like Terraform, Ansible, or similar.
Familiarity with containerization and orchestration tools like Docker and Kubernetes.
Familiarity with the technologies underlying Kubernetes: CRI, CSI, CNI, Operators, the GPU device plugin, and RDMA/InfiniBand integration.
Knowledge of cloud platforms (AWS, Azure, GCP) and their compute services.

Soft Skills:

Strong problem-solving skills and attention to detail.
Excellent communication and collaboration skills.
Ability to work in a fast-paced, dynamic environment.

Employment Type

Remote

Key Skills

React Native
AI
Enterprise Software
React
Node.js
Redis
AWS
Software Development
iOS
Team Management
Product Development
Mobile Applications

About Company

Servsys Corporation

Disclaimer: Drjobpro.com is only a platform that connects job seekers and employers. Applicants are advised to conduct their own independent research into the credentials of the prospective employer. We ensure that our clients do not endorse any request for money payments, so we advise against sharing any personal or bank-related information with any third party. If you suspect fraud or malpractice, please contact us via the Contact Us page.
