Distributed Computing Engineer

Employer Active

1 Vacancy

Job Location

Philadelphia, PA - USA

Monthly Salary

Not Disclosed

Job Description

Location: Philadelphia, PA (MUST be onsite; within 40 miles and a 60-minute drive)

Duration: 1 year

Job Summary

We are looking for a highly skilled, hands-on Distributed Computing Engineer to help build a platform for executing large-scale AI workloads across a fleet of distributed processors. This role involves designing and implementing systems for distributing tasks, managing state, and aggregating results efficiently. The ideal candidate will have strong engineering expertise in building distributed systems and a deep understanding of the design principles behind orchestration frameworks like Kubernetes, without being limited to specific tools. You'll work at the intersection of cutting-edge AI and distributed computing, helping shape the platform that powers next-generation AI workloads.

Key Responsibilities

  • Build and optimize a distributed computing platform for executing AI workloads across many nodes, ensuring scalability, reliability, and performance.
  • Implement systems for job scheduling, task distribution, state tracking, and fault-tolerant execution.
  • Apply distributed systems concepts such as partitioning, replication, consensus, and eventual consistency to ensure robust system behavior.
  • Design solutions inspired by modern orchestration frameworks (e.g., Kubernetes) while tailoring them to the unique requirements of AI workload distribution.
  • Write clean, efficient code in languages like Python, Go, or C to build high-performance distributed components.
  • Collaborate with other engineers to integrate distributed computing capabilities into larger AI pipelines.
  • Contribute to system monitoring and debugging tools to ensure real-time visibility into system health and performance.
  • Stay current with advancements in distributed systems, orchestration techniques, and AI model execution to bring innovative ideas to the team.

Qualifications

  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field (or equivalent experience).
  • Strong experience in building distributed systems with a focus on scalability, fault tolerance, and performance optimization.
  • Hands-on experience with concepts like task scheduling, state management, and distributed coordination protocols (e.g., leader election or consensus).
  • Familiarity with the design principles behind container orchestration frameworks (e.g., Kubernetes), including declarative configuration, automated scaling, and service discovery.
  • Proficiency in one or more programming languages such as Python, Go, or C for building high-performance systems.
  • Experience working with networking concepts (e.g., RPCs, gRPC) and designing communication between distributed components.
  • Experience with AI/ML workflows or large-scale data processing is required.

Preferred Skills

  • Experience deploying distributed systems on cloud platforms (AWS, GCP, Azure).
  • Knowledge of message queues (e.g., Kafka) or event-driven architectures for task distribution.
  • Familiarity with debugging distributed systems using logging/observability tools like Prometheus or Grafana.

Employment Type

Full Time