Full Stack Site Reliability Engineer W2 Position

The job posting is outdated and position may be filled
Job Location

Alexander City - USA

Salary

Not Disclosed

Job Description

We are looking for a Site Reliability Engineer in Dearborn, MI (Hybrid) for a 12-month contract. Please review the job description below and, if interested, share your resume ASAP.

Job Title: Full Stack / Site Reliability Engineer W2 Position

Location: Dearborn, MI (Hybrid)

Duration: 12 Months

Position Description:

We are seeking a talented Full Stack / Site Reliability Engineer to play a key role in developing a comprehensive Internal Developer Platform (IDP) that includes CI/CD pipelines, managed infrastructure, observability, and a developer portal. The primary focus of this role will be on ensuring the stability and scalability of the Internal Developer Platform that hosts the cloud applications that power our customers' connected vehicle experiences. The secondary focus of this role will be to facilitate the enablement of our product teams developing and supporting these cloud applications.

Responsibilities:

  • Strong background in software development and systems administration, as well as excellent problem-solving and communication skills.
  • Run a production environment by monitoring availability and taking a holistic view of system health.
  • Develop, improve, and operate the deployment and orchestration of a complex distributed system.
  • Improve the reliability, quality, and time-to-market of our suite of software solutions.
  • Measure and optimize system performance with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
  • Provide primary operational and engineering support for multiple large distributed software applications.
  • Identify and reduce or eliminate toil via automation to maximize the time spent on engineering and innovation.
  • Collaborate with development teams to design, build, and operate scalable and resilient software systems.
  • Automate build, deployment, monitoring, and incident response processes.
  • Perform root cause analysis of production incidents and implement preventive measures.
  • Participate in an on-call rotation for incident response and support.
  • Ensure compliance with security and regulatory standards.
  • Conduct performance analysis and optimization of the system.

Skills Required:

  • Understanding of gRPC and RESTful APIs and microservices platforms

Experience Required:

  • 5-6 years of experience with Golang, Java, J2EE, NoSQL/SQL datastores, Spring Boot, GCP/AWS/Azure, and Docker/K8s in the maintenance and development of multi-tier applications.
  • 4-5 years of experience with APM and other monitoring tools such as Grafana Cloud, Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, Datadog, and PagerDuty.
  • Strong experience working with product and development teams to establish error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of the budget to ensure maximum domain availability/uptime.

Experience Preferred:

  • Regularly review key site technical metrics such as transactions, errors, logging, response times, caching strategies, conversion/bounce rates, capacity, and resource utilization.
  • Proactively identify stability risks and work with engineering leadership to establish appropriate mitigation plans.
  • Experience in solving complex architecture/design and business problems; work to simplify, optimize, remove bottlenecks, etc.
  • Architect, design, and develop automation to reduce toil and improve recoverability, availability, latency, and scalability of supported applications, with an understanding of MTTD (Mean Time to Detection) and MTTR (Mean Time to Resolution).
  • Maintain a knowledge repository that includes standard operating procedures, release checklists, and runbooks for incident recovery.

Education Required:

  • 4-year college degree in Computer Science or equivalent experience

Employment Type

Full Time
