Site Reliability Engineer

Employer Active

1 Vacancy

Job Location

Singapore - Singapore

Monthly Salary

Not Disclosed

Job Description

Role: Site Reliability Engineer (12-month renewable contract)

Experience: Minimum of 5 years

Location: Changi Business Park

Summary:

We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our growing Observability team. The ideal candidate will have a strong background in building and maintaining robust observability environments, including monitoring, logging, and tracing systems. This role will focus on the design, implementation, and support of our observability infrastructure, ensuring the seamless onboarding of applications and providing critical support during incidents.

Responsibilities:

  • Observability Environment Management: Design, build, and maintain our observability infrastructure, including monitoring tools, logging platforms, and distributed tracing systems (e.g., Prometheus, Grafana, Elasticsearch). This includes capacity planning, performance tuning, and ensuring high availability.
  • Application Onboarding: Work with development teams to onboard applications to our observability platform, providing guidance on instrumentation best practices and ensuring data quality. This includes creating and maintaining documentation and training materials.
  • Incident Support: Provide timely and effective support during incidents, leveraging observability data to diagnose and resolve issues quickly. This includes contributing to post-incident reviews and implementing preventative measures.
  • Automation: Automate repetitive tasks and processes related to observability, improving efficiency and reducing manual effort. This may involve scripting, developing tools, or integrating with CI/CD pipelines.
  • Alerting and Monitoring: Develop and maintain effective alerting strategies, ensuring appropriate escalation procedures and minimizing noise. This includes creating dashboards and reports to visualize system health and performance.

Qualifications:

  • Bachelor's degree in computer science or a related field, or equivalent experience.
  • At least 5 years of experience as an SRE or in a similar role, with a focus on observability.
  • Strong understanding of distributed systems and microservices architectures.
  • Experience with monitoring, logging, and tracing tools (e.g., Prometheus, Grafana, Jaeger, Elasticsearch, Fluentd, Datadog, Dynatrace).
  • Proficiency in scripting languages such as Python, Go, or Bash.
  • Strong problem-solving and analytical skills.
  • Excellent communication and collaboration skills.

Bonus Points:

  • Experience with cloud platforms.
  • Experience with infrastructure-as-code tools (e.g., Terraform, Ansible).

Employment Type

Full Time