Please share suitable resumes for the below requirement to
Job Description:
Title: Observability Software Engineer
Duration: 6 Months
Location: St. Louis, MO (Day 1 onsite)
Rate: $55/hr on C2C
Client: HCL
Note: Ex-Mastercard experience is mandatory
Summary:
As an Observability Software Engineer II, you will be responsible for designing, implementing, and maintaining our observability platform. You'll work closely with cross-functional teams to ensure our systems are transparent, measurable, and reliable. By leveraging your expertise in observability tools and techniques, you will help us gain deep insights into our applications, infrastructure, and user experiences.
Responsibilities:
Design and develop robust observability solutions to monitor, analyze, and troubleshoot distributed systems.
Work with OpenTelemetry (OTel) standards and tools.
Work with application teams to implement self-healing, i.e., alerting that triggers automated remediation (previous experience required).
Implement and configure monitoring, logging, tracing, and alerting systems to ensure comprehensive coverage of our infrastructure and applications.
Collaborate with software engineers to instrument code for telemetry data collection and analysis.
Optimize observability tooling and processes to improve system reliability, performance, and scalability.
Create dashboards, reports, and visualizations to provide actionable insights into system health and performance.
Investigate and resolve incidents by analyzing telemetry data and identifying root causes.
Stay current with industry trends and best practices in observability and recommend improvements to our observability strategy and infrastructure.
Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
12 years of experience as an Observability Engineer or in a similar role in a production environment.
Deep understanding of observability principles, methodologies, and tools such as Prometheus, Grafana, Jaeger, and the ELK stack.
Proficiency in programming/scripting languages such as Java, Python, or Go for automation and tooling development.
Strong knowledge of cloud computing platforms (AWS preferred) and container orchestration systems (e.g. Kubernetes).
Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.
Strong communication skills and the ability to collaborate effectively with cross-functional teams.