Job Title: DevOps SRE Engineer
Location: NYC, NY
Contract: 6+ months
Description:
Bachelor's degree or equivalent in Computer Science, Engineering, or a related field, or comparable experience.
Proven experience in IT, application development, or DevOps/SRE (Site Reliability Engineering), including excellent knowledge of networking, computing, and storage.
Background in software development, software validation, or systems engineering.
Industry certification in SRE, cloud-native, or cloud services/solutions preferred.
Experience developing logging, monitoring, and observability toolsets and dashboards.
Must have hands-on experience with Loki, Jaeger, or similar tools (cloud-native landscape).
Strong preference for experience with Prometheus and Grafana.
Preference for some experience with Splunk and AppDynamics.
Experience with other cloud-native observability tools, as well as OpenTelemetry, Fluent Bit, Pushgateway, Alertmanager, Cortex, etc., would be a plus.
Strong experience with scripting languages, e.g. Python, Go, shell script.
Required Qualification:
Strong experience installing and configuring Prometheus and Grafana at enterprise scale.
Experience setting up mTLS between Prometheus and Grafana.
Develop observability and monitoring for applications and infrastructure, including analytics, dashboards, and visualization for SRE and Chaos Engineering.
Interface with application and infrastructure teams to develop and configure appropriate metrics, logging, tracing, and related diagnostics and monitoring capabilities.
Work with Performance Engineering, Chaos Engineering, SRE, DevOps, and Run-Ops teams to implement the required observability toolset and configurations.
Design and implement reports and analytics for application, platform, and infrastructure observability.
Provide thought leadership and training to the teams in the area of observability and its use for performance, chaos, site reliability, and platform engineering.
Hands-on experience with observability analytics; log aggregation, management, and analytics; and dashboard configuration.
Experience with related cloud technologies on the AWS, GCP, or Azure platforms (for example: AWS CloudWatch and X-Ray, Azure Monitor, GCP Cloud Operations/Stackdriver).
Experience with IaC tools (incl. Terraform, CloudFormation, ARM, Bicep), scripting languages, and Ansible.
Full Time