Job Title: SRE Lead
Location: San Leandro, CA, USA (Onsite)
Duration: Long Term Contract
Job Description:
- 10 years of software engineering experience or equivalent, demonstrated through one or a combination of the following: work experience, training, military experience, education
- 10 years of experience in production support/Site Reliability Engineering teams with a continued focus on improving platform health
- Familiar with Agile or other rapid application development practices
- Hands-on expertise with automated testing, process automation, and building dashboards using APM tools.
- Experience with distributed (multi-tiered) systems, algorithms, relational databases, and NoSQL databases.
- Knowledge of and exposure to caching tools (Redis, Memcached) or messaging tools such as MQ and Kafka.
- Must have working knowledge of APM tools such as Splunk, GCL, ELK, Grafana, Prometheus, etc.
- Able to create dashboards using GCL/Splunk/ELK and set up alerts.
- Working knowledge of CI/CD is a plus: source control such as Git, continuous integration with Jenkins/UCD, release, etc.
- Ability to work with engineering teams across the ecosystem, such as Security, Networking, and Infrastructure, on challenges that can impact platform health and resiliency.
- Shell scripting and DevOps tools like Ansible, with good knowledge of YAML for writing playbooks.
- Experience with distributed storage technologies like NFS, as well as dynamic resource management frameworks such as PCF, Kubernetes/OpenShift, AWS, or Azure.
- Tech Stack: Java/J2EE (Spring, Spring Boot), Python, shell scripting, Kafka, Oracle, MongoDB, etc.
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
- Bachelor's degree in computer science, computer science engineering, or related experience required.