This role is a Senior Platform Operations Engineer.
Primary Responsibilities:
- Automate repeatable and frequent operational processes and procedures
- Monitor Key Performance Indicators (KPIs) for website and applications
- Remediate all Security Vulnerabilities related to the platforms
- Provide Tier 3 production 24x7 on-call support (rotation)
- Perform platform upgrades using established automated procedures
- Participate in Change, Incident, and Problem Management, including Root Cause Analysis
- Improve and document operational and troubleshooting procedures
- Administer Enterprise Tools on Linux and Windows Platforms.
- Administer Splunk and enable log parsing and shipping to the centralized service
- Write scripts to automate common system administration tasks and backups.
- Onboard new applications / microservices to our Service Deployment Pipeline, which is powered by Jenkins, Ansible, Docker, and Rundeck
- Troubleshoot Docker builds as they relate to the Service Deployment Pipeline
- Define and practice User Access Control processes.
- Construct and maintain Environment Management principles.
- Work with Operations and Development to keep dev / test environments in sync, and push to codify environment configurations using Ansible.
- Integrate tooling platforms using APIs, configurations, etc.
- Manage SSH key shares.
- Provide Stash, SVN, Jenkins, Artifactory, Sonar, Jira, Confluence, and Splunk expertise and support.
- Document general DevOps procedures and foster a collaborative culture.
- Produce how-to documentation
- Analyze continuous integration needs for applications and generate CI solutions tailored to each application.
- Assist development in troubleshooting failed builds, conducting root cause analysis in close collaboration with the development team.
- Provide technical research on tool sets.
- Assist with business cases for new tools, etc.
Python, Bash, RegEx, Ansible
Required Qualifications:
- 5 years of hands-on production support experience on bare-metal servers and containers.
- A quick learner and self-starter, eager to learn new technologies, with an automation mindset.
- 4 years of experience with Ansible, Bash, Python, Perl, or a similar scripting language
- 4 years of experience with modern web and mobile application environments (TCP/IP, HTTP, DNS, routing, load balancing, CDNs, WAF, security, etc.)
- 4 years of experience with Docker and Kubernetes.
- Experience with Ansible, SaltStack, or a similar configuration management tool
- Experience with Stash/Bitbucket or a similar code repository tool
- Experience with web application technologies, including HTML, JavaScript, Node.js, Nginx, and React
- Working knowledge of incident, problem, and change management best practices
Desired Qualifications:
- Large-scale eCommerce industry experience
- Experience implementing and managing procedures in a technical operations environment, including managing and reporting on quality and operational metrics
- Experience with APIs, service-oriented architecture (SOA), and microservices
- Experience configuring Nagios or a similar monitoring tool
- Experience with Datadog or a similar Application Performance Monitoring (APM) tool
- Experience with Jenkins, GitLab, or a similar CI/CD tool
- 4 years of Elasticsearch management: designing, deploying, and maintaining Elasticsearch clusters to ensure optimal performance and scalability.
- Preferred candidate is passionate about the hospitality industry