Who we are:
Fulcrum Digital is an agile, next-generation digital acceleration company providing digital transformation and technology services, from ideation to implementation. These services apply across a variety of industries, including banking & financial services, insurance, retail, higher education, food, health care, and manufacturing.
The Role:
- Plan, manage, and oversee all aspects of the production environment for Java, J2EE, and Spring Boot applications.
- Define strategies for application performance monitoring and optimization in the production environment.
- Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
- Ensure that batch production scheduling and processes are accurate and timely.
- Create and execute queries against big data platforms and relational data tables to identify process issues or to perform mass updates (preferred).
- Perform ad hoc requests from users, such as data research, file manipulation/transfer, and research of process issues.
- Take a holistic approach to problem solving by connecting the dots during a production event across the various technology stacks that make up the platform, in order to optimize mean time to recover.
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
- Support services before they go live through activities such as system design consulting, capacity planning, and launch reviews.
- Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead DevOps automation and best practices.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
- Work with a global team spread across tech hubs in multiple geographies and time zones.
- Share knowledge and explain processes and procedures to others.
Requirements
Skills:
Must Have:
- Linux
- Kubernetes
- ITIL / ITSM
- Application Troubleshooting
- Any monitoring tool (Splunk/Dynatrace preferred)
- Jenkins CI/CD
Good To Have:
- Event framework architecture
- Git (basic) / Bitbucket
- Ansible/Chef (basic)
- Shell scripting (basic)
- SQL
- Groovy scripting / YAML
Benefits
System reliability engineer