A Cloud Operations/System Administrator typically has a wide range of responsibilities, which can include:
Support application development teams: Implement service requests from application development teams, supporting them on all activities related to deployment and environment configuration, and implement best practices for deployment, monitoring, and logging of legacy applications.
Monitoring and Incident Response: Monitor and maintain the availability and performance of applications hosted on IIS servers running on Windows virtual machines, using Splunk (or similar technologies) for logging, application dashboarding, and alerting.
Performance Optimization: Identify bottlenecks and optimize workloads for efficiency.
Backup and Disaster Recovery: Setting up backup strategies and disaster recovery plans.
Patch Management: Perform regular maintenance tasks such as patching, upgrades, and capacity planning to ensure the stability and security of the infrastructure.
Capacity Planning: Estimating future resource needs and planning for scalability.
Resource Management and automation: Develop and maintain automation scripts and tools to streamline operational processes and improve efficiency.
Automation: Develop and maintain automation scripts and tools to streamline operations such as automated deployment, scaling, and monitoring.
Security: Implement and manage security measures to protect cloud environments, including patch management, compliance, and vulnerability assessments.
Expertise in cloud platforms; Azure is preferred, others are a plus.
Proficiency in containerization technologies such as Docker and container orchestration platforms like Kubernetes.
Familiarity with infrastructure as code (IaC) using Terraform.
Demonstrated experience with CI/CD pipelines and automation tools such as Jenkins, GitLab CI/CD, CircleCI, or Azure DevOps.
In-depth knowledge of monitoring and observability tools such as Prometheus, Grafana, the ELK stack, or Splunk.
Experience with Elasticsearch index management and troubleshooting.
Experience managing Kafka message broker systems, including configuration, optimization, and monitoring.
Scripting and programming skills, with proficiency in languages such as Python, Shell, Java, or .NET.
Troubleshoot issues with developers and provide guidance on infrastructure-related matters.
Strong problem-solving and troubleshooting skills, with the ability to diagnose and resolve complex issues in a production environment.
Excellent communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
Implement High Availability (HA) and Disaster Recovery (DR) solutions to maintain business data integrity.
Fluent in English (Portuguese is a plus).
Full Time