Key Responsibilities
Kubernetes Cluster Management & Orchestration
- Architecture & Design: Lead the design, deployment, and scaling of containerized applications within Kubernetes clusters to achieve high availability, fault tolerance, and scalability.
- Advanced Configuration: Develop and maintain Helm charts, YAML configurations, and custom Kubernetes resources for efficient application deployment and scaling.
- Resource Optimization: Implement resource allocation strategies and optimize Kubernetes clusters to meet dynamic demand, balancing workloads across multiple environments.
- Security & Compliance: Oversee security best practices for Kubernetes, including implementing role-based access control (RBAC), securing secrets, and maintaining secure namespaces in line with industry standards.
Airflow Workflow Automation & Optimization
- Advanced Workflow Design: Architect complex, dynamic workflows in Apache Airflow, using Directed Acyclic Graphs (DAGs) to automate data pipelines (ETL, reporting, and machine learning).
- Scalable Automation: Implement advanced scheduling, monitoring, and error-handling strategies to ensure the reliable execution of workflows, including leveraging Airflow's KubernetesExecutor and CeleryExecutor for distributed task execution.
- End-to-End Automation: Design and manage the lifecycle of data workflows from ingestion to processing, ensuring seamless data pipeline orchestration with minimal manual intervention.
Kubernetes & Airflow Integration
- Dynamic Scaling & Load Management: Seamlessly integrate Airflow and Kubernetes to allow for dynamic scaling of workflows, adjusting to varying computational loads and ensuring optimal performance for complex pipelines.
- Infrastructure as Code: Utilize tools such as Terraform or Ansible to automate infrastructure provisioning and configuration, streamlining the setup and management of Kubernetes clusters and Airflow instances.
Monitoring & Performance Tuning
- Operational Excellence: Lead efforts to implement comprehensive monitoring solutions using tools like Prometheus, Grafana, and the ELK stack to provide continuous health checks, performance monitoring, and resource utilization tracking across Kubernetes and Airflow environments.
- Optimization: Continuously optimize the operational performance of both Kubernetes and Airflow environments, making data-driven decisions to improve resource usage, reduce overhead, and maximize uptime.
Security & Governance
- Advanced Security Practices: Ensure the implementation of robust security controls, including encryption at rest and in transit, Airflow's authentication mechanisms, and hardening of Kubernetes clusters against vulnerabilities.
- Governance & Best Practices: Drive adherence to best practices for governance, ensuring compliance with industry standards and organizational policies.
Cross-Functional Collaboration & Mentorship
- Leadership & Guidance: Mentor junior and mid-level engineers in best practices for Kubernetes and Airflow, fostering a culture of continuous learning and improvement.
- Collaboration: Work closely with data engineers, DevOps teams, and software developers to align workflows with business requirements, ensuring the seamless operation of integrated solutions across the tech stack.
Required Qualifications
- Extensive Kubernetes Expertise:
  - Proven experience in the design, deployment, and optimization of large-scale Kubernetes clusters, including multi-cloud environments.
  - Hands-on experience with Helm charts, advanced YAML configurations, and Kubernetes networking, scaling, and storage strategies.
  - Familiarity with Kubernetes-native solutions for monitoring, logging, and security management.
- Airflow Mastery:
  - Expertise in developing and managing complex Apache Airflow DAGs to automate data pipelines, including ETL, reporting, and machine learning workflows.
  - In-depth experience with Airflow's KubernetesExecutor and CeleryExecutor for distributing tasks across multiple nodes.
  - Proven track record in scaling Airflow in production environments and optimizing workflow performance.
- Cloud Platform Proficiency:
  - Extensive experience with Kubernetes in cloud environments such as AWS (EKS), Azure (AKS), or Google Cloud (GKE).
  - Strong understanding of integrating Airflow with cloud services for end-to-end automation.
- Programming & Scripting Skills:
  - Proficiency in Python (especially for Airflow DAGs) and Bash scripting to support workflow automation and infrastructure management.
  - Experience with Infrastructure as Code (IaC) tools such as Terraform or Ansible.
- Advanced Monitoring & Security:
  - Deep knowledge of Prometheus, Grafana, and the ELK stack for monitoring Kubernetes and Airflow performance.
  - Expertise in Kubernetes and Airflow security best practices, including encryption, RBAC, and access control.
Preferred Qualifications
- CI/CD Expertise: Experience in building and maintaining CI/CD pipelines with tools such as Jenkins, GitLab CI, or similar to automate the deployment of Kubernetes clusters and Airflow workflows.
- DevOps Methodologies: Understanding of modern DevOps practices, including containerization, orchestration, and continuous automation across the development lifecycle.
- Database Knowledge: Experience with relational databases such as PostgreSQL or MySQL for managing Airflow metadata and supporting data pipeline operations.
- Leadership & Mentorship: Strong communication skills and a proven ability to mentor and guide junior engineers, fostering a collaborative and high-performing team culture.