Skill Set:
Cloud Azure DWH, Databricks, Kubernetes
Skill to Evaluate:
Cloud Azure DWH, Databricks, Kubernetes, Hadoop, Docker, DataOps, Testing, Monitoring, Data Engineering, Products, Data Modelling, RDBMS
Location:
Dublin
Role:
USA Data Engineer & ML/DevOps
As a Principal Data Engineer, your responsibilities will include:
- Design and build data pipelines to process terabytes of data
- Orchestrate data tasks in Airflow to run on Kubernetes/Hadoop for the ingestion, processing, and cleaning of data.
- Create Docker images for various applications and deploy them on Kubernetes
- Design and build best-in-class processes to clean and standardize data.
- Troubleshoot production issues in our Elastic Environment
- Tune and optimize data processes.
- Advance the team's DataOps culture (CI/CD, Orchestration, Testing, Monitoring) and build out standard development patterns.
- Drive innovation by testing new technology and approaches to continually advance the capability of the data engineering function.
- Drive efficiencies in current engineering processes via standardization and migration of existing on-premises processes to the cloud.
Ensuring Data Quality:
- Build best-in-class data quality monitoring that ensures that all data products exceed customer expectations.
Required Qualifications:
- Bachelor's degree in Computer Science or similar.
- Good understanding of data modelling techniques, e.g. Data Vault, Kimball, star schema
- Excellent understanding of column-store RDBMS (Databricks, Snowflake, Redshift, Vertica, ClickHouse)
- Good experience handling real-time, near-real-time, and batch data ingestion
- Hands-on experience with the following technologies:
  o Developing processes in Spark
  o Writing complex SQL queries
  o Building ETL/data pipelines
  o Exposure to Kubernetes and Linux containers (e.g. Docker)
  o Related/complementary open-source software platforms and languages (e.g. Scala, Python, Java, Linux)
- Proven track record of designing effective data strategies and leveraging modern data architectures that resulted in business value
- Experience building cloud-native data pipelines on AWS, Azure, or GCP, following best practices in cloud deployments
- Strong DataOps experience (CI/CD, Orchestration, Testing, Monitoring)
- Strong experience leading and developing data engineering teams
- Demonstrated effective interpersonal, influence, collaboration, and listening skills
- Strong stakeholder management skills
- Excellent time management, organizational, and prioritization skills, with the ability to balance multiple priorities.
Preferred Qualifications:
- Experience with data tokenization and the associated techniques and tools, e.g. Datavant, Protegrity
- Experience with Azure Data Factory, Databricks, and Snowflake
- Experience with Apache Spark and the related Big Data stack and technologies (PySpark, Scala)
- Experience working with Apache Kafka, building appropriate producer/consumer apps
- Experience working with Kubernetes and Docker, and knowledge of cloud infrastructure automation and management (e.g. Terraform)
- Experience working on projects using Agile/Scrum methodologies
- Familiarity with production-quality ML and/or AI model development and deployment.
- Healthcare industry knowledge and experience, with exposure to EDI, HIPAA, HL7, and FHIR integration standards
ML, Data Engineer, DevOps