This is a remote position.
**We are looking for talent based in Mexico, Colombia, and other Latin American countries.**
Job Overview:
We are seeking a highly skilled Azure Data Engineer with extensive experience in Databricks, PySpark, and Spark SQL to join our team. The ideal candidate will be responsible for designing, developing, and maintaining large-scale data solutions on Azure, with a focus on real-time and batch data processing using Databricks.
Requirements
Key Responsibilities:
- Design, build, and maintain scalable data pipelines on Azure Databricks using PySpark and Spark SQL (see the sketch after this list).
- Develop and optimize ETL processes for handling large data sets.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and implement solutions.
- Ensure data quality, performance, and scalability by implementing best practices for coding and data architecture.
- Create and manage data models, databases, and data lakes on Azure.
- Monitor and troubleshoot data pipelines, ensuring high availability and reliability.
- Implement security and compliance best practices in the data pipeline, following Azure standards.
- Work with Azure services such as Azure Data Factory, Azure Synapse, and Azure Blob Storage to orchestrate and manage data workflows.
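To give a concrete sense of the day-to-day work, here is a minimal sketch of the kind of Databricks pipeline this role involves, combining the PySpark DataFrame API with Spark SQL. The storage paths, container names, and column names are hypothetical placeholders, and the Delta write assumes a Databricks runtime where Delta Lake is available.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-daily-etl").getOrCreate()

# Read raw JSON events from a (hypothetical) ADLS Gen2 container.
raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/orders/")

# PySpark DataFrame API: basic cleansing and typing.
orders = (
    raw
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)

# Spark SQL: aggregate daily revenue per customer.
orders.createOrReplaceTempView("orders")
daily = spark.sql("""
    SELECT customer_id,
           DATE(order_ts) AS order_date,
           SUM(amount)    AS daily_revenue
    FROM orders
    GROUP BY customer_id, DATE(order_ts)
""")

# Write a curated Delta table, partitioned for downstream consumers.
(daily.write
      .format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .save("abfss://curated@examplelake.dfs.core.windows.net/daily_revenue/"))
```

In practice a pipeline like this would be scheduled and orchestrated with Azure Data Factory or Databricks Workflows, with monitoring and data-quality checks layered on top.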
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 5+ years of hands-on experience with Azure Databricks.
- Proficiency in PySpark and Spark SQL for building scalable data pipelines.
- Solid understanding of the Azure data ecosystem, including Azure Data Lake, Azure Data Factory, and Azure Synapse Analytics.
- Experience with ETL processes, data modeling, and data architecture.
- Familiarity with cloud security best practices and data governance.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and collaboration skills.
Nice-to-Have:
- Experience with CI/CD pipelines using tools like Azure DevOps.
- Knowledge of Data Warehousing and Big Data technologies.
- Experience with other programming languages like Python or Scala.
- Familiarity with Machine Learning workflows on Databricks (see the sketch below).
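For the machine-learning nice-to-have, the sketch below shows a minimal experiment-tracking workflow with MLflow, which ships with the Databricks ML runtime. The dataset and model are illustrative placeholders, not part of any specific project here.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data standing in for curated feature tables.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Track parameters, metrics, and the model artifact in a single MLflow run.
with mlflow.start_run(run_name="rf-baseline"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")
```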