Responsibilities
Design, develop, and maintain data pipelines using Databricks, Spark, and other cloud technologies as needed
Optimize data pipelines for performance, scalability, and reliability
Ensure data quality and integrity throughout the data lifecycle
Collaborate with data scientists, analysts, and other stakeholders to understand and meet their data needs
Troubleshoot and resolve data-related issues, providing root cause analysis and recommendations
Document data pipeline specifications, requirements, and enhancements, and communicate them effectively to the team and management
Create new data validation methods and data analysis tools, and share best practices and learnings with the data engineering community
Implement ETL processes and data warehouse solutions, and ensure compliance with data governance and security policies
Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience
5 years of experience in data engineering, preferably with Databricks and Spark
Proficient in SQL and Python; familiar with Java or Scala
Experience with cloud platforms such as Azure or AWS
Experience with big data technologies such as Kafka, Hadoop, Hive, etc.