Mandatory Skills: Talend & PySpark, MDM; Snowflake (Preferable)
The Data Engineer will play a crucial role in designing, building, and maintaining scalable data pipelines and systems to support data analytics and business intelligence initiatives. This individual will work closely with data scientists, analysts, and other stakeholders to ensure data is accessible, accurate, and actionable.
Key Responsibilities:
Data Pipeline Development: Design, develop, and implement robust and scalable ETL (Extract, Transform, Load) pipelines to integrate data from various sources into data warehouses or data lakes.
Data Modeling: Develop and maintain data models that support business requirements and enhance data accessibility and quality.
Database Management: Manage and optimize relational and non-relational databases, ensuring high performance, reliability, and security.
Data Quality: Monitor data quality and integrity, implementing data validation and cleansing processes to ensure accuracy and consistency.
Collaboration: Work with data scientists, analysts, and business stakeholders to understand data needs and provide solutions to support data-driven decision-making.
Performance Tuning: Optimize data processing and query performance through indexing, partitioning, and other techniques.
Documentation: Create and maintain comprehensive documentation for data pipelines, data models, and other technical processes.
Troubleshooting: Identify and resolve data-related issues, ensuring minimal disruption to data workflows and business operations.
Compliance: Ensure that data handling practices comply with relevant regulations and industry standards.
Innovation: Stay current with emerging technologies and industry trends, recommending and implementing new tools and practices to enhance data engineering capabilities.
Qualifications:
Education: Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field. Advanced degrees or certifications are a plus.
Experience: Minimum of X years of experience as a Data Engineer or in a similar role, with a proven track record of managing and optimizing data pipelines and databases.
Technical Skills:
Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL).
Experience with big data technologies (e.g., Hadoop, Spark) and data warehousing solutions (e.g., Amazon Redshift, Snowflake).
Familiarity with ETL tools and frameworks (e.g., Apache NiFi, Talend).
Knowledge of programming languages such as Python, Java, or Scala.
Experience with cloud platforms (e.g., AWS, Azure, Google Cloud Platform) and their data services.
Analytical Skills: Strong problem-solving abilities with a keen eye for detail and a commitment to delivering high-quality results.
Communication: Excellent verbal and written communication skills, with the ability to collaborate effectively with cross-functional teams and present technical information to non-technical stakeholders.
Project Management: Ability to manage multiple tasks and projects simultaneously, with strong organizational and time-management skills.
Preferred Qualifications:
Experience with data visualization tools (e.g., Tableau, Power BI).
Knowledge of data governance and data privacy practices.
Familiarity with DevOps practices and tools (e.g., Docker, Kubernetes).