We are seeking a highly skilled and experienced Lead Data Engineer (3 to 8 years of experience) to join our dynamic team. As a Lead Data Engineer, you will play a crucial role in designing, developing, and maintaining our data infrastructure. You will be responsible for ensuring the efficient and reliable collection, storage, and transformation of large-scale data to support business intelligence, analytics, and data-driven decision-making.
Key Responsibilities:
- Data Architecture & Design:
- Lead the design and implementation of robust data architectures that support data warehousing (DWH), data integration, and analytics platforms.
- Develop and maintain ETL (Extract, Transform, Load) pipelines to ensure the efficient processing of large datasets.
- ETL Development:
- Design, develop, and optimize ETL processes using tools such as Informatica PowerCenter, Intelligent Data Management Cloud (IDMC), or custom Python scripts.
- Implement data transformation and cleansing processes to ensure data quality and consistency across the enterprise (see the PySpark sketch after this list for the flavor of such a step).
- Data Warehouse Development:
- Build and maintain scalable data warehouse solutions using Snowflake, Databricks, Redshift, or similar technologies.
- Ensure efficient storage, retrieval, and processing of structured and semi-structured data.
- Big Data & Cloud Technologies:
- Utilize AWS Glue and PySpark for large-scale data processing and transformation.
- Implement and manage data pipelines using Apache Airflow for orchestration and scheduling (a minimal illustrative sketch follows this list).
- Leverage cloud platforms (AWS, GCP) for data storage, processing, and analytics.
- Data Management & Governance:
- Establish and enforce data governance and security best practices.
- Ensure data integrity, accuracy, and availability across all data platforms.
- Implement monitoring and alerting systems to ensure data pipeline reliability.
- Collaboration & Leadership:
- Work closely with data stewards, analysts, and business stakeholders to understand data requirements and deliver solutions that meet business needs.
- Mentor and guide junior data engineers, fostering a culture of continuous learning and development within the team.
- Lead data-related projects from inception to delivery, ensuring alignment with business objectives and timelines.
- Database Management:
- Design and manage relational databases (RDBMS) to support transactional and analytical workloads.
- Optimize SQL queries for performance and scalability across various database platforms.
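For concreteness, here is a minimal sketch of the kind of Airflow orchestration referenced above. The DAG id, schedule, and task logic are illustrative assumptions, not details of this role's actual pipelines:

```python
# Minimal Airflow DAG sketch (Airflow 2.x API); all names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull raw records from a source system into staging.
    print("extracting raw data")


def transform():
    # Placeholder: cleanse and conform the staged data.
    print("transforming staged data")


def load():
    # Placeholder: load conformed data into the warehouse.
    print("loading into the warehouse")


with DAG(
    dag_id="daily_sales_etl",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load  # linear extract -> transform -> load dependency
```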
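Likewise, a minimal PySpark sketch of the transformation and cleansing work described under ETL Development. The bucket paths and column names are assumptions made purely for illustration:

```python
# Minimal PySpark cleansing sketch; paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("customer_cleanse").getOrCreate()

# Read raw semi-structured data from the lake (hypothetical location).
raw = spark.read.json("s3://example-bucket/raw/customers/")

cleansed = (
    raw.dropDuplicates(["customer_id"])                       # remove duplicate keys
       .withColumn("email", F.lower(F.trim(F.col("email"))))  # normalize casing/whitespace
       .filter(F.col("customer_id").isNotNull())              # enforce a basic quality rule
)

# Write partitioned Parquet back to the lake for downstream warehouse loads.
cleansed.write.mode("overwrite").partitionBy("signup_date").parquet(
    "s3://example-bucket/clean/customers/"
)
```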
Required Skills & Qualifications:
- Education: Bachelor's or Master's degree in Computer Science, Information Systems, Engineering, or a related field.
- Experience:
- Minimum of 4 years of experience in data engineering, ETL, and data warehouse development.
- Proven experience with ETL tools such as Informatica PowerCenter or IDMC.
- Strong proficiency in Python and PySpark for data processing.
- Experience with cloud-based data platforms such as AWS Glue, Snowflake, Databricks, or Redshift.
- Hands-on experience with SQL and RDBMS platforms (e.g., Oracle, MySQL, PostgreSQL).
- Familiarity with data orchestration tools like Apache Airflow.
- Technical Skills:
- Advanced knowledge of data warehousing concepts and best practices.
- Strong understanding of data modeling, schema design, and data governance.
- Proficiency in designing and implementing scalable ETL pipelines.
- Experience with cloud infrastructure (AWS, GCP) for data storage and processing.
- Soft Skills:
- Excellent communication and collaboration skills.
- Ability to lead and mentor a team of engineers.
- Strong problem-solving and analytical thinking abilities.
- Ability to manage multiple projects and prioritize tasks effectively.
Preferred Qualifications:
- Experience with machine learning workflows and data science tools.
- Certification in AWS, Snowflake, Databricks, or other relevant data engineering technologies.
- Experience with Agile methodologies and DevOps practices.