This is a very urgent role.
Title: Lead Data Engineer
Location: Remote; however, a candidate located near Rahway, NJ will have to go onsite.
Duration: 12 months
Lead experience required: 2-3 years
Total experience required: 10 years
Must have:
Databricks: expert level; we need an experienced resource who can provide solutions and give recommendations to the client.
PySpark: super important
SQL: super important
Airflow: good experience
Experience
- Experience in implementation of AWS Data Lake and Data Publication using Databricks, Airflow, and AWS S3
- Experience in Databricks Data Engineering to create Data Lake solutions using AWS services
- Knowledge of Databricks clusters and SQL warehouses
- Experience in Delta and Parquet file handling
- Experience in Data Engineering and Data Pipeline creation on Databricks
- Experience in Data Build Tool (DBT) using Python and SQL
- Extensive experience in SQL, PL/SQL, complex joins, aggregation functions, DBT, Python, data frames, and Spark
- Experience in Airflow for job orchestration, dependency setup, and job scheduling
- Knowledge of Databricks Unity Catalog and consumption patterns
- Knowledge of GitHub and CI/CD pipelines
- AWS infrastructure such as IAM roles, secrets, and S3 buckets
Role
- Responsible for defining technical architecture and application landscape.
- Responsible for authoring SQL and Python scripts on Databricks and DBT (Data Build Tool) to create the data pipelines that build the Operational Data Mart (ODM).
- Responsible for creation of data pipelines that process Delta files into ODM format for downstream data consumption.
- Responsible for identifying data set relationships and join criteria and implementing them in code for ODM model development.
- Responsible for creation of the Delta Lake for the ODM model and setup of consumption patterns using Databricks Unity Catalog.
- Responsible for creation of Airflow DAGs for job orchestration and scheduling of data pipeline jobs.