- Design, implement, and maintain robust, reliable data systems, encompassing extraction and loading of data from a wide variety of sources, data transformation, and system monitoring
- Align data solutions with business/client requirements
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
- Perform database migrations: assessment, planning, and implementation
- Keep up with the latest technologies while also learning established technologies in depth
Requirements
- 6+ years of hands-on data engineering experience
- Expertise with AWS services: S3, Redshift, EMR, Glue, Kinesis, DynamoDB
- Building batch and real-time data pipelines
- Python and SQL coding for data processing and analysis
- Data modeling experience using cloud-based data platforms like Redshift, Snowflake, and Databricks
- Design and develop ETL frameworks
- ETL development using tools like Informatica, Talend, and Fivetran
- Creating reusable data sources and dashboards for self-service analytics
- Experience using Databricks (for Spark workloads) or Snowflake
- Working knowledge of big data processing
- CI/CD setup
- Infrastructure-as-code implementation
- Any one of the AWS Professional Certifications
- Expertise with cloud data services
- Batch and real-time data integration
- Star schema, partitioning, and incremental load setup
- Building data lakes and lakehouses
- Step Functions / Amazon MWAA / Airflow
- Building data pipelines using Spark
- SQL, Python
- Real-time pipelines with tools like Kafka/Kinesis
- Metadata management
- Data encryption