Position Overview
We are looking for an experienced AWS Data Engineer to design, develop, and maintain data solutions built on core AWS services. The ideal candidate will have hands-on experience with services such as Amazon S3, Amazon Redshift, AWS Glue, and Amazon DynamoDB, and will build scalable, efficient, and secure data pipelines and architectures. The role also requires strong expertise in PySpark, SQL, and workflow orchestration tools such as Apache Airflow.
Key Responsibilities
1. Data Pipeline Development
Develop and manage ETL/ELT workflows using AWS Glue and PySpark to process large datasets.
Automate data workflows using Apache Airflow and other orchestration tools.
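To give a concrete picture of this work, here is a minimal sketch of a Glue PySpark job of the kind described above; the Data Catalog database, table, and S3 bucket names are hypothetical placeholders.

```python
import sys

from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job bootstrap: resolve the job name and build the Spark/Glue contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw events registered in the Glue Data Catalog (hypothetical database/table).
raw = glue_context.create_dynamic_frame.from_catalog(
    database="raw_events", table_name="clickstream"
)

# Basic cleanup in plain PySpark: drop rows missing a user_id and keep recent events.
df = raw.toDF().dropna(subset=["user_id"]).filter("event_date >= '2024-01-01'")

# Write the curated result back to S3 as date-partitioned Parquet.
(df.write.mode("overwrite")
   .partitionBy("event_date")
   .parquet("s3://example-curated-bucket/clickstream/"))

job.commit()
```

In practice, a job like this would typically be scheduled and monitored from an Airflow DAG, for example with the GlueJobOperator from the Amazon provider package.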
2. Data Storage and Management
Architect and manage data storage in Amazon S3, ensuring performance, cost-efficiency, and security.
Create and optimize Amazon Redshift clusters for data warehousing and analytics workloads.
Design scalable NoSQL solutions using Amazon DynamoDB for real-time data needs.
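As a small illustration of the cost-efficiency side of this responsibility, the sketch below applies a lifecycle policy to a hypothetical raw-data bucket so that older objects are tiered down and eventually expired.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical lifecycle rule: move raw objects to Infrequent Access after 30 days
# and expire them after a year to keep storage costs under control.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-raw-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-and-expire-raw-data",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```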
3. Compute and Serverless
Build and deploy serverless solutions using AWS Lambda for event-driven data processing.
Configure and manage Amazon EC2 instances for custom data processing tasks.
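For illustration, here is a minimal event-driven Lambda handler of the kind this responsibility involves; the bucket layout and the "validated/" prefix are hypothetical.

```python
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")


def lambda_handler(event, context):
    """Triggered by S3 ObjectCreated events; skips empty uploads and copies
    good files into a hypothetical 'validated/' prefix for downstream jobs."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        head = s3.head_object(Bucket=bucket, Key=key)
        if head["ContentLength"] == 0:
            continue  # ignore empty uploads rather than propagating bad data

        s3.copy_object(
            Bucket=bucket,
            Key=f"validated/{key}",
            CopySource={"Bucket": bucket, "Key": key},
        )
    return {"statusCode": 200, "body": json.dumps("processed")}
```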
4. Security and Monitoring
Implement fine-grained access controls with AWS IAM to ensure data security.
Set up monitoring, logging, and alerting using Amazon CloudWatch for proactive system health management.
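On the monitoring side, here is a small sketch of the kind of alarm this involves; the function name, SNS topic ARN, and account ID are hypothetical placeholders.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical alarm: notify the data team if the ingestion Lambda reports any
# errors within a 5-minute window.
cloudwatch.put_metric_alarm(
    AlarmName="ingest-lambda-errors",
    Namespace="AWS/Lambda",
    MetricName="Errors",
    Dimensions=[{"Name": "FunctionName", "Value": "ingest-clickstream"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:data-alerts"],
)
```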
5. Data Integration and Transformation
Write efficient SQL queries for data transformations and analytics.
Integrate and process structured and unstructured data from multiple sources.
Design data models and implement them in Redshift or DynamoDB for optimized query performance.
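To illustrate the SQL and data modeling side, the sketch below rebuilds a hypothetical daily revenue summary in Redshift through the Redshift Data API; the cluster, database, user, and table names are placeholders.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Hypothetical nightly transformation: rebuild a daily revenue summary table
# with a single set-based SQL statement.
SUMMARY_SQL = """
CREATE TABLE analytics.daily_revenue AS
SELECT order_date,
       region,
       SUM(order_total) AS revenue,
       COUNT(*)         AS orders
FROM staging.orders
GROUP BY order_date, region;
"""

response = redshift_data.execute_statement(
    ClusterIdentifier="analytics-cluster",  # hypothetical cluster name
    Database="warehouse",
    DbUser="etl_user",
    Sql=SUMMARY_SQL,
)
print("Statement id:", response["Id"])
```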
6. Collaboration and Optimization
Collaborate with data scientists, analysts, and stakeholders to gather requirements and deliver solutions.
Continuously optimize data processing workflows to improve performance and reduce costs.