Primary skillsets in AWS services, with experience in EC2, S3, Redshift, RDS, AWS Glue/EMR, Python, PySpark, SQL, Airflow, visualization tools, and Databricks.
(Lead Position)
8 Yrs. Experience
Location: PAN India
(Sr. Developer Position)
4 Yrs. Experience
Location: PAN India
Job Description (Lead Position)
Role: Lead position with primary skillsets in AWS services, with experience in EC2, S3, Redshift, RDS, AWS Glue/EMR, Python, PySpark, SQL, Airflow, visualization tools, and Databricks.
Responsibilities:
- Design and implement data modeling, data ingestion, and data processing for various datasets.
- Design, develop, and maintain the ETL framework for various new data sources.
- Migrate existing Talend ETL workflows into the new ETL framework using AWS Glue/EMR with PySpark, and/or build data pipelines using Python.
- Build orchestration workflows using Airflow (see the sketch after this list).
- Develop and execute ad hoc data ingestion to support business analytics.
- Proactively interact with vendors on any questions and report status accordingly.
- Explore and evaluate tools/services to support business requirements.
- Ability to create a data-driven culture and impactful data strategies.
- Aptitude for learning new technologies and solving complex problems.
- Connect with customers to gather requirements and ensure timely delivery.
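For illustration, a minimal sketch of the kind of Airflow orchestration referred to above, assuming the warehouse load already exists as an AWS Glue job; the DAG id, Glue job name, and region are hypothetical placeholders, not project specifics.

```python
# Hypothetical example only: a minimal Airflow DAG that orchestrates an AWS Glue job.
# DAG id, Glue job name, and region are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="daily_sales_ingest",            # placeholder DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Trigger an existing AWS Glue job that loads curated data into the warehouse.
    run_glue_etl = GlueJobOperator(
        task_id="run_glue_etl",
        job_name="sales_to_redshift_etl",   # placeholder Glue job name
        region_name="us-east-1",
    )
```

In practice such a DAG would typically also include data-quality checks, retries, and alerting around the Glue task.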
Qualifications:
- Minimum of a bachelor's degree, preferably in Computer Science, Information Systems, or Information Technology.
- Minimum 8 years of experience on cloud platforms such as AWS, Azure, or GCP.
- Minimum 8 years of experience with Amazon Web Services such as VPC, S3, EC2, Redshift, RDS, EMR, Athena, IAM, Glue, DMS, Data Pipeline, API Gateway, Lambda, etc.
- Minimum 8 years of experience in ETL and data engineering using Python, AWS Glue, AWS EMR/PySpark, and Talend, with Airflow for orchestration.
- Minimum 8 years of experience in SQL, Python, and source control such as Bitbucket, with CI/CD for code deployment.
- Experience with PostgreSQL, SQL Server, MySQL, and Oracle databases.
- Experience with MPP platforms such as AWS Redshift and EMR.
- Experience in distributed programming with Python, Unix scripting, MPP, and RDBMS databases for data integration.
- Experience building distributed, high-performance systems using Spark/PySpark and AWS Glue, and developing applications for loading/streaming data into databases such as Redshift.
- Experience in Agile methodology
- Proven ability to write technical specifications for data extraction and to write good-quality code.
- Experience with big data processing techniques using Sqoop, Spark, and Hive is an additional plus.
- Experience with analytics and visualization tools.
- Design of data solutions on Databricks, including Delta Lake, data warehouses, data marts, and other data solutions to support the organization's analytics needs.
- Should be an individual contributor with experience in the above-mentioned technologies.
- Should be able to lead the offshore team and ensure on-time delivery, code review, and work management among team members.
- Should have experience in customer communication.
Job Description (Sr. Developer Position)
Role: Sr. Developer position with primary skillsets in AWS services, with experience in EC2, S3, Redshift, RDS, AWS Glue/EMR, Python, PySpark, SQL, Airflow, visualization tools, and Databricks. He/she will play a key role in designing, developing, and optimizing data integration solutions using AWS Glue, and will leverage expertise in Python, PySpark, and AWS services to drive the success of critical projects.
Must have:
- 4 years of experience in AWS Glue and proficiency in one or more of the following: Spark, Scala, Python, and/or R, with experience in data manipulation and transformation.
- Performance-test and optimize AWS Glue jobs to meet performance and throughput requirements.
- Implement best practices for AWS Glue development to ensure scalability, reliability, and maintainability.
- Configure AWS Glue to integrate with database interfaces such as Redshift.
- Hands-on experience in SQL and experience with ETL methodology.
- Design, develop, and deploy ETL processes using AWS Glue, Python, and PySpark to extract, transform, and load data from S3 sources, handling various flat files (a minimal sketch follows this list).
- Experience working with AWS services: S3, Athena, EMR, VPC, and EC2.
- Experience with analytics and visualization tools.
- Design of data solutions on Databricks, including Delta Lake, data warehouses, data marts, and other data solutions to support the organization's analytics needs.
- Strong problem-solving skills and the ability to analyse and interpret complex data structures.
- Excellent attention to detail and a commitment to delivering high-quality, accurate data.
- Work on optimizing Spark jobs that process huge volumes of data.
- Monitor and troubleshoot data extraction and ETL processes to ensure data quality and integrity.
- Collaborate with teams to gather requirements, design solutions, and troubleshoot issues related to data integration and transformation.
- Work in a dynamic, fast-paced AWS environment while designing and building client-facing products and solutions.
- Experience communicating with stakeholders, other technical teams, and management to collect requirements and describe software product features, technical designs, and product strategy.
- Working knowledge of distributed systems and a willingness to jump in and learn what is happening in the backend code.
- A solid grasp of fundamental algorithms and applications.
- Provide technical guidance and mentorship to junior team members, fostering a culture of knowledge sharing and continuous learning.
- Good communication skills, both oral and written.
- Engage and develop relationships with key partners and stakeholders to clearly understand the specific business and/or technology challenges, initiatives, and questions to be answered.
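For illustration, a minimal sketch of the kind of AWS Glue PySpark ETL job described in the list above: it reads CSV flat files from S3, applies a simple transformation, and writes Parquet back to S3. Bucket paths and column names are hypothetical placeholders.

```python
# Hypothetical example only: a minimal AWS Glue PySpark job that reads flat files
# (CSV) from S3, applies a simple transformation, and writes Parquet back to S3.
# Bucket paths and column names are placeholders.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read CSV flat files from a source bucket.
orders = spark.read.option("header", "true").csv("s3://example-source-bucket/orders/")

# Simple cleanup: cast the amount column and drop rows where the cast failed.
cleaned = (
    orders.withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("amount").isNotNull())
)

# Write the curated dataset as Parquet for downstream consumers.
cleaned.write.mode("overwrite").parquet("s3://example-curated-bucket/orders/")

job.commit()
```

A production job would typically add schema validation, partitioning, and a load into Redshift (for example via a Glue Redshift connection or a COPY from S3).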