Data Engineer PySpark

Employer Active

1 Vacancy

Job Location

Chennai - India

Monthly Salary

Not Disclosed

Job Description

We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will collaborate closely with our Data Scientists to develop and deploy machine learning models. Proficiency in the skills listed below will be crucial for building and maintaining pipelines for training and inference datasets.

Responsibilities: 

Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines.

Use PySpark for data processing, transformation, and preparation for model training.

Leverage AWS EMR and S3 for scalable and efficient data storage and processing. 

Implement and manage ETL workflows using StreamSets for data ingestion and transformation.

Design and construct pipelines to deliver high-quality training and inference datasets.

Collaborate with cross-functional teams to ensure smooth deployment and real-time or near-real-time inferencing capabilities.

Optimize and fine-tune pipelines for performance, scalability, and reliability.

Ensure IAM policies and permissions are appropriately configured for secure data access and management. 

Apply knowledge of Spark architecture to optimize Spark jobs for scalable data processing.

Requirements: 

Mandatory

Proficiency in advanced SQL (window functions), Spark architecture, PySpark or Scala with Spark, and Hadoop.

Proven expertise in designing and deploying data pipelines.

Strong problem-solving skills and the ability to work effectively in a collaborative team environment.

Excellent communication skills and the ability to explain technical concepts to non-technical stakeholders.

Desirable

Hands-on experience with Airflow, S3, and StreamSets or similar ETL tools (can be trained locally).

Understanding of real-time or near-real-time inferencing architectures.

Basic knowledge of Kafka, AWS IAM, AWS EMR, and Snowflake.

Total Experience Expected: 6 to 8 years


Qualifications:

BE


Additional Information:

At our organization, we are committed to fighting all forms of discrimination. We foster a work environment that is inclusive and respectful of all differences.

All of our positions are open to people with disabilities.


Remote Work:

No


Employment Type:

Full-time
