
PySpark Developer


Experience

3-8years

Job Location

Mumbai - India

Salary

Not Disclosed


Vacancy

1 Vacancy

Job Description

Do you love a career where you Experience, Grow & Contribute at the same time, while earning at least 10% above the market? If so, we are excited to have bumped into you.


We are an IT Solutions Integrator/Consulting Firm helping our clients hire the right professional for an exciting long-term project. Here are a few details.



Requirements

We are seeking an experienced PySpark Developer to join our data engineering team. The ideal candidate will have expertise in Apache Spark and Python programming, focusing on building scalable, high-performance data processing pipelines. As a PySpark Developer, you will collaborate with cross-functional teams to design, build, and deploy big data solutions that drive business insights and analytics.

Key Responsibilities:

  • Develop, test, and maintain large-scale data processing systems using PySpark and other big data technologies.
  • Design and implement data pipelines to extract, transform, and load data from various data sources, ensuring scalability and reliability.
  • Work with data scientists, data analysts, and other stakeholders to understand requirements and provide solutions that meet business needs.
  • Optimize and tune PySpark applications for efficient performance, including memory management, processing time, and resource utilization.
  • Integrate data from multiple sources, managing schemas, data transformations, and data quality.
  • Participate in code reviews, design discussions, and performance tuning sessions to ensure high-quality deliverables.
  • Document processes, data flows, and other key technical aspects of the developed solutions.

Required Skills and Experience:

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 2+ years of hands-on experience in PySpark, Spark SQL, and Spark DataFrames.
  • Proficiency in Python programming, with experience in data manipulation and processing.
  • Strong knowledge of Apache Spark architecture and experience working with large datasets in a distributed environment.
  • Experience with SQL and relational databases, including query optimization.
  • Familiarity with ETL frameworks and data processing tools (e.g., Hadoop, Hive, Kafka).
  • Experience in cloud platforms such as AWS, Azure, or Google Cloud and their respective big data services.
  • Understanding of data lake and data warehouse concepts and best practices.
  • Knowledge of data partitioning, caching, and other optimization techniques in Spark.
  • Strong problem-solving and debugging skills, with attention to detail and accuracy.

Preferred Qualifications:

  • Experience with data orchestration tools (e.g., Apache Airflow).
  • Familiarity with NoSQL databases (e.g., Cassandra, MongoDB).
  • Knowledge of DevOps practices, including CI/CD and containerization tools (e.g., Docker, Kubernetes).
  • Experience with machine learning frameworks integrated with Spark (e.g., MLlib).

Soft Skills:

  • Excellent communication skills and ability to work collaboratively in a team environment.
  • Strong analytical skills, with the ability to translate complex business requirements into scalable technical solutions.
  • Ability to manage multiple projects, prioritize tasks, and adapt in a fast-paced environment.



Employment Type

Full Time
