Job Title: Python with PySpark Developer
Candidate must be on our W2
Location: Santa Clara, CA (Hybrid/Onsite)
Experience Required: 4-10 years
Job Type: Full-Time
About Purple Drive Technologies:
At Purple Drive Technologies, we specialize in providing comprehensive information technology services, consulting, and digital solutions tailored for enterprises and system integrators. Headquartered in Irvine, California, with a base in India, we pride ourselves on building effective teams through proven efficiencies.
Required Technical Skills:
Primary Skills:
- Strong proficiency in Python programming with a focus on data processing and analysis.
- Expertise in Apache PySpark for distributed data processing.
- Experience with data pipelines, ETL processes, and handling large datasets in distributed environments.
- Proficiency with data formats like JSON, Avro, Parquet, and ORC.
- Familiarity with data manipulation libraries such as Pandas and NumPy.
- Ability to optimize and troubleshoot Spark applications for performance improvements.
- Understanding of Spark architecture, including RDDs, DataFrames, and Spark SQL.
- Knowledge of AWS or Azure cloud services, especially for deploying data workflows.
Secondary Skills:
- Familiarity with Airflow or similar orchestration tools for data pipeline automation.
- Understanding of CI/CD tools for deployment in data engineering environments.
- Basic knowledge of SQL and relational databases like MySQL or PostgreSQL.
- Experience with Git version control and Agile development methodologies.