Job Role: Data Engineer
Location: Pune, GGN, Bangalore, Chennai
Exp: 4-8 yrs
NP: Immediate joiners and candidates available by the 1st week of October
Work Mode: Hybrid
Mandatory Skills: PySpark, AWS, Python
Key Skills:
- Design, develop, and maintain efficient and scalable solutions using PySpark
- Ensure data quality and integrity by implementing robust testing, validation, and cleansing processes
- Integrate data from various sources, including databases, APIs, and external datasets
- Optimize and tune PySpark jobs for performance and reliability
- Document data engineering processes, workflows, and best practices
- Strong understanding of databases, data modeling, and ETL tools and processes
- Strong programming skills in Python and proficiency with PySpark and SQL
- Experience with relational databases, Spark, AWS, and Python
- Excellent communication and collaboration skills
Key Responsibilities:
- Design and Development: Create, develop, and maintain robust solutions using PySpark to handle large-scale data processing.
- Data Quality Assurance: Implement thorough testing, validation, and cleansing processes to ensure data quality and integrity.
- Data Integration: Integrate data from diverse sources, including databases, APIs, and external datasets, to create unified data solutions.
- Performance Optimization: Optimize and tune PySpark jobs for maximum performance and reliability.
- Documentation: Document data engineering processes, workflows, and best practices to enhance team collaboration and knowledge sharing.
- Database Management: Apply a strong understanding of databases, data modeling, and ETL processes to support data architecture needs.
- Programming Expertise: Leverage programming skills in Python and proficiency with SQL and PySpark for effective data manipulation and analysis.
- Collaboration: Work closely with cross-functional teams to understand data requirements and deliver solutions that meet business needs.
Tags: AWS, Database Management, API, Python, Data Modeling, SQL, Databases, PySpark