Key Responsibilities:
Develop and maintain data pipelines to support business and analytical needs.
Work with structured and unstructured data to ensure high data quality and availability.
Build scalable and efficient data workflows using SQL, Python, and Java/Scala.
Manage and process large datasets on AWS EMR and other cloud platforms.
Develop and maintain batch and streaming data processing jobs using tools like Spark (preferred).
Deploy, monitor, and troubleshoot workflows on orchestration platforms like Airflow (preferred).
Ensure compliance with data governance and security standards.
Requirements
Required Skills:
Strong proficiency in SQL for data extraction, transformation, and reporting.
Strong understanding of ETL pipelines.
Strong programming skills in Java/Scala and Python.
Strong problem-solving and analytical skills.
Experience with Apache Spark for large-scale data processing.
Preferred Skills:
Hands-on experience with AWS EMR or similar distributed data processing platforms.
Familiarity with Airflow for workflow orchestration.
Knowledge of AWS ecosystem services such as S3, Redshift, and Athena.
Experience with version control systems like Git.
Understanding of distributed systems and big data processing.
Knowledge of data lake and data warehouse concepts.
Familiarity with Agile methodologies and team collaboration tools like JIRA.
Benefits
What We Offer:
Opportunity to work on cutting-edge data engineering challenges.
A collaborative and inclusive work environment.
Professional growth and upskilling opportunities.
Competitive salary and benefits package.