Cloud Data Engineer- Spark Databricks

Employer Active

1 Vacancy
Job Location

Dangs (India) - India

Monthly Salary

Not Disclosed

Job Description

Job Title: Cloud Engineer - Spark/Databricks Specialist
Location: Remote
Job Type: Contract
Industry: IT/Cloud Engineering

Job Summary:

We are looking for a highly skilled Cloud Engineer specializing in Apache Spark and Databricks to join our dynamic team. The ideal candidate will have extensive experience with cloud platforms such as AWS, Azure, and GCP, and a deep understanding of data engineering, ETL processes, and cloud-native tools. Your primary responsibility will be to design, develop, and maintain scalable data pipelines using Spark and Databricks while optimizing performance and ensuring data integrity across diverse environments.

Key Responsibilities:

Design and Development:

  • Architect, develop, and maintain scalable ETL pipelines using Databricks, Apache Spark (Scala, Python), and other cloud-native tools such as AWS Glue, Azure Data Factory, and GCP Dataflow.
  • Design and build data lakes and data warehouses on cloud platforms (AWS, Azure, GCP).
  • Implement efficient data ingestion, transformation, and processing workflows with Spark and Databricks.
  • Optimize the performance of ETL processes for faster data processing and lower costs.
  • Develop and manage data pipelines using other ETL tools, such as Informatica and SAP Data Intelligence, as needed.

Data Integration and Management:

  • Integrate structured and unstructured data sources (relational databases, APIs, ERP systems) into the cloud data infrastructure.
  • Ensure data quality, validation, and integrity through rigorous testing.
  • Perform data extraction and integration from SAP or other ERP systems, ensuring seamless data flow.

Performance Optimization:

  • Monitor, troubleshoot, and enhance the performance of Spark/Databricks pipelines.
  • Implement best practices for data governance, security, and compliance across data workflows.

Collaboration and Communication:

  • Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to define data requirements and deliver scalable solutions.
  • Provide technical guidance and recommendations on cloud data engineering processes and tools.

Documentation and Maintenance:

  • Document data engineering solutions, ETL pipelines, and workflows.
  • Maintain and support existing data pipelines, ensuring they operate effectively and align with business goals.

Qualifications:

Education:

  • Bachelor's degree in Computer Science, Information Technology, or a related field. Advanced degrees are a plus.

Experience:

  • 7 years of experience in cloud data engineering or similar roles.
  • Expertise in Apache Spark and Databricks for data processing.
  • Proven experience with cloud platforms such as AWS, Azure, and GCP.
  • Experience with cloud-native ETL tools such as AWS Glue, Azure Data Factory, Kafka, and GCP Dataflow.
  • Hands-on experience with data platforms such as Redshift, Snowflake, Azure Synapse, and BigQuery.
  • Experience extracting data from SAP or other ERP systems is preferred.
  • Strong programming skills in Python, Scala, or Java.
  • Proficiency in SQL and query optimization techniques.
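
On the query-optimization point, a small self-contained sketch using Python's built-in sqlite3 module (the table and index names are made up for illustration) shows the kind of plan-level reasoning involved: the same query goes from a full table scan to an index lookup once a covering index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"c{i % 100}", float(i)) for i in range(1000)],
)

query = "SELECT SUM(amount) FROM orders WHERE customer = 'c7'"

# Without an index, filtering on `customer` scans the whole table.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# A covering index on (customer, amount) lets SQLite answer the query
# from the index alone, without touching the table rows.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer, amount)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

total = conn.execute(query).fetchone()[0]

# The last column of each plan row is a human-readable description;
# exact wording varies between SQLite versions.
print(plan_before[0][-1])
print(plan_after[0][-1])
```

The result of the query is of course unchanged; only the access path differs, which is the essence of query tuning on warehouse platforms as well.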

Skills:

  • In-depth knowledge of Spark/Scala for high-performance data processing.
  • Strong understanding of data modeling, ETL/ELT processes, and data warehousing concepts.
  • Familiarity with data governance, security, and compliance best practices.
  • Excellent problem-solving, communication, and collaboration skills.

Preferred Qualifications:

  • Certifications in cloud platforms (e.g., AWS Certified Data Analytics, Google Professional Data Engineer, Azure Data Engineer Associate).
  • Experience with CI/CD pipelines and DevOps practices for data engineering.
  • Exposure to Apache Hadoop, Kafka, or other data frameworks is a plus.


Employment Type

Full Time
