Data Engineer (Python, PySpark, Apache Airflow, NoSQL)
Location: Bengaluru, Karnataka, India (Onsite)
Experience: 3–5 years
Responsibilities:
Build, optimize, and maintain scalable ETL pipelines for data ingestion and processing.
Develop and manage workflows using Apache Airflow for scheduling and orchestrating tasks.
Work with distributed computing technologies (PySpark) to handle large-scale datasets.
Design and implement data architectures that scale with growing business needs.
Implement data lake and data warehousing solutions using both structured and unstructured data.
Collaborate with data scientists and analytics teams to ensure data quality and availability.
Optimize existing data models and pipelines for performance and scalability.
Use NoSQL databases (e.g., MongoDB, Cassandra) for large, scalable data storage solutions.
Ensure high data integrity, security, and quality through monitoring and validation processes.
Write clear documentation and maintain data engineering best practices.
Skills & Qualifications:
Strong proficiency in Python, PySpark, and SQL.
Experience working with Apache Airflow for orchestration.
Hands-on experience with distributed computing and big data tools (PySpark, Hadoop).
Familiarity with cloud platforms (AWS, GCP) and tools such as S3, EMR, and Lambda.
Experience with NoSQL databases (e.g., MongoDB, Cassandra) and relational databases.
Strong understanding of data warehousing concepts, ETL processes, and data lake architecture.
Experience with data pipeline monitoring, logging, and alerting.
Strong knowledge of Docker and containerized environments.
Familiarity with DevOps and CI/CD practices for data engineering.
Excellent problem-solving, communication, and teamwork skills.
About the Company:
CuberaTech, founded in 2020, is a data company revolutionizing Big Data Analytics through a data value-share paradigm in which users entrust their data to us. Our deployment of deep learning techniques enables us to harness this data, making us a source of the richest zero-party data. By stitching together all the relevant pieces of data from zero-, first-, and second-party sources, we enable advertisers to define and create custom audiences that maximize programmatic ROAS.