- Design, develop, and maintain scalable data pipelines and ETL processes on Google Cloud Platform (GCP).
- Implement and optimize data storage solutions using BigQuery, Cloud Storage, and other GCP services.
- Collaborate with data scientists, machine learning engineers, data engineers, and other stakeholders to integrate and deploy machine learning models into production environments.
- Develop and maintain custom deployment solutions for machine learning models using tools such as Kubeflow, AI Platform, and Docker.
- Write clean, efficient, and maintainable code in Python and PySpark for data processing and transformation tasks.
- Ensure data quality, integrity, and consistency through data validation and monitoring processes. Demonstrate a deep understanding of the Medallion architecture.
- Develop metadata-driven pipelines and ensure optimal processing of data.
- Use Terraform to manage and provision cloud infrastructure resources on GCP.
- Troubleshoot and resolve production issues related to data pipelines and machine learning models.
- Stay up to date with the latest industry trends and best practices in data engineering, machine learning, and cloud technologies. Understand data lifecycle management, data pruning, model drift, and model optimisations.
Qualifications :
Must-have Skills : Machine Learning General Experience, Visualization, Google Cloud Platform, PySpark
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Proven experience as a Data Engineer with a focus on GCP.
- Strong proficiency in Python and PySpark for data processing and transformation.
- Hands-on experience with machine learning model deployment and integration on GCP.
- Familiarity with GCP services such as BigQuery, Cloud Storage, Dataflow, and AI Platform.
- Experience with Terraform for infrastructure as code.
- Experience with containerization and orchestration tools like Docker and Kubernetes.
- Strong problem-solving skills and the ability to troubleshoot complex issues.
- Excellent communication and collaboration skills.
Additional Information :
**Preferred Qualifications:**
- Experience with custom deployment solutions and MLOps.
- Knowledge of other cloud platforms (AWS, Azure) is a plus.
- Familiarity with CI/CD pipelines and tools like Jenkins or GitLab CI.
- Visualisation experience is nice to have but not mandatory.
- Certification in GCP Data Engineering or related fields.
Remote Work :
No
Employment Type :
Full-time