Communicate the data needs of data scientists and other teams, and design efficient, GDPR-compliant ETL processes.
Support other teams by providing guidance on data usage and processing, and on how they can best leverage the platform.
Build scalable data pipelines to ingest data from a variety of data sources (Relational DBs and Data Lakes), identify critical data elements, and define data quality rules.
Leverage Spark and Kafka expertise to design and develop capabilities that improve the pipeline's change data capture and real-time processing.
Provide insights on areas of improvement, including data governance, best practices, and large-scale processing.
Support bug fixing, performance analysis, and data validation and quality assurance along the data pipeline.
Job Requirements
Who You Are
2+ years of experience as a Data Engineer; strong skills in at least one programming language (preferably Scala, Java, or Python) are mandatory
1+ years of experience with Spark on Hadoop, EMR, etc.
Experience with real-time data processing using Kafka, Spark Streaming, or similar technologies
Experience designing and implementing distributed systems for reliability, availability, scalability, and performance
Proven experience with AWS technologies such as S3, EMR, and CloudFormation
A creative and innovative approach to problem-solving
Disclaimer: Dr. Job (د.جوب) is merely a platform that connects job seekers with employers. We advise applicants to conduct their own independent research into the credentials of the prospective employer.
We take care to ensure that our clients do not request any financial payments; accordingly, we advise against sharing any personal or bank-account information with any third party. If you suspect any fraud or misconduct, please contact us by filling out the form on the Contact Us page.