Job Description: Senior Data Engineer (GCP Domain)
Position Overview:
We are seeking an accomplished Senior Data Engineer with over 12 years of specialized experience architecting and deploying data engineering solutions on the Google Cloud Platform (GCP). The ideal candidate will exemplify strong leadership, with a proven history of driving large-scale, high-performance data architectures to fruition, advanced expertise in Python programming, and extensive knowledge of SQL database technologies. This position requires a strategic leader who can manage multifaceted projects, align cross-functional teams toward common goals, and spearhead data initiatives pivotal to achieving business objectives.
Key Responsibilities:
- Architecture of Highly Available Distributed Systems:
- Design and construct fault-tolerant distributed systems capable of managing extensive data extraction, ingestion, and transformation workloads. Leverage technologies such as Apache Kafka for robust real-time data streaming, ensuring high availability and data integrity under varying loads, and use Google Cloud Pub/Sub as a reliable messaging framework that enhances system resilience.
- Facilitate architectural reviews and lead technical discussions to establish best practices in system design and implementation, ensuring a standardized approach to scalability and performance.
- Cloud Data Engineering Solutions Deployment:
- Lead the complete lifecycle of data engineering initiatives within GCP, including requirement analysis, solution architecture, implementation, and ongoing optimization. Use Google BigQuery for efficient analytics and data warehousing, applying best practices in data partitioning and clustering to optimize query performance and cost-effectiveness.
- Oversee project management activities, ensuring all critical timelines and deliverables are met while maintaining high-quality standards through regular stakeholder engagement and review processes.
- Modern Architectural Design Implementation:
- Develop and implement innovative data solutions employing microservices and serverless architectures that enhance agility and reduce deployment times. Engage with services such as Google Cloud Functions to eliminate traditional server management overhead while achieving rapid scalability.
- Conduct thorough evaluations of emerging technologies and frameworks, recommending adoption strategies that improve system modularity, reduce interdependencies, and streamline deployment processes.
- ETL and Data Orchestration Technologies:
- Exhibit expert-level proficiency with comprehensive ETL/data orchestration tools such as Azure Data Factory and open-source orchestration platforms like Apache Airflow. Develop and maintain robust ETL pipelines that automate data workflows, ensuring end-to-end data integrity and lineage via metadata tracking.
- Mentor team members on advanced ETL methodologies, incorporating best practices for testing, monitoring, and maintaining data pipelines to ensure optimal performance and reliability.
- Cloud Data Warehousing and Lake Solutions:
- Design and implement scalable cloud-based data warehousing solutions using leading technologies such as Snowflake and Amazon Redshift. Focus on query optimization techniques, including materialized views, efficient schema design, and workload management strategies, to enhance performance.
- Architect comprehensive data lake frameworks using services such as Google Cloud Storage and Databricks, enabling effective management of both structured and semi-structured data while ensuring efficient access and analytics capabilities.
- SQL and NoSQL Database Management Expertise:
- Oversee the administration and optimization of complex SQL and NoSQL databases, including PostgreSQL, MySQL, SQL Server, and Oracle. Implement advanced performance optimization techniques such as indexing strategies, query tuning, and partitioning schemes designed to ensure high availability and data reliability.
- Lead initiatives to consolidate and migrate legacy database systems to cloud-native architectures, assessing transactional workloads and developing strategic rollout plans that reduce operational costs while enhancing data accessibility.
- Advanced SQL Programming and Database Design:
- Develop complex SQL queries, stored procedures, and data transformation functions that serve analytical requirements. Demonstrate mastery in designing database schemas using star and snowflake patterns, complemented by denormalization methods, to enhance query response times for analytical workloads.
- Establish and evaluate data modeling tools and methodologies to ensure adherence to industry best practices in database development and design.
- Strong Python Engineering Competence:
- Use Python for scripting complex data transformations, executing data validation, and automating routine data workflows, employing libraries such as Pandas, Dask, and Apache Beam for optimized performance when processing large datasets.
- Integrate Python scripts seamlessly into CI/CD pipelines to automate deployments and monitor the health of data orchestrations.
- DevOps Practices and CI/CD Tooling:
- Champion DevOps practices within the data engineering team, employing Continuous Integration/Continuous Deployment (CI/CD) methodologies with tools such as Jenkins, GitLab CI, or CircleCI to streamline development and deployment processes across environments.
- Establish best practices for version control with Git, ensuring that team members are proficient in branching and merging strategies and in automated testing frameworks that maintain robust code quality.
- Big Data Technologies and Streaming Concepts:
- Demonstrate comprehensive familiarity with big data technologies, including the Hadoop ecosystem (HDFS, Hive, Pig) and real-time processing frameworks such as Apache Flink and Kafka Streams, enabling scalable data processing and analytics.
- Implement governance frameworks that ensure data quality and comply with regulatory standards, facilitating data audits and maintaining lineage tracking.
- Strategic Thought Leadership and Solution Alignment:
- Establish and align strategic objectives for data engineering initiatives with overarching business goals, using data-driven insights to inform decision-making and guide project priorities.
- Collaborate with cross-functional teams, product management, IT, and executive leadership to ensure that data solutions meet organizational needs while enhancing efficiency and creativity.
- Technical Mentorship and Leadership:
- Lead, mentor, and cultivate the growth of data engineering professionals within the team, fostering an environment that prioritizes innovation and accountability.
- Initiate and facilitate technical workshops and knowledge-sharing sessions, promoting adoption of cutting-edge technologies and best practices across the team.
- Cloud Certification:
- Relevant cloud certifications, such as Google Professional Data Engineer or AWS Certified Solutions Architect, are highly recommended as evidence of commitment and expertise in cloud data solutions.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Science, Information Systems, or a related field, establishing a solid foundation in both theoretical and applied data engineering principles.
- At least 12 years of substantial experience in data engineering, with a proven track record of timely delivery on high-impact projects in complex technical environments.
- Recognized leadership capabilities, demonstrating excellence in team management, conflict resolution, and steering collaborative efforts toward successful project outcomes.
- Exceptional analytical, communication, and interpersonal skills, with the aptitude to present intricate technical concepts to diverse audiences, including technical teams and executive stakeholders.
If you are an accomplished data engineering leader eager to drive innovative solutions and deliver impactful results that align with strategic business goals, we welcome your application for this pivotal role within our organization.
Candidates should be prepared to relocate to Portugal within 6 months; once relocated, they will be paid an EU salary.