Principal Data Engineer

Job Location

Alexander City - USA

Salary

Not Disclosed

Job Description

The Principal Data Engineer is responsible for driving the design, development, and implementation of Ambry's data infrastructure and solutions. This role will play a pivotal part in building and maintaining scalable, reliable, and efficient data pipelines, data warehouses, and data lakes. The Principal Data Engineer will collaborate closely with data architects, scientists, and analysts to ensure that data is accessible, secure, and aligned with business objectives. As a Principal Data Engineer at Ambry, you'll approach tasks with a customer-based, cloud-first mindset to support and enhance various data platform products, including Ambry's data lakes, streams, and warehouses. This role will be primarily responsible for building, monitoring, and operationalizing our data streams, which are hydrated via CDC (change data capture) from a suite of 20 on-prem and cloud databases.
Essential Functions
* Build Kafka connectors to sync updates from source data stores
* Build partitioned Kafka topics to sync updates to destination data marts
* Build multiplexed data analytics workloads using Apache Flink to monitor streaming metrics and perform real-time data transformations
* Build dashboards using Datadog and CloudWatch to ensure system health and user support
* Build opinionated but accommodating schema registries that ensure data governance
* Work closely with your West Coast-based scrum team to submit and review PRs daily, maintain documentation and backlogs, validate builds across multiple environments, and deploy at a 2-4 week sprint cadence
* Design reasonable database schemas with query access patterns as the forethought
* Build and maintain CI/CD pipelines using infrastructure-as-code
* Iteratively migrate on-prem ETL jobs written in PHP into AWS Flink and Glue processes
* Partner with QA Engineers in building automated test suites
* Partner with end users to resolve service disruptions and evangelize our data product offerings
* Vigilantly oversee data quality and alert upstream data producers of all disparities, latency, and defects
* Develop and maintain the overall data platform architecture, strategy, roadmap, and implementation plans to support the company's data-driven initiatives and business objectives
* Design and implement scalable, secure, and high-performance data architectures, including data warehouses, data lakes, and data pipelines, leveraging both on-premises and cloud technologies
* Establish data governance policies, standards, and best practices for data management, data quality, data security, and data privacy across the organization
* Lead the development and implementation of real-time data streaming solutions, including event-driven architectures, data ingestion, transformation, and consumption, using technologies like Apache Kafka, Apache Flink, and AWS Managed Streaming for Kafka (MSK)
* Oversee the creation and maintenance of Business Intelligence (BI) platforms, data visualization tools, and self-service analytics capabilities to enable data-driven decision-making across the organization
* Lead and manage a team of data engineers, database administrators, and data analysts, fostering their professional growth, promoting best practices, and ensuring adherence to organizational standards and processes
* Other duties as assigned
Qualifications
* Basic understanding of genomic concepts and terminology
* Experience with PyFlink
* Experience with AWS Kinesis
* Willing to work PST hours, between 8:00 AM to 5:00 PM or 9:00 AM to 6:00 PM
* Strong familiarity with any combination of our tech stacks, in order of importance: Apache Kafka (MSK flavor preferred), Debezium, Python, Apache Flink or PySpark Streaming, MySQL (RDS flavors preferred), Python CDK or Terraform, Athena, Glue, Lambda, AppFlow, HANA/4, PHP, Redis, Docker, JavaScript
* Experience building data APIs and offering Data as a Service
* Experience integrating with SaaS platforms such as SAP and Salesforce
* Experience with, or willingness to learn, PHP MVC frameworks such as Symfony
* Experience with Atlassian products, i.e. Jira, Confluence, Bamboo
* Experience with system diagramming tools such as Miro, LucidCharts, or Visio
* 6 years of experience working with professional scrum teams and/or equivalent schooling
* 4 years of experience using Git version control
* 3 years of experience designing and indexing relational databases
* 2 years of experience building and operationalizing real-time data
* Bachelor's or master's degree in computer, data, math, or life sciences, or equivalent work experience

Employment Type

Full Time
