This is a remote position.
We are seeking a Data Infrastructure Engineer to design, build, and optimize our data systems, supporting the needs of our scientists, developers, and data teams. In this role, you will take ownership of our data pipelines, drive automation, and enhance our infrastructure to enable scalable, secure, and efficient data access.
Role Explanation: (please watch before applying)
Key Responsibilities
- Collaborate Across Teams: Work closely with scientists and developers to design, build, and maintain scalable data stores (S3, EFS, Box) and relational databases.
- Set the Data Infrastructure Vision: Define and execute the strategy for data infrastructure engineering, leading both independent and team-driven projects.
- Optimize Data Pipelines: Develop and maintain efficient data pipelines to facilitate seamless access for machine learning and biostatistics teams.
- Infrastructure & Tooling: Build core infrastructure, tooling, and software development processes in Python to support data operations.
- Platform Engineering: Contribute to various platform engineering projects, from short-term solutions to long-term system development.
Required Experience & Skills
- Proven experience ensuring data governance and security best practices when handling sensitive and confidential data.
- Hands-on experience managing AWS relational databases such as PostgreSQL, MySQL, and similar systems.
- Expertise in managing S3 buckets, including lifecycle management, IAM policies, and permissions at scale.
- Experience designing and deploying Infrastructure-as-Code (IaC) solutions using Terraform.
- Some familiarity with CI/CD pipelines, containers, and Kubernetes (bonus points for AWS ECS experience).
- 1-3 years of experience in data infrastructure engineering or related fields.