- Contribute to maintaining, updating, and expanding existing Core Data platform data pipelines
- Build and maintain APIs to expose data to downstream applications
- Develop real-time streaming data pipelines
- Tech stack includes Airflow, Spark, Databricks, Delta Lake, and Snowflake
- Collaborate with product managers, architects, and other engineers to drive the success of the Core Data platform
- Contribute to developing and documenting both internal and external standards and best practices for pipeline configurations, naming conventions, and more
- Ensure high operational efficiency and quality of Core Data platform datasets so that our solutions meet SLAs and deliver reliability and accuracy to all our stakeholders (Engineering, Data Science, Operations, and Analytics teams)
Requirements
- Data engineering experience developing large-scale data pipelines
- Proficiency in at least one major programming language (e.g., Python, Java, Scala)
- Hands-on production experience with distributed processing systems such as Spark
- Hands-on production experience with data pipeline orchestration systems such as Airflow
- Experience with at least one major Massively Parallel Processing (MPP) or cloud database technology (Snowflake, Databricks, BigQuery)
- Experience in developing APIs with GraphQL
- Advanced understanding of OLTP vs. OLAP environments
- Graph database experience is a plus
- Real-time event streaming experience is a plus
Benefits
- Employer provides access to:
- 3 levels of medical insurance for you and your family
- Dental insurance for you and your family
- 401(k)
- Overtime
- California sick leave policy: accrue 1 hour of sick leave for every 30 hours worked, up to 48 hours