Job Title: Senior Data Engineer
Location: San Francisco (onsite)
Years of experience: 8
What You'll Do:
- Create and maintain optimal data pipeline architecture
- Build data pipelines that transform raw, unstructured data into formats that data analysts can use for analysis
- Assemble large, complex data sets that meet functional and nonfunctional business requirements
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and delivery of data from a wide variety of data sources using SQL and AWS Big Data technologies
- Work with stakeholders, including the Executive, Product, Engineering, and Program teams, to assist with data-related technical issues and support their data infrastructure needs
About you:
- 6+ years of experience and a bachelor's degree in Computer Science, Informatics, Information Systems, or a related field; or equivalent work experience
- In-depth working experience with distributed systems: Hadoop/MapReduce, Spark, Hive, Kafka, and Oozie/Airflow
- At least 5 years of solid, production-quality coding experience implementing data pipelines in Java, Scala, and Python
- Experience with AWS cloud services: EC2, EMR, RDS
- Experience with Git, JIRA, Jenkins, and shell scripting
- Familiarity with Agile methodology, test-driven development, source control management, and test automation
- Experience supporting and working with cross-functional teams in a dynamic environment
- You're passionate about data and building efficient data pipelines
- You have excellent listening skills and are empathetic to others
- You believe in simple, elegant solutions and give paramount importance to quality; you have a track record of building fast, reliable, high-quality data pipelines
Nice-to-Have Skills:
- Experience building marketing data pipelines, including Direct Mail, is a big plus
- Experience with Snowflake and Salesforce Marketing Cloud
- Working knowledge of open-source ML frameworks and the end-to-end model development life cycle
- Previous working experience running containers (Docker/LXC) in a production environment using a container orchestration service (Kubernetes, AWS ECS, AWS EKS)