Mandatory Skills: Big Data, GCP, Apache Spark, Apache Beam
Location: Alpharetta, GA (3 days a week)
Required: 10 years of experience
Job Title: Big Data Engineer

Job Description:
Client is seeking a Senior Big Data Engineer to become part of the team implementing and supporting edge analytical solutions on a Google Cloud ecosystem. You will find this a great place to work if you are passionate about designing data ingestion jobs, learning new technologies, and proposing and adopting them.

What you'll do
Extract, transform, and load data from multiple sources and in multiple formats using Big Data technologies.
Develop, enhance, and support data ingestion jobs from various source systems, following existing design patterns and using GCP services such as Apache Spark, Dataproc, Dataflow, BigQuery, Airflow, etc.
Work across teams and with senior engineers to make data more accessible to others within the organization.
Standardize data extraction pipelines so they are repeatable and reusable, with minimal supervision from senior engineers.
Automate manual processes, optimize data delivery, and redesign infrastructure for greater scalability.
Work closely with senior engineers to optimize query and data access techniques.
Apply modern software development practices (serverless computing, microservices architecture, CI/CD, infrastructure-as-code, etc.).
Participate in a tight-knit engineering team employing agile software development practices.

What experience you need
Bachelor's degree in Computer Science, Systems Engineering, or equivalent experience.
5 years of work experience as a Big Data Engineer.
3 years of experience using technologies such as Apache Spark, Hive, HDFS, and Beam (optional).
3 years of experience in SQL and Scala or Python.
2 years of experience with software build management tools like Maven or Gradle.
2 years of experience working with cloud technologies such as GCP, AWS, or Azure.

What could set you apart
Data engineering using GCP technologies (BigQuery, Dataproc, Dataflow, Composer, Datastream, etc.).
Experience writing data pipelines.
Self-starter who identifies and responds to priority shifts with minimal supervision.
Source code control management systems (e.g. SVN/Git, GitHub) and build tools like Maven and Gradle.
Agile environments (e.g. Scrum, XP).
Relational databases (e.g. SQL Server, Oracle, MySQL).
Atlassian tooling (e.g. JIRA, Confluence, and GitHub).