Hello
We are the Mavinsys Talent Acquisition team, based at One World Trade Centre, New York. We specialize in IT services and staffing, mainly lateral and contract hiring. Below is one of our requirements that we need to fill immediately. If you're interested, please share your candidature to
Job Title: Senior Data Engineer
Location: New York (Remote)
Duration: 12 months
Job Description:
- 5-9 years of relevant industry experience with a BS/Masters, or 2 years with a PhD
- Experience with distributed processing technologies and frameworks such as Hadoop, Spark, and Kafka, and distributed storage systems (e.g. HDFS, S3)
- Demonstrated ability to analyze large data sets to identify gaps and inconsistencies, provide data insights, and advance effective product solutions.
- Expertise with ETL schedulers such as Apache Airflow, Luigi, Oozie, AWS Glue, or similar frameworks
- Solid understanding of data warehousing concepts and hands-on experience with relational databases (e.g. PostgreSQL, MySQL) and columnar databases (e.g. Redshift, BigQuery, HBase, ClickHouse)
- Design, build, and maintain robust and efficient data pipelines that collect, process, and store data from various sources, including user interactions, financial details, and external data feeds.
- Develop data models that enable the efficient analysis and manipulation of data for merchandising optimization. Ensure data quality, consistency, and accuracy.
- Build scalable data pipelines (SparkSQL & Scala) leveraging the Airflow scheduler/executor framework
- Collaborate with cross-functional teams, including Data Scientists, Product Managers, and Software Engineers, to define data requirements and deliver data solutions that drive merchandising and sales improvements.
- Contribute to the broader Data Engineering community at Airbnb to influence tooling and standards to improve culture and productivity.
- Improve code and data quality by leveraging and contributing to internal tools to automatically detect and mitigate issues.
- Effective at building partnerships with business stakeholders, engineers, and product to understand use cases from intended data consumers.
- Able to create & maintain documentation to support users in understanding how to use tables/columns
- Experience creating and evolving dimensional data models & schema designs to structure data for business-relevant analytics.
- Strong experience using an ETL framework (e.g. Airflow) to build and deploy production-quality ETL pipelines.
- Experience ingesting and transforming structured and unstructured data from internal and third-party sources into dimensional models.
- Experience with dispersal of data to OLTP stores (e.g. MySQL, Cassandra, HBase) and fast analytics solutions.
Data Systems Design
- Strong understanding of distributed storage and compute (S3, Hive, Spark)
- Knowledge of distributed system design, such as how MapReduce and distributed data processing work at scale
- Basic understanding of OLTP systems like Cassandra, HBase, Mussel, Vitess, etc.
Coding
- Experience building batch data pipelines in Spark
- Expertise in SQL
- General Software Engineering (e.g. proficiency coding in Python, Java, Scala)
- Experience writing data quality unit and functional tests.
- Proficiency in Salesforce and understanding of its data structure. (Optional)
- Knowledge of Salesforce Bulk Operators. (Optional)
Hadoop, Spark, Airflow, Python, SQL