Must have: Hadoop, Cloudera, Java, SQL, Scala or Spark, ETL, Data Quality, Hive
Skills required:
- Experience: 6-9 years
- Primary skills (must have): Cloudera (Hadoop), Spark with Scala or Spark with Java, and SQL.
- The resources should also have a good understanding of Hive and Aerospike.
- The resources should have strong analytical skills.
Scope of work:
- Persistent resources will take knowledge transfer (KT) from the current team members on the developed framework.
- The team will need to work on the following aspects:
- Documentation of lineage as per the existing template.
- Understanding the data quality (DQ) rules from the data science team.
- Onboarding of new incremental datasets, along with configuration of the DQ rules, etc.
- Perform data validation checks.
- Copying of production data into test environments (to be confirmed).
- Reporting on DQ issues.