Title: Data Engineer (ideally an expert)
Location: Remote
Job Description
- Azure Data Factory (ADF): Used for the ingestion pipeline and for orchestrating the entire pipeline.
- Azure Databricks: Used for transformation and storage. The resource should have Unity Catalog (UC) knowledge, as OFT uses UC for governance and to tag and document data assets.
- Python: Used for writing notebooks in Databricks.
- PySpark: Used for interfacing with Spark from Python and for some data transformation.
- Data modeling: Expected for onsite resources; optional for offshore.
- Healthcare domain knowledge: Optional.
- CI/CD/DevOps: Optional.
- Testing using Pytest: Resources should expect to learn Pytest on the job and to automate some data validation procedures.