Role: Lead Data Engineer
Location: Hyderabad (Hybrid, 3 days)
Type: Full-time
Job Description
- Design and develop data pipelines using Python, Pandas, SQL, and NoSQL for Real-World Evidence (RWE) and Next-Generation Sequencing (NGS) data
- Perform data cleaning, preprocessing, and exploratory data analysis (EDA) on genomic and clinical datasets
- Manage and operate within AWS cloud services for pharmaceutical research and development
- Ensure high code quality through linting, documentation, and static code analysis tools
- Develop unit tests to achieve high code coverage of drug discovery and development pipelines
- Maintain source control using Git, including daily commits and pull request (PR) processes, for collaborative drug development
- Utilize Docker for containerization of drug formulation and delivery applications
- Implement CI/CD pipelines for automated deployment of pharmacogenomics and ADME analysis tools
- Work within Agile processes, particularly in Scrum teams, for efficient drug development lifecycle management
- Ensure compliance with GxP regulations (FDA, EMA, ISO, etc.) and data privacy laws such as HIPAA for clinical trials and post-market surveillance