Responsibilities
- Develop methods and processes for data quality assurance (QA) to ensure accuracy, completeness, and integrity.
- Define and implement data validation rules and automated data quality checks.
- Perform data profiling and analysis to identify anomalies, outliers, and inconsistencies.
- Assist in acquiring and integrating data from various sources, including web crawling and API integration.
- Develop and maintain Python scripts for data extraction, transformation, and loading (ETL) processes.
- Stay updated with emerging technologies and industry trends.
- Explore third-party technologies as alternatives to legacy approaches for efficient data pipelines.
- Work with cross-functional teams to understand data requirements.
- Assume accountability for achieving development milestones.
- Prioritize tasks to ensure timely delivery in a fast-paced environment with rapidly changing priorities.
- Collaborate with and assist fellow members of the Data Research Engineering Team as required.
- Make effective use of online resources such as Stack Overflow, ChatGPT, and Bard, while considering their capabilities and limitations.
Skills and Experience
- Bachelor's degree in Computer Science, Data Science, or a related field.
- Strong proficiency in Python programming for data extraction, transformation, and loading.
- Proficiency in SQL and data querying is a plus.
- Knowledge of Python modules such as Pandas, SQLAlchemy, gspread, PyDrive, BeautifulSoup, Selenium, sklearn, and Plotly.
- Knowledge of web crawling techniques and API integration.
- Knowledge of data quality assurance methodologies and techniques.
- Familiarity with machine learning concepts and techniques.
- Familiarity with HTML, CSS, and JavaScript.
- Familiarity with Agile development methodologies is a plus.
- Strong problem-solving and analytical skills with attention to detail.
- Creative and critical thinking.
- Ability to work collaboratively in a team environment.
- Effective communication skills.
- Experience with version control systems such as Git for collaborative development.
- Ability to thrive in a fastpaced environment with rapidly changing priorities.
- Comfortable with autonomy and able to work independently.
Perks:
Day off on the 3rd Friday of every month (one long weekend each month)
Monthly Wellness Reimbursement Program to promote health and well-being
Monthly Office Commutation Reimbursement Program
Paid paternity and maternity leave
Group Medical Insurance
Group Term Life Insurance (2.5x the CTC)
Group Personal Accident Insurance (3x the CTC)
Qualifications:
Bachelor's degree in Computer Science, Data Science, or a related field.
Remote Work:
Yes
Employment Type:
Full-time