Role 2: Data Scientist with MLOps
Location: Remote
Key Responsibilities:
1. Model Development & Validation:
- Design, build, and validate machine learning models tailored to business needs.
- Experiment with various algorithms and techniques to improve model accuracy and performance.
- Collaborate with data analysts and stakeholders to understand data requirements and objectives.
2. MLOps Implementation:
- Develop and implement MLOps strategies to manage the full lifecycle of machine learning models.
- Automate the deployment, monitoring, and scaling of ML models using MLOps tools and practices.
- Ensure models are deployed in real-time production environments, maintaining high availability and performance.
3. Data Pipeline Development:
- Build and manage robust data pipelines to support model training, testing, and deployment.
- Design workflows to handle data ingestion, preprocessing, and transformation.
- Implement data quality and validation checks to ensure the accuracy and consistency of data used for modeling.
4. Performance Monitoring & Optimization:
- Monitor the performance of deployed models in real time and address any issues related to model drift, degradation, or failures.
- Continuously evaluate and optimize model performance through tuning and retraining as needed.
- Develop and maintain performance metrics and dashboards to track model effectiveness.
5. Collaboration & Communication:
- Work closely with cross-functional teams, including data engineers, software developers, and business stakeholders, to deliver data-driven solutions.
- Translate complex technical concepts into actionable insights for non-technical stakeholders.
- Provide technical guidance and support to team members as required.
6. Documentation & Knowledge Sharing:
- Create and maintain comprehensive documentation for models, pipelines, and MLOps processes.
- Share knowledge and best practices with team members to foster a culture of continuous learning and improvement.
- Stay updated on industry trends, emerging technologies, and best practices in data science and MLOps.
7. Troubleshooting & Support:
- Diagnose and resolve issues related to model performance, deployment, and integration.
- Provide ongoing support and maintenance for deployed models and data pipelines.
- Conduct root cause analysis and implement corrective actions to address issues.
Must-Have Qualifications:
- Educational Background: Bachelor's or master's degree in Computer Science, Data Science, Engineering, Mathematics, or a related field.
- Experience: 6-9 years of experience in data science, with a strong focus on MLOps and productionizing machine learning models.
- Programming Skills: Proficiency in Python for data analysis and machine learning.
- Machine Learning Expertise: Deep understanding of machine learning algorithms, statistical modeling, and model evaluation techniques.
- MLOps Knowledge: Strong knowledge of MLOps principles, tools, and practices, including real-time usage and deployment strategies. Hands-on experience with MLOps platforms such as MLflow, Kubeflow, TensorFlow Serving, or similar.
- Cloud Platforms: Experience with major cloud providers (AWS, Azure, Google Cloud) for deploying and managing machine learning models.
- Data Engineering Skills: Solid understanding of data engineering principles, including ETL processes, data warehousing, and SQL.
- Version Control: Proficiency in using version control systems such as Git for code management.
- Communication Skills: Strong verbal and written communication skills, with the ability to present technical information to diverse audiences.
Good-to-Have Qualifications:
- Big Data Technologies: Experience with big data tools and technologies like Hadoop, Spark, or Kafka.
- Containerization & Orchestration: Familiarity with Docker, Kubernetes, or other containerization and orchestration technologies.
- DevOps Practices: Knowledge of DevOps methodologies and tools such as Jenkins, Terraform, or CI/CD pipelines.
- Business Acumen: Ability to understand and translate business requirements into technical solutions and model designs.