The Senior Data Scientist will focus on search and be dedicated to creating next-generation AI and machine learning techniques and strategies for LexisNexis in its global expansion. This candidate will assist with deploying ethical, powerful generative AI solutions with a flexible multi-model approach that prioritizes using the best model for each individual legal use case. This approach includes working with large language models such as Anthropic's Claude 2, hosted on Amazon Bedrock from Amazon Web Services (AWS), and OpenAI's GPT-4 and ChatGPT, hosted on Microsoft Azure.
Core Technical Skills
Python Proficiency:
Expert-level Python, with experience writing efficient, clean, and modular code.
Ability to debug and test new code thoroughly.
RAG Systems:
Experience with and deep understanding of Retrieval-Augmented Generation (RAG), including concepts like embedding-based search, document retrieval, and combining retrieved information with LLMs.
Hands-on experience with advanced RAG platform development and maintenance.
Familiarity with knowledge base creation, indexing, and retrieval pipelines.
Knowledge of AI Architectures:
Understanding of the end-to-end architecture of generative AI systems, including preprocessing, retrieval, ranking, and postprocessing steps.
Prompt Engineering:
Expertise in crafting effective prompts for LLMs tailored to specific tasks.
Experience with techniques like zero-shot and few-shot prompting, prompt tuning, and chain of thought.
Content Generation:
Understanding of generative AI applications in content creation, including best practices for producing accurate, coherent, and domain-specific outputs.
Ability to fine-tune components for custom use cases.
Debugging and Performance Tuning:
Skills in profiling and optimizing LLM responses for latency and accuracy.
Experience diagnosing issues in complex multi-component systems.
Monorepo and Collaboration Skills
Working in Monorepo Environments:
Experience managing and contributing to large, centralized codebases (monorepos).
Understanding of version control workflows suited for monorepos (e.g., Git-based branching strategies).
Collaboration Tools and Practices:
Proficiency with CI/CD pipelines and tools like Jenkins, GitHub Actions, or GitLab CI.
Ability to work collaboratively with cross-functional teams in Agile settings.
Proficiency with code review practices and tools.
AI and NLP Knowledge
NLP Expertise:
Solid understanding of transformers, embeddings, and attention mechanisms.
Familiarity with techniques for handling domain-specific language models.
Complementary Skills
Top Skills:
Master's or PhD preferred!
5-10 years of experience in AI and machine learning model building, and strong coding skills in Python
2 years of working knowledge of applying recent LLMs, including ChatGPT, GPT-3.5, OPT, BLOOM, etc., utilizing RAG!
Experience working directly with large language models and Transformer-based architectures, including BERT, RoBERTa, T5, etc.
Experience with conversational search / semantic search, reinforcement learning, prompt engineering, and hallucination mitigation
DevOps, repos, debugging, building APIs, and managing the algorithm flow across multiple workstreams in one repo
Senior-level experience deploying models in the cloud (AWS primary, Azure secondary).
Nice to have: Candidate local to Raleigh would be a plus but not a requirement (hybrid schedule 2x per week)
FANG experience (Facebook, Amazon, Netflix, Google, or even Microsoft)
Secondary Skills (Nice to Haves)
Documentation and Communication:
Ability to write clear technical documentation for processes, workflows, and API usage.
Strong communication skills for conveying technical insights to stakeholders.
Preferred Experience
Previous experience working in legal tech or domain-specific generative AI use cases.
Hands-on experience with deploying AI models in production at scale.
Familiarity with multilingual generative AI and fine-tuning for specific languages like French.
Full Time