LLM Data Engineer

Job Location

Alexander City - USA

Salary

Not Disclosed

Job Description

At Orange People, we are revolutionizing the way data and language interact through advanced AI technologies. We're seeking a skilled LLM Data Engineer to join our innovative team, where you will play a critical role in developing and optimizing large language models. In this position, you will work at the intersection of data engineering and machine learning, harnessing your expertise to enhance our AI capabilities and drive impactful solutions. If you are passionate about working with large datasets, optimizing algorithms, and contributing to the future of natural language processing, we invite you to be part of our journey at Orange People!

Responsibilities:

  • Design, implement, and maintain an end-to-end, multistage data pipeline for LLMs, including Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) data processes.
  • Identify, evaluate, and integrate diverse data sources and domains to support the Generative AI platform.
  • Develop and optimize data processing workflows for chunking, indexing, ingestion, and vectorization of both text and non-text data.
  • Benchmark and implement various vector stores, embedding techniques, and retrieval methods.
  • Create a flexible pipeline supporting multiple embedding algorithms, vector stores, and search types (e.g., vector search, hybrid search).
  • Implement and maintain auto-tagging systems and data preparation processes for LLMs.
  • Develop tools for text and image data crawling, cleaning, and refinement.
  • Collaborate with cross-functional teams to ensure data quality and relevance for AI/ML models.
  • Work with data lakehouse architectures to optimize data storage and processing.
  • Integrate and optimize workflows using Snowflake and various vector store technologies.

Qualifications:

  • Master's degree in Computer Science, Data Science, or a related field.
  • 3-5 years of work experience in data engineering, preferably in AI/ML contexts.
  • Proficiency in Python, JSON, HTTP, and related tools.
  • Strong understanding of LLM architectures, training processes, and data requirements.
  • Experience with RAG systems, knowledge base construction, and vector databases.
  • Familiarity with embedding techniques, similarity search algorithms, and information retrieval concepts.
  • Hands-on experience with data cleaning, tagging, and annotation processes (both manual and automated).
  • Knowledge of data crawling techniques and associated ethical considerations.
  • Strong problem-solving skills and ability to work in a fast-paced, innovative environment.
  • Familiarity with Snowflake and its integration in AI/ML pipelines.
  • Experience with various vector store technologies and their applications in AI.
  • Understanding of data lakehouse concepts and architectures.
  • Excellent communication, collaboration, and problem-solving skills.
  • Ability to translate business needs into technical solutions.
  • Passion for innovation and a commitment to ethical AI development.
  • Experience building LLM pipelines using frameworks like LangChain, LlamaIndex, Semantic Kernel, and OpenAI functions.
  • Familiarity with LLM parameters such as temperature, top-k, and repeat penalty, and with metrics and methodologies for evaluating LLM outputs.

Preferred Skills:

  • Experience with popular LLM/RAG frameworks.
  • Familiarity with distributed computing platforms (e.g., Apache Spark, Dask).
  • Knowledge of data versioning and experiment tracking tools.
  • Experience with cloud platforms (AWS, GCP, or Azure) for large-scale data processing.
  • Understanding of data privacy and security best practices.
  • Practical experience implementing data lakehouse solutions.
  • Proficiency in optimizing queries and data processes in Snowflake or Databricks.
  • Hands-on experience with different vector store technologies.

Benefits:

  • 401(k).
  • Dental insurance.
  • Health insurance.
  • Vision insurance.
  • We are an equal-opportunity employer and value diversity, equality, inclusion, and respect for people.
  • The salary will be determined based on several factors, including but not limited to location, relevant education, qualifications, experience, technical skills, and business needs.

Additional Responsibilities:

  • Participate in OrangePeople monthly team meetings and team-building efforts.
  • Contribute to OrangePeople technical discussions, peer reviews, etc.
  • Contribute content and collaborate via the OPWiki/Knowledge Base.
  • Provide status reports to OP Account Management as requested.

About us:
OrangePeople is an Enterprise Architecture and Project Management solutions company. Our most valuable asset is our people: dynamic, creative thinkers who are passionate about doing quality work. As a member of the OrangePeople team, you will have access to industry-leading consulting practices, strategies, and technologies, as well as innovative training and education. An ideal Orange Person is a technology leader with a proven track record of technical achievements and a strong process/methodology orientation.

Employment Type

Full Time

Disclaimer: Drjobpro.com is only a platform that connects job seekers and employers. Applicants are advised to conduct their own independent research into the credentials of the prospective employer. We always make certain that our clients do not endorse any request for money payments, so we advise against sharing any personal or bank-related information with any third party. If you suspect fraud or malpractice, please contact us via the Contact Us page.