Employment: Contract To Hire
Description: Data Science Associate
Remote
Skills:
2-4 years of overall experience
Using AI tools: large language models, extraction models, Azure AI Studio models
Experience using LLMs
Extraction models
Python
Moving into Databricks
Soft Skills: Strong communication skills, diversity of thought (experience outside of services, e.g. someone from manufacturing)
Day to Day (D2D):
Stand up meetings
Working through specific user stories, generally related to AI-type models
Meeting with internal customers to help understand the user experience/requirements
Working with those teams on feedback, issues, etc.
Writing code
Updating
Prompt engineering
Extraction models
Customer: audit team, tax team, general accounting functions, etc.
This role constructs complex solutions that integrate data wrangling, visualization, and advanced modeling techniques into a seamless workflow using software development best practices in R, Python, or other scripting languages. The successful candidate is comfortable working with APIs, web scraping, SQL/NoSQL databases, cloud-based data solutions, and AI/ML models, including Large Language Models (LLMs). The role includes working within modern data platforms such as Databricks to develop and optimize scalable data pipelines and machine learning models.
Essential Job Functions
Expand Knowledge/Exposure: Develop service-specific knowledge through greater exposure to peers, internal experts, and clients, regular self-study, and formal training opportunities. Approach all problems and projects with a high level of professionalism, objectivity, and an open mind to new ideas and solutions.
Data Analysis & AI Solutions: Collaborate with internal teams to collect, analyze, and automate data processing. Leverage AI models, including LLMs, to develop intelligent solutions that enhance data-driven decision-making processes for both internal projects and external clients.
Data Development: Working under the guidance of a variety of Data Science team members, gain exposure to developing custom data models and algorithms to apply to data sets. Gain experience with predictive and inferential analytics, machine learning, and artificial intelligence techniques. Use existing processes and tools to monitor and analyze solution performance and accuracy, and communicate findings to team members and end users.
AI-Driven Workflows: Contribute to automating business workflows by incorporating LLMs and other AI models to streamline processes and improve efficiency. Integrate AI-driven solutions within existing systems to provide advanced predictive capabilities and actionable insights.
Collaboration: With specific direction, learn to work individually as well as in collaboration with others. Interaction with leadership and colleagues will primarily be virtual.
Knowledge and expertise in data science and statistical computer languages (R, Python, SQL, etc.) to manipulate data and draw insights from large data sets. Experience working with and creating data architectures or schemas. Demonstrated knowledge of machine learning and AI models, including Large Language Models (LLMs) such as GPT, Llama, and Claude. Experience in fine-tuning, deploying, and maintaining these models in production environments, as well as with other statistical modeling techniques and their real-world advantages/drawbacks. Additionally, a successful candidate will have experience in a combination of the following areas:
Understanding the domain-specific nature of data being collected/analyzed and how data may be utilized to satisfy project objectives.
Ability to integrate vision models and extraction models into workflows to enhance data collection, processing, and insights, in addition to leveraging LLMs and other AI models for predictive analytics and automation.
Experience in developing and optimizing data pipelines for machine learning and AI model training and evaluation. Familiarity with cloud-based services for scalable AI/ML deployments.
Experience with vision models and data extraction algorithms to automate processes such as document parsing, object detection, and structured data extraction from images or unstructured data.
Experience in data manipulation using tools like R, Python, or SQL, with a focus on preparing data for machine learning applications. Ability to harmonize disparate data sources for use with AI/ML models.
Experience with a variety of machine learning models and dimension reduction techniques, including but not limited to: linear/logistic regression and other generalized linear models; tree-based methods such as CARTs, random forests, and boosting; SVMs; penalized methods such as ridge and LASSO (elastic nets); PCA; t-SNE; clustering methods; and other methods that can be applied to create predictive or inferential/descriptive models.
Ability to code in R, Python, SQL, and other languages; experience with Databricks is a plus.
Education (not required)
Bachelor's degree in a field of Statistics, Computer Science, Economics, Analytics, or Data Science (e.g. Informatics, Data Science, Health Data Science).