Job ID: PROCLT0015
Job Title: Data Engineer
No. of Positions: 3
Location: Basking Ridge, NJ
Client: VZ
Experience: 7 to 10 years
Working Model: Onsite Hybrid
Job Type: Contract Position
Job Overview
We at Procal are looking for a savvy Machine Learning & Data Engineer to join our team of analytics experts and help us extract value from our data. You will lead the entire process, from data collection, cleaning, and preprocessing to training models and deploying them to production. At a high level, we are looking for very hands-on engineers with solid experience in big data, data architecture, machine learning, and LLMs.
The ideal candidate will be passionate about artificial intelligence and stay up to date with the latest developments in the field.
This position combines typical Data Scientist math and analytical skills with research, advanced business communication, and presentation skills.
Key Responsibilities
Develop scalable big data solutions using Hadoop, Hive, Spark, MapReduce, Java, and Python.
Design schemas and data models for NoSQL databases and data warehouses.
Develop ETL data flows and cloud integrations to build reporting solutions.
Assemble large, complex data sets that meet functional and non-functional requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and Spark big data technologies.
Design, develop, code, and troubleshoot with consideration of upstream and downstream systems and technical implications.
Apply knowledge of tools within the Software Development Life Cycle toolchain to improve the value realized by automation.
Apply technical troubleshooting to break down solutions and solve technical problems of basic complexity.
Gather, analyze, and draw conclusions from large, diverse data sets to identify problems and contribute to decision-making in service of secure, stable application development.
Verify data quality and/or ensure it via data cleaning.
Explore and visualize data to understand it, then identify differences in data distribution that could affect model performance once deployed in the real world.
Understand business objectives and develop models that help achieve them, along with metrics to track their progress.
Manage available resources such as hardware, data, and personnel so that deadlines are met.
Design, develop, and research machine learning systems, models, and schemes.
Study, transform, and convert data science prototypes.
Perform statistical analysis and use the results to improve models.
Train and retrain ML systems and models as needed.
Analyze the use cases of ML algorithms and rank them by their probability of success.
Understand when findings can be applied to business decisions.
Enrich existing ML frameworks and libraries.
Build an efficient pipeline to host an LLM service on a local machine.
Develop a highly scalable RAG system combined with an LLM to serve daily analysis and troubleshooting.
Key Skill Sets
Good communication and presentation skills
Team player
Experience in R and/or Python required.
Proficiency with a deep learning framework such as TensorFlow or Keras.
Proficiency with Python and basic libraries for machine learning such as scikit-learn and pandas.
Expertise in visualizing and manipulating big datasets.
Good understanding of the AI/ML stack: GPUs, MLflow, and LLMs.
Hands-on practical experience in Java, Scala, and/or Python, covering system design, application development, testing, and operational stability.
Experience in developing, debugging, and maintaining code in a large corporate environment with one or more modern programming languages and database querying languages.
Experience across the whole Software Development Life Cycle
Exposure to agile methodologies and practices such as CI/CD, Application Resiliency, and Security.
Emerging knowledge of software applications and technical processes within a technical discipline (e.g. cloud, artificial intelligence, machine learning, mobile, etc.).
Knowledge of Unix shell and SQL as well as NoSQL DBs is required.
Experience with Linux, Spark, and Kafka.
Good understanding of Large Language Models from a systems engineering perspective.
Qualifications
MS or PhD in a relevant field (Computer Science, Engineering, Statistics, Physics, Applied Math)
5 years of experience using Python to analyze datasets and to train, evaluate, deploy, and optimize models.
3 years of experience with ML frameworks such as PyTorch, TensorFlow, or similar.
3 years of experience with machine learning/statistical modeling, data analysis tools and techniques, and the parameters that affect their performance.
1 year of experience working with technologies related to large language models, including LLM architectures, model evaluation, adapters, and model customization (including pretraining and fine-tuning techniques).
Proficient with the design, deployment, and evaluation of LLM-powered agents and tools, and with orchestration approaches.
Proficient with prompt engineering, embedding model fine-tuning, and retrieval method evaluation and optimization approaches.
Master's degree in a quantitative field such as statistics, mathematics, data science, business analytics, economics, finance, engineering, or computer science
How to Apply
Interested candidates are invited to submit their resume & cover letter to
Procal Technologies Inc is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.