Lead Data Engineer

The job posting is outdated and the position may be filled.

Job Location

Austin, MN - USA

Monthly Salary

Not Disclosed

Job Description

Who we are

Artmac Soft is a technology consulting and service-oriented IT company dedicated to providing innovative technology solutions and services to customers.

Job Description:

Job Title: Lead Data Engineer

Job Type: W2

Experience: 5-15 Years

Location: Austin, Texas

We are looking for a Lead Data Engineer who will be responsible for implementing and scaling data collection, storage, processing, and filtering for fine-tuning large language models (LLMs) within Conversational Engineering. They will be critical for enabling cutting-edge research, safety systems, and product development.

Responsibilities:

  • Experience as a data engineer with a strong background in designing and building large-scale data pipelines.
  • Must have knowledge of Python, SQL, Big Data, PySpark, and data engineering.
  • Have extensive experience with cloud platforms like AWS, Google Cloud, or Azure for data storage, processing, and management.
  • Have hands-on experience with ETL orchestration tools such as Apache Airflow, Dagster, or Prefect for managing complex data workflows.
  • Possess deep expertise in distributed computing frameworks such as Apache Spark, Hadoop, or Flink and have hands-on experience optimizing data processing at scale.
  • Are proficient in programming languages commonly used in data engineering, such as Python, and have a solid understanding of data structures and algorithms.
  • Are well-versed in various data storage technologies, including distributed file systems (e.g. HDFS, S3), databases (e.g. Cassandra, HBase), and data warehouses (e.g. Redshift, BigQuery).
  • Possess knowledge of natural language processing (NLP) techniques and have worked with text data preprocessing, normalization, and feature extraction.
  • Are passionate about staying up-to-date with the latest advancements in data engineering and NLP and are eager to apply innovative techniques to solve challenging problems.
  • Design, build, and manage scalable data pipelines for collecting, storing, processing, and filtering large volumes of text data for fine-tuning LLMs.
  • Develop and optimize data storage architectures to handle the massive scale of data required for training state-of-the-art language models.
  • Implement efficient data preprocessing, cleaning, and feature extraction techniques to ensure high-quality data for model training.
  • Collaborate with machine learning engineers and researchers to understand their data requirements and provide tailored solutions for LLM fine-tuning.
  • Design and implement robust and fault-tolerant systems for data ingestion, processing, and delivery.
  • Optimize data pipelines for performance, scalability, and cost-efficiency, leveraging distributed computing frameworks and cloud platforms.
  • Ensure the security, privacy, and compliance of data according to industry best practices and regulatory requirements.

Qualification:

  • Bachelor's degree or equivalent combination of education and experience.

Employment Type

Full Time
