Salary Not Disclosed
1 Vacancy
Role: Java Developer with Web Crawler Experience
Location: Austin, TX (Hybrid)
Responsibilities:
1. Web Crawler Development: Design and implement efficient and scalable web crawlers in Java to collect data from various online sources.
2. Data Extraction: Develop and maintain systems for structured data extraction, handling various data formats (HTML, JSON, XML, etc.).
3. Data Storage and Processing: Design data storage and processing pipelines, ensuring extracted data is clean, structured, and easily accessible.
4. Performance Optimization: Optimize web crawling processes for speed, efficiency, and accuracy while ensuring minimal impact on source websites.
5. Error Handling and Logging: Implement error-handling mechanisms and logging systems to detect and resolve issues during crawling operations.
6. Data Integrity and Compliance: Ensure data collection practices are ethical, legal, and compliant with relevant regulations (e.g., robots.txt, copyright laws).
Requirements:
Proficiency in Java and experience with Java-based web scraping libraries (e.g., Jsoup, Apache HttpClient).
Knowledge of web crawling frameworks and tools such as Scrapy, Selenium, or Puppeteer.
Strong understanding of HTML, CSS, JavaScript, and web data structures.
Familiarity with data parsing and handling techniques for JSON, XML, and other common formats.
Experience with database technologies (SQL, NoSQL) to store and manage scraped data.
Knowledge of HTTP protocols, headers, proxies, and load handling.
Full Time