Junior / Medior / Senior Site Reliability Engineer

Job Location

Prague - Czech Republic

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description


About the company:

A reputable IT provider on the Czech and international markets, the company specializes in custom application development and consultancy services. Its work focuses on developing distribution portals in a cloud environment, including the implementation of modern technologies such as GraphQL API, creating specific software solutions such as simulators, and developing frontend platforms for various purposes. Known for its autonomy, precise work, and deep technological knowledge, the company provides expert consultancy services and organizes regular training for its employees and contractors. The team of experts has extensive experience in various industries, allowing it to effectively support and develop client IT systems.

About the project: The client is a privately owned company that specializes in AI-driven talent marketplace solutions tailored for internal career advancement. Its platform facilitates internal mobility by connecting employees with relevant projects, gigs, mentorships, and full-time roles aligned with their interests, career goals, skills, and experience. The company's technology is used by various enterprises to democratize career development, promote worker agility, and cultivate a workforce prepared for the future. It is a fast-growing, dynamic startup with 200 team members worldwide.

Contract: Full-time, hybrid (3 days onsite is a must), freelancer contract

Job Overview: The SRE will monitor, troubleshoot, and maintain our infrastructure with an emphasis on reliability, scalability, and automation. This role is ideal for candidates with foundational NOC experience who are interested in expanding their skills to include SRE practices and modern infrastructure management.

Responsibilities:

  1. Monitoring and Incident Response:

    • Continuously monitor application performance, system health, and network status.

    • Respond swiftly to incidents, performing root cause analysis and implementing resolution strategies.

    • Escalate issues when necessary and lead collaboration and communication to resolve incidents swiftly.

  2. Automated Monitoring and Alerting:

    • Use and configure monitoring tools (e.g. Prometheus, Grafana, Coralogix, Splunk) to improve visibility into system performance.

    • Develop and refine alerting rules to reduce noise and improve incident detection.

  3. Troubleshooting and System Maintenance:

    • Perform initial troubleshooting and diagnostics across application, infrastructure, and network layers.

    • Work with Developers and DevOps to implement fixes, validate configurations, and ensure systems are resilient to future incidents.

  4. Operational Automation:

    • Automate repetitive tasks such as alert handling, system checks, and routine maintenance.

    • Use scripting (e.g. Python, Bash) and Infrastructure-as-Code (IaC) tools (e.g. Terraform, Crossplane) to improve operational efficiency.

  5. Documentation and Knowledge Sharing:

    • Document processes, incidents, and troubleshooting steps, maintaining a knowledge base for common issues.

    • Contribute to runbooks for automated troubleshooting and escalate complex issues to SREs or other technical teams.

  6. Continuous Improvement:

    • Analyze incidents and recurring issues to identify areas for improvement in system reliability and automation.

    • Lead post-incident reviews and contribute insights for future preventive actions.



Requirements

Education: Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.

Experience:

  • 1+ years in a NOC, Technical Support, or Junior SRE role with a focus on monitoring and system health.
  • Familiarity with cloud platforms (AWS, Azure, GCP) and containerized environments (Docker, Kubernetes) is preferred.

Technical Skills:

  • Proficiency in system and application monitoring tools (e.g. Prometheus, Grafana, Coralogix).
  • Basic understanding of automation, Infrastructure-as-Code (IaC), and scripting languages (Python, Bash).
  • Foundational knowledge of networking, Kubernetes, and cloud services (AWS is an advantage).

Soft Skills:

  • Strong analytical skills with a proactive approach to problem-solving.
  • Excellent communication skills with the ability to work collaboratively across teams.
  • Eagerness to learn and adapt to new technologies, with a growth mindset.


Benefits


Interesting projects

Opportunities for further education

Flexibility in employment status: possibility to work as a permanent employee or as a contractor

Flexible working hours

Option for home office

Office located in the center of Prague with excellent public transport accessibility

Company benefits and events

Great team atmosphere



Employment Type

Full Time
