Site Reliability Engineer

Posted on August 4, 2025

Job Description

  • Site Reliability Engineer
  • Experience: 7+ years
  • Budget: 1.1L
  • Remote
  • Technical Skills:
      • Programming: Proficiency in languages such as Python, Bash, or Java is essential.
      • Operating Systems: Deep understanding of Linux/Windows operating systems and networking concepts.
      • Cloud Technologies: Experience with AWS and Azure, including services, architecture, and best practices.
      • Containerization and Orchestration: Hands-on experience with Docker, Kubernetes, and related tools.
      • Infrastructure as Code (IaC): Familiarity with tools such as Terraform, CloudFormation, or Azure CLI.
      • Monitoring and Observability: Experience with tools such as Splunk, New Relic, or Azure Monitor.
      • CI/CD: Experience with continuous integration and continuous delivery pipelines, GitHub, and GitHub Actions.
      • Knowledge of supporting Azure ML, Databricks, and other related SaaS tools.
  • Soft Skills:
      • Problem-Solving: Ability to troubleshoot and debug complex distributed systems independently.
      • Communication: Strong written and verbal communication skills for collaborating with development and operations teams, and the ability to write documentation such as runbooks.
  • Specific Experience:
      • Incident Management: Experience with incident response, root cause analysis, and post-incident reviews.
      • Scalability and Performance: Understanding of scalability, availability, and performance monitoring for large-scale systems.
      • Automation: Experience in automating repetitive tasks and workflows.
  • Preferred Qualifications:
      • Experience with specific cloud platforms (AWS, Azure).
      • Certifications related to cloud engineering or DevOps.
      • Experience with microservices architecture, including supporting AI/ML solutions.
      • Experience with large-scale system management and configuration.