Site Reliability Engineer
Posted on August 4, 2025
Job Description
- Site Reliability Engineer
- Experience: 7+ years
- Budget: 1.1L
- Remote
- Technical Skills:
  - Programming: Proficiency in languages such as Python, Bash, or Java is essential.
  - Operating Systems: Deep understanding of Linux/Windows operating systems and networking concepts.
  - Cloud Technologies: Experience with AWS and Azure, including services, architecture, and best practices.
  - Containerization and Orchestration: Hands-on experience with Docker, Kubernetes, and related tools.
  - Infrastructure as Code (IaC): Familiarity with tools such as Terraform, CloudFormation, or Azure CLI.
  - Monitoring and Observability: Experience with tools such as Splunk, New Relic, or Azure Monitor.
  - CI/CD: Experience with continuous integration and continuous delivery pipelines, GitHub, and GitHub Actions.
  - Experience supporting Azure ML, Databricks, and other related SaaS tools.
- Soft Skills:
  - Problem-Solving: Ability to troubleshoot and debug complex distributed systems independently.
  - Communication: Strong written and verbal communication skills for collaborating with development and operations teams, and the ability to write documentation such as runbooks.
- Specific Experience:
  - Incident Management: Experience with incident response, root cause analysis, and post-incident reviews.
  - Scalability and Performance: Understanding of scalability, availability, and performance monitoring for large-scale systems.
  - Automation: Experience in automating repetitive tasks and workflows.
- Preferred Qualifications:
  - Experience with specific cloud platforms (AWS, Azure).
  - Certifications related to cloud engineering or DevOps.
  - Experience with microservices architecture, including supporting AI/ML solutions.
  - Experience with large-scale system management and configuration.