Role Description [Experience: 5+ years]
This role is for a DevOps and site reliability engineer who will own the cloud infrastructure and the build and release process, including monitoring.
Responsibilities:
- Own and maintain cloud infrastructure
- Design, implement, and maintain CI/CD pipelines for backend, frontend, and AI/ML components
- Manage containerized workloads (Docker, Kubernetes, or similar)
- Monitor and improve system performance, availability, and security
- Work closely with engineers and PMs to enable fast, safe feature delivery
- Automate repetitive tasks and reduce operational overhead
- Participate in incident response, root cause analysis, and reliability improvements
Required Skills / Qualifications:
- 5+ years of experience in DevOps & Site Reliability Engineering
- Hands-on experience with cloud infrastructure, CI/CD pipelines and automation, containerization (Docker), and orchestration (Kubernetes)
- Experience with monitoring and alerting tools (Prometheus, Grafana, ELK, etc.)
- Strong scripting skills (Python, Bash, etc.)
- Understanding of infrastructure as code (Terraform, CloudFormation, etc.)
- Ability to balance speed, reliability, and cost
- Comfortable in a fast-paced, evolving environment