$2,247.00 Fixed
TechFlow Inc
Contract · Flexible hours
About the role
We are looking for a Senior Site Reliability Engineer to help stabilize and scale our core services. The project focuses on improving reliability, performance, and observability of a high‑traffic SaaS platform.
Key responsibilities
- Design and implement automated monitoring, alerting, and incident response workflows.
- Develop and maintain CI/CD pipelines for infrastructure changes.
- Optimize cloud resources on AWS for cost, performance, and resilience.
- Implement infrastructure as code (IaC) using Terraform and manage Kubernetes clusters.
- Troubleshoot production incidents and lead post‑mortem analyses.
- Collaborate with development teams to embed reliability best practices.
Must-have skills
- Deep experience with Linux system administration.
- Proficiency in Docker and Kubernetes orchestration.
- Strong knowledge of AWS services and networking.
- Expertise in Terraform or similar IaC tools.
- Solid background in monitoring, logging, and alerting solutions.
Nice to have
- Experience with service mesh technologies.
- Familiarity with chaos engineering practices.
Proposals: 0
Less than 3 months
Sam Vasquez, member since Oct 27, 2025