Site Reliability Engineer ID53670

Department: Engineering
Specialization: DevOps
Experience: Middle
Technologies: AWS
Client: Idera
Technical flow: DevOps AWS
Engineering technical flow: DevOps AWS
Non-engineering technical flow: none

What you will do

  • Monitor and support production and staging environments to ensure availability, performance, and stability;
  • Respond to incidents, perform triage and root cause analysis, and contribute to remediation efforts;
  • Participate in on-call rotations with defined SLAs;
  • Handle operational requests from internal teams;
  • Maintain and improve monitoring, alerting, dashboards, logs, and metrics;
  • Support CI/CD pipelines, production releases, and GitOps workflows;
  • Contribute to automation initiatives to reduce operational overhead;
  • Maintain and improve Kubernetes-based infrastructure and containerized workloads;
  • Support Infrastructure as Code practices and environment improvements.

Must haves

  • 2+ years of experience in Site Reliability Engineering, DevOps, or Production Operations;
  • Experience with AWS supporting production environments;
  • Experience supporting production SaaS applications;
  • Strong understanding of CI/CD systems (GitHub Actions, Jenkins, CircleCI);
  • Experience with GitOps and Git fundamentals;
  • Experience using GitHub, Jira, and Confluence;
  • Experience with Kubernetes (EKS, kOps or similar);
  • Experience with Docker and containerization;
  • Experience with observability tools (Grafana, Prometheus, Loki, PagerDuty);
  • Proficiency in scripting (Bash, Python, or Go);
  • Experience with Infrastructure as Code (Terraform, Helm);
  • Ability to work within structured operational processes and SLAs;
  • Strong written and verbal English communication skills;
  • Self-driven with a growth mindset.

Nice to haves

  • AWS certifications such as Solutions Architect, DevOps Engineer, or SysOps Administrator;
  • Experience with multi-tenant SaaS environments;
  • Experience working in globally distributed teams;
  • Familiarity with ChatOps practices;
  • Experience improving monitoring quality and reducing alert fatigue.

We are looking for a Middle SRE Operations Engineer to maintain reliability across a cloud-based SaaS platform. You’ll handle live incidents, improve observability, and reduce toil through automation using Kubernetes, Terraform, Grafana, and AWS. The role is hands-on and execution-focused, with real ownership across CI/CD pipelines, GitOps workflows, and on-call rotations.

The benefits of joining us

Professional growth

Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.

Competitive compensation

We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.

A selection of exciting projects

Join projects that build modern solutions for top-tier clients, including Fortune 500 enterprises and leading product brands.

Flextime

Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you happiest and most productive.

Your AgileEngine journey starts here

1. Tell us about yourself (2 min)
2. Confirm requirements (2 sec)
3. Pass a short test (30-60 min)
4. Record a short video (5 min)
   → Introduce yourself on video instead of waiting for an interview
5. Live interview: ace the technical interview with our team
   → Schedule a call yourself right after your video is reviewed
6. Live interview: final interview with your team
   → Get to know the team you will be working with
7. Get an offer: as quickly as possible

Our geography

  • Washington DC, USA (UTC-5)
  • Miami, USA (UTC-5)
  • Mexico (UTC-6)
  • Colombia (UTC-5)
  • Brazil (UTC-3)
  • Argentina (UTC-3)
  • Ukraine, Europe (UTC+2)
  • Poland, Europe (UTC+1)
  • Portugal (UTC+0)
  • India (UTC+5:30)
