Senior Site Reliability Engineer ID61984

Department: Engineering
Specialization: Developer
Experience: Senior
Locations: India
Client: Indeed
Technical flow: Java
Engineering technical flow: Java
Non-engineering technical flow: none
What you will do

  • Design, build, and maintain scalable backend and platform components;
  • Implement and manage observability solutions across distributed systems;
  • Configure dashboards, alerts, and APM for tracing, metrics, and logging;
  • Monitor and improve system reliability, scalability, and performance;
  • Deploy, operate, and maintain services in Kubernetes environments;
  • Integrate observability tools into CI/CD pipelines and cloud infrastructure;
  • Automate monitoring and operational workflows using scripting;
  • Provide operational and training support for observability platforms, especially Datadog;
  • Collaborate with engineering teams to improve system visibility and reliability practices.
Must haves

  • 4+ years of experience with Python, Node.js, or Java;
  • Hands-on experience with API integrations;
  • Strong experience in Kubernetes environments;
  • Experience with Datadog or similar tools such as Prometheus and Grafana;
  • Ability to configure dashboards, alerts, and APM;
  • Experience monitoring containerized and microservices architectures;
  • Hands-on experience with AWS;
  • Experience integrating observability tools into cloud environments;
  • Experience with CI/CD integrations for observability;
  • Ability to automate monitoring and operational tasks using scripting;
  • Upper-intermediate English level.
Nice to haves

  • Experience owning and operating an internal engineering platform, especially observability platforms;
  • Demonstrated ownership of reliability, scalability, and performance;
  • Ability to proactively lead maintenance and platform improvements;
  • Experience installing and configuring Datadog agents and integrations;
  • Experience managing API keys and secure configurations;
  • Experience managing user roles and access controls;
  • Familiarity with Go (Golang);
  • Experience with additional observability tools such as New Relic, Dynatrace, Elastic Stack, or Splunk.

We are looking for a Senior Site Reliability Engineer to strengthen our platform reliability and observability capabilities. You will own the design and operation of monitoring infrastructure, including Datadog APM, alerting, and distributed tracing, across Kubernetes-based microservices on AWS. The role splits roughly 65/35 between backend engineering and SRE practice, with direct involvement in CI/CD integration and observability automation. You will also support internal teams in adopting monitoring best practices as we modernize our R&D platform.

The benefits of joining us

Professional growth

Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.

Competitive compensation

We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.

A selection of exciting projects

Join projects that build modern solutions for top-tier clients, including Fortune 500 enterprises and leading product brands.

Flextime

Tailor your schedule for an optimal work-life balance: work from home or from the office, whichever makes you happiest and most productive.

Your AgileEngine journey starts here

1. Tell us about yourself (2 min)

2. Confirm requirements (2 sec)

3. Pass a short test (30-60 min)

4. Record a short video (5 min)
→ Introduce yourself on a video instead of waiting for an interview

5. Live interview: ace the technical interview with our team
→ Schedule a call yourself right after your video is reviewed

6. Live interview: final interview with your team
→ Get to know the team you will be working with

7. Get an offer, as quickly as possible

Our geography

Washington DC, USA (UTC-5)
Miami, USA (UTC-5)
Mexico (UTC-6)
Colombia (UTC-5)
Brazil (UTC-3)
Argentina (UTC-3)
Ukraine, Europe (UTC+2)
Poland, Europe (UTC+1)
Portugal (UTC+0)
India (UTC+5:30)
