Department: Engineering

Specialization: Developer

Experience: Senior

Technologies: AWS, Java, Kubernetes, Node.js, Python

Locations: India

Client: Indeed

Technical flow: JAVA

Engineering technical flow: JAVA

Non-engineering technical flow: none

What you will do
Design, build, and maintain scalable backend and platform components;
Implement and manage observability solutions across distributed systems;
Configure dashboards, alerts, and APM for tracing, metrics, and logging;
Monitor and improve system reliability, scalability, and performance;
Deploy, operate, and maintain services in Kubernetes environments;
Integrate observability tools into CI/CD pipelines and cloud infrastructure;
Automate monitoring and operational workflows using scripting;
Provide operational and training support for observability platforms, especially Datadog;
Collaborate with engineering teams to improve system visibility and reliability practices.

Must haves
4+ years of experience with Python, Node.js, or Java;
Hands-on experience with API integrations;
Strong experience in Kubernetes environments;
Experience with Datadog or similar tools such as Prometheus and Grafana;
Ability to configure dashboards, alerts, and APM;
Experience monitoring containerized and microservices architectures;
Hands-on experience with AWS;
Experience integrating observability tools into cloud environments;
Experience with CI/CD integrations for observability;
Ability to automate monitoring and operational tasks using scripting;
Upper-intermediate English level.

Nice to haves
Experience owning and operating an internal engineering platform, especially observability platforms;
Demonstrated ownership of reliability, scalability, and performance;
Ability to proactively lead maintenance and platform improvements;
Experience installing and configuring Datadog agents and integrations;
Experience managing API keys and secure configurations;
Experience managing user roles and access controls;
Familiarity with Go (Golang);
Experience with additional observability tools such as New Relic, Dynatrace, Elastic Stack, or Splunk.

About the role

We are looking for a Senior Site Reliability Engineer to strengthen our platform reliability and observability capabilities. You will own the design and operation of monitoring infrastructure, including Datadog APM, alerting, and distributed tracing, across Kubernetes-based microservices on AWS. The role spans backend engineering and SRE practice in roughly a 65/35 split, with direct involvement in CI/CD integration and observability automation. You will also support internal teams in adopting monitoring best practices as we modernize our R&D platform.

Job ID: 61984

The benefits of joining us

Professional growth

Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps

Competitive compensation

We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities

A selection of exciting projects

Join projects that build modern solutions for top-tier clients, including Fortune 500 enterprises and leading product brands

Flextime

Tailor your schedule for an optimal work-life balance: work from home or from the office, whichever makes you happiest and most productive.

Your AgileEngine journey starts here

Tell us about yourself (2 min);
Confirm requirements (2 sec);
Pass a short test (30-60 min);
Record a short video (5 min): introduce yourself on video instead of waiting for an interview;
Live interview: ace the technical interview with our team. You can schedule the call yourself as soon as your video is reviewed;
Live interview: final interview with your team. Get to know the people you will be working with;
Get an offer: as quickly as possible.

Our geography

Washington DC, USA (UTC-5);
Miami, USA (UTC-5);
Mexico (UTC-6);
Colombia (UTC-5);
Brazil (UTC-3);
Argentina (UTC-3);
Ukraine (UTC+2);
Poland (UTC+1);
Portugal (UTC+0);
India (UTC+5:30).
