Site Reliability Engineering is at the heart of modern digital services, ensuring systems stay up and performant 24/7. With businesses increasingly moving to cloud‑native architectures, skilled SREs are in high demand. This six‑month onsite role in Phoenix is a chance to apply your production‑support expertise in a banking‑focused environment.

Job Summary

We are seeking a hands‑on Site Reliability Engineer to monitor system health, troubleshoot production incidents, and drive automation across our cloud‑based services. The role involves building alerts and dashboards, collaborating with development teams to improve reliability, and participating in on‑call rotations.

Top 3 Critical Skills

Skill	Why it's critical	Mastery Level
Core Java	Foundation for building and debugging services in Spring Boot	Senior
Monitoring (Splunk/Kibana/Grafana)	Provides real‑time visibility and rapid incident response	Senior
Cloud Platforms (AWS/Azure/GCP)	Enables scalability, reliability, and automation in production	Senior

Interview Preparation

How do you design an alerting strategy to minimize noise while ensuring critical incidents are caught?
What the interviewer is looking for: Understanding of threshold setting, severity levels, and use of tools like Splunk or Grafana.
Explain a time you performed a root‑cause analysis on a production outage. What steps did you take?
What the interviewer is looking for: Structured troubleshooting methodology, documentation, and collaboration with dev teams.
Describe how you would implement a CI/CD pipeline for a Spring Boot microservice.
What the interviewer is looking for: Familiarity with build tools, automated testing, and deployment orchestration.
What are the key differences between L1 and L2 support, and how do you transition an issue between them?
What the interviewer is looking for: Clear delineation of responsibilities, escalation procedures, and communication skills.
How would you automate the creation of monitoring dashboards for a new service in Grafana?
What the interviewer is looking for: Use of templating, API integration, and infrastructure‑as‑code concepts.
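For the Grafana question above, it can help to have a small example ready. The sketch below is one possible approach, not a definitive implementation: it builds a dashboard payload from a list of metric names so that every new service gets the same layout. The function name `build_dashboard`, the metric names, and the datasource UID are hypothetical, and the field names follow Grafana's dashboard JSON model as commonly documented, so check them against the Grafana version you run.

```python
import json


def build_dashboard(service: str, metrics: list[str], datasource_uid: str = "prometheus-main") -> dict:
    """Build a Grafana-style dashboard payload for one service (illustrative sketch)."""
    panels = []
    for index, metric in enumerate(metrics):
        panels.append({
            "id": index + 1,
            "type": "timeseries",
            "title": f"{service}: {metric}",
            # Assumed datasource reference; the UID is a placeholder.
            "datasource": {"type": "prometheus", "uid": datasource_uid},
            # Two panels per row on a 24-column grid, each 8 units tall.
            "gridPos": {"h": 8, "w": 12, "x": (index % 2) * 12, "y": (index // 2) * 8},
            # The query filters on the $service template variable defined below.
            "targets": [{"refId": "A", "expr": f'{metric}{{service="$service"}}'}],
        })

    dashboard = {
        "id": None,  # None (JSON null) asks Grafana to create a new dashboard
        "title": f"{service} overview",
        "tags": ["generated", service],
        "templating": {
            "list": [{"name": "service", "type": "constant", "query": service}]
        },
        "panels": panels,
    }
    # Wrapper shape expected when posting a dashboard to Grafana's HTTP API.
    return {"dashboard": dashboard, "overwrite": True}


if __name__ == "__main__":
    payload = build_dashboard("payments", ["http_requests_total", "jvm_memory_used_bytes"])
    print(json.dumps(payload, indent=2))
```

In an interview, you could explain that the resulting JSON would be sent to Grafana's dashboard API or committed to a repository and applied through an infrastructure‑as‑code tool, so dashboards are reviewed and versioned like any other change.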

Resume Optimization

Site Reliability Engineer
Production Support
Core Java
Splunk
Kibana
Grafana
PostgreSQL
MongoDB
ServiceNow
CI/CD

Application Strategy

When reaching out to the recruiter, send a concise email that opens with a friendly greeting, maps your experience directly to the role, and includes your updated resume as an attachment. Highlight your top skills, such as Core Java, monitoring with Splunk/Kibana/Grafana, and cloud automation, and reference relevant projects where you reduced downtime or automated incident response. Close by saying that you’re eager to discuss how your background aligns with the team’s reliability goals.

Career Roadmap

Current Role	Typical Experience	Core Focus	Next Position
Site Reliability Engineer	2‑4 years	Incident response, automation, monitoring	Senior Site Reliability Engineer
Senior Site Reliability Engineer	4‑7 years	Architecture, large‑scale reliability, mentorship	SRE Lead / Reliability Architect
SRE Lead / Reliability Architect	7+ years	Strategy, cross‑team leadership, budgeting	Director of Reliability Engineering

Site Reliability Engineer (SRE)

Job Description & Details