Site Reliability Engineer (SRE) Pune, India

Site Reliability Engineer (SRE) Description

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are seeking a talented, motivated, and experienced Site Reliability Engineer (SRE) to join our organization.

The SRE will play a crucial role in capacity planning and in ensuring the reliability, scalability, and performance of our infrastructure and applications. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud technologies.


Responsibilities

  • Monitor system performance and proactively troubleshoot issues to ensure high availability
  • Implement and manage continuous integration and deployment pipelines
  • Design, develop, and maintain scalable, automated, and resilient infrastructure solutions
  • Participate in incident management, root cause analysis, and implementation of remediation plans
  • Collaborate with development teams to enhance the operability of systems
  • Define and track key metrics and Service Level Objectives (SLOs) to improve system stability and performance

Requirements

  • 3 to 5 years of experience in a site reliability engineering role
  • Proficiency in scripting and programming languages such as Python, Bash, or PowerShell
  • Expertise in automation tools such as Jenkins and GitLab, and in configuration management with Ansible or Chef
  • Familiarity with observability tools such as Grafana, Splunk, and Dynatrace
  • Background in containerization and orchestration technologies like Docker and Kubernetes
  • Understanding of SLI, SLO, SLA, and Error Budget concepts
  • Capability to provide on-call support and participate in incident management and response activities

We offer

  • Opportunity to work on technical challenges that may have an impact across geographies
  • Vast opportunities for self-development: online university, global knowledge sharing, and learning through external certifications
  • Opportunity to share your ideas on international platforms
  • Sponsored Tech Talks & Hackathons
  • Unlimited access to LinkedIn Learning solutions
  • Possibility to relocate to any EPAM office for short and long-term projects
  • Focused individual development
  • Benefit package:
    • Health benefits
    • Retirement benefits
    • Paid time off
    • Flexible benefits
  • Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)
