NICE Interactive Solutions India Pvt Ltd
NICE - Site Reliability Engineer - DevOps
Job Location
Pune, India
Job Description
Objectives of this Role:
- Run the production environment by monitoring availability and taking a holistic view of system health
- Build software and systems to manage platform infrastructure and applications
- Improve reliability, quality, and time-to-market of our suite of software solutions
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
- Provide primary operational support and engineering for multiple large, distributed software applications

Daily and Monthly Responsibilities:
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplifts
- Balance feature development speed and reliability with well-defined service level objectives

Required Skills and Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- 8-10 years of working experience in a similar role, with a focus on systems engineering, automation, and reliability.
- Proficiency in at least one programming language (e.g., Python, Go, Java, C#) and experience with scripting languages (e.g., Bash, PowerShell).
- Deep understanding of cloud computing platforms (e.g., AWS), including the workings and reliability constraints of prominent services (e.g., EC2, ECS, Lambda, DynamoDB).
- Experience with infrastructure-as-code tools such as CloudFormation and Terraform.
- Deep understanding of CI/CD concepts and experience with CI/CD tools such as Jenkins, GitLab CI/CD, or CircleCI.
- Strong knowledge of containerization technologies (e.g., Docker, Kubernetes) and microservices architecture.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch).
- Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.
- Experience with incident management and blameless postmortems, including driving incident response during outages and other critical incidents, resolution, and communication in a cross-functional team setup.

Good to Have Skills:
- Hands-on experience working with large Kubernetes clusters. Certification is an added plus.
- Working experience with the Grafana observability suite (Loki, Mimir, Tempo).
- Administration and/or development experience with standard monitoring and automation tools such as Splunk, Datadog, PagerDuty, and Rundeck.
- Familiarity with configuration management tools like Ansible, Puppet, or Chef.
- Certifications such as AWS Certified DevOps Engineer, Google Cloud Professional DevOps Engineer, or equivalent.

(ref:hirist.tech)
Location: Pune, IN
Posted Date: 5/1/2025
Contact Information
Contact: Human Resources, NICE Interactive Solutions India Pvt Ltd