Smartwork IT Services

Observability Engineer - Monitoring/Log Management Tools

Job Location

Hyderabad, India

Job Description

Job Title: Observability Engineer
Location: Hyderabad
Experience: 5-10 Years

We are looking for a highly skilled Observability Engineer to design, develop, and maintain observability solutions that provide deep visibility into our infrastructure, applications, and services. You will be responsible for implementing monitoring, logging, and tracing solutions to ensure the reliability, performance, and availability of our systems. Working closely with development, infrastructure, DevOps, and SRE teams, you will play a critical role in optimizing system observability and improving incident response.

Key Responsibilities:

- Design and implement observability solutions for monitoring, logging, and tracing across cloud and on-premises environments.
- Develop and maintain monitoring tools such as Prometheus, Grafana, Datadog, New Relic, and AppDynamics.
- Implement distributed tracing using OpenTelemetry, Jaeger, Zipkin, or similar tools to improve application performance and troubleshooting.
- Optimize log management and analysis with tools like Elasticsearch, Splunk, Loki, or Fluentd.
- Create alerting and anomaly detection strategies to proactively identify system issues and reduce mean time to resolution (MTTR).
- Collaborate with development and SRE teams to enhance observability in CI/CD pipelines and microservices architectures.
- Automate observability processes using scripting languages like Python, Bash, or Golang.
- Ensure scalability and efficiency of monitoring solutions to handle large-scale distributed systems.
- Support incident response and root cause analysis by providing actionable insights through observability data.
- Stay up to date with industry trends in observability and site reliability engineering (SRE).

Required Qualifications:

- 5 years of experience in observability, SRE, DevOps, or a related field.
- Proficiency in observability tools such as Prometheus, Grafana, Datadog, New Relic, or AppDynamics.
- Experience with logging platforms like Elasticsearch, Splunk, Loki, or Fluentd.
- Strong knowledge of distributed tracing (OpenTelemetry, Jaeger, Zipkin).
- Hands-on experience with the Azure cloud platform and Kubernetes.
- Proficiency in scripting languages (Python, Bash, PowerShell) and infrastructure as code (Terraform, Ansible).
- Solid understanding of system performance, networking, and troubleshooting.
- Strong problem-solving and analytical skills.
- Excellent communication and collaboration abilities.

Preferred Qualifications:

- Experience with AI-driven observability and anomaly detection.
- Familiarity with microservices, serverless architectures, and event-driven systems.
- Experience working with on-call rotations and incident management workflows.
- Relevant certifications in observability tools, cloud platforms, or SRE practices.

(ref:hirist.tech)

Location: Hyderabad, IN

Posted Date: 5/9/2025

Contact Information

Contact Human Resources
Smartwork IT Services

UID: 5182255542
