Pyramid Consulting, Inc

Senior Site Reliability Engineer / 100% Remote in Mexico

Job Location

México, Mexico

Job Description

Job Profile: Senior Site Reliability Engineer
Job Type: Long-term contract opportunity
Location: 100% Remote in Mexico

About the Role

We’re seeking a Senior Site Reliability Engineer (SRE) to join our Innovation Team, which is developing the next wave of AI-powered SaaS solutions. In this role, you’ll support a multidisciplinary team of Angular developers, Node.js engineers, and data scientists building on OpenAI and agentic AI architectures, providing them with scalable, observable, and resilient infrastructure. You'll drive automation, release stability, and operational excellence in a fast-moving production environment.

Key Responsibilities

- Architect and operate cloud-native infrastructure in Microsoft Azure, supporting scalable microservices and AI/ML workloads.
- Design and maintain CI/CD pipelines using Azure DevOps, GitHub Actions, and Terraform Cloud for full-stack services (Angular, Node.js), data workflows, and ML deployments.
- Build and manage containerized environments using Docker, Helm, Flux, and AKS, applying infrastructure automation and GitOps practices.
- Support and scale MongoDB clusters and cloud-native data pipelines.
- Collaborate with data scientists developing solutions using OpenAI, LLMs, and agentic AI architectures, ensuring that compute, orchestration, and observability are tightly integrated.
- Provide infrastructure support for Databricks-based ML development and experimentation environments.
- Implement observability and traceability using Dynatrace, with actionable monitoring, logging, and alerting strategies.
- Automate operational workflows using Python; contribute to DevOps tooling in Node.js.
- Own deployment workflows, production release coordination, and rollback readiness.
- Participate in an on-call rotation and provide after-hours/weekend support for production releases and urgent incidents.
- Collaborate with other SREs and architects to champion SRE best practices, including SLOs, SLIs, incident postmortems, and continuous reliability improvements.

Required Qualifications

- 5 years of experience in SRE, DevOps, or Cloud Infrastructure roles with production support.
- Advanced experience with Microsoft Azure, including compute, networking, identity, and security.
- Expertise in CI/CD automation using Azure DevOps and GitHub Actions, and in infrastructure-as-code tools such as Terraform.
- Strong coding/scripting skills in Python; familiarity with Node.js.
- Deep knowledge of Docker, Helm, Flux, AKS, and containerized architecture patterns.
- Operational experience with MongoDB, including performance tuning and high-availability (HA) strategies.
- Familiarity with Databricks and production ML pipelines.
- Proficiency in Dynatrace for monitoring, observability, and traceability.
- Experience supporting AI/LLM-based production workloads (e.g., OpenAI APIs, agentic AI systems).
- Availability for after-hours and weekend support during production incidents and releases.

Preferred Qualifications

- Experience with MLOps, scalable model deployment, and model testing strategies.
- Familiarity with Angular deployment workflows and frontend observability.
- Exposure to event-driven architectures, serverless compute, or stream processing.
- Strong understanding of cloud security, compliance, and secrets management.
- Certifications in Azure or Kubernetes are a strong plus.

Location: México, MX

Posted Date: 5/16/2025

Contact Information

Contact: Human Resources, Pyramid Consulting, Inc