HEWLETT PACKARD ENTERPRISE GLOBALSOFT PVT LTD
Hewlett Packard Enterprise - HPC Service Line Manager
Job Location
Bangalore, India
Job Description
HPC Service Line Manager (SLiM)
This role has been designed as 'Hybrid' with an expectation that you will work on average 2 days per week from an HPE office.
Who We Are:
Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today's complex world. Our culture thrives on finding new and better ways to accelerate what's next. We know diverse backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.
Job Description:
HPE Operations is our innovative IT services organization. It provides the expertise to advise, integrate, and accelerate our customers' outcomes from their digital transformation. Our teams collaborate to transform insight into innovation. In today's fast-paced, hybrid IT world, being at business speed means overcoming IT complexity to match the speed of actions to the speed of opportunities. Deploy the right technology to respond quickly to market possibilities. Join us and redefine what's next for you.
What you'll do:
We are seeking a Service Line Manager to lead and manage the High-Performance Computing (HPC) team. The ideal candidate will be responsible for overseeing the operations, maintenance, and enhancement of HPC infrastructure while ensuring high availability, performance, and scalability. This role involves team management, stakeholder coordination, incident resolution, and continuous improvement initiatives to optimize HPC environments.
Responsibilities:
- Review and validate HPC solutions and environments through POCs and benchmarking.
- Architect and design HPC solutions tailored to the customer's needs.
- Oversee solution implementation, integration, and testing.
- Diagnose and correct solution issues during implementation.
- Provide training, documentation, and ongoing support.
- Maintain life-cycle management of the HPC environment.
- Oversee team operations and deliverables.
- Lead the team with technical expertise and ensure regular technical sessions and case reviews.
- Demonstrate a high level of technical and communication skills in critical situations.
- Take responsibility for end-to-end problem ownership and resolution.
- Be a good team player.
Project Management & Process Improvements:
- Drive initiatives to enhance HPC infrastructure efficiency, scalability, and cost-effectiveness.
- Work with cross-functional teams to integrate emerging technologies into the HPC ecosystem.
- Define and track key performance indicators (KPIs) for HPC systems and team productivity.
- Identify opportunities for process automation and infrastructure optimization.
HPC Infrastructure & Support:
- Oversee the administration of HPC clusters, storage systems, schedulers (e.g., Slurm, PBS, LSF), and software stacks.
- Collaborate with vendors on hardware/software upgrades, patches, and troubleshooting.
- Ensure compliance with security policies, data protection regulations, and best practices.
- Implement automation and optimization strategies for HPC resource utilization.
- Support large-scale simulation, AI/ML workloads, and parallel computing applications.
Technical Skills:
- Strong knowledge of HPC architecture, parallel computing, and distributed systems.
- Experience managing HPC clusters, storage, and job schedulers (e.g., Slurm, PBS, LSF).
- Strong hands-on experience with large storage systems (Scality, DDN, Lustre, GPFS, and WEKA FS).
- Hands-on experience with networking and high-speed interconnects (InfiniBand, RDMA).
- Cluster managers: HPE HPCM, NVIDIA Bright Cluster Manager.
- Experience managing HPC clusters with GPUs.
- Operating systems: Linux (RHEL, SLES, Ubuntu, Rocky Linux).
- Server management: HPE OneView, iLO/BMC.
- Scripting languages: Bash, Python, PowerShell.
- Infrastructure monitoring: Nagios, OpsRamp, HPE PCM, NVIDIA BCM, SolarWinds.
- Virtualization: containers, Kubernetes, VMware, and OpenShift.
- In-depth knowledge of HPE hardware platforms: DL, BL, Synergy, Moonshot, Apollo servers, Cray servers and compute.
- Familiarity with Linux system administration, scripting (Bash, Python), and automation tools.
What you need to bring:
- Bachelor's/Master's degree in Computer Science, IT, or a related field.
- 7 years of experience in HPC, with at least 2 years in a leadership/managerial role.
- Industry certifications (Red Hat, NVIDIA, AWS HPC, etc.) are a plus.
Business Skills:
- Demonstrate strong written and verbal communication skills.
- Interact and collaborate across different technology teams within HPE.
- Work towards achieving HPE's vision for our customers.
- Affinity for, and a thorough understanding of, the support processes defined within HPE.
- Ability to work in a 24x7 environment in rotating shifts.
- Consistently exhibit a "Customer First and Customer Last" attitude.
- Ability to drive cases to closure and provide a case summary.
- Demonstrate a high level of technical and communication skills.
- Take responsibility for end-to-end problem ownership and resolution.
Good to have knowledge of:
- Databases: MySQL, MariaDB
- Web servers: apache2 on Linux, Apache Tomcat
- NVIDIA AI Enterprise suite
- Understanding of AI/ML workflows
(ref:hirist.tech)
Location: Bangalore, IN
Posted Date: 5/2/2025
Contact Information
Contact: Human Resources, HEWLETT PACKARD ENTERPRISE GLOBALSOFT PVT LTD