NOC Team Lead
StackPath is a cloud platform built at the internet’s edge, providing infrastructure and services physically closer to the source or destination of data than hyperscale cloud service providers. StackPath edge compute (including Virtual Machines and Containers) and edge applications (including CDN and WAF) are strategically located in the world’s most densely populated areas and united by a secure private network backbone and a single management system. Customers ranging from Fortune 50 enterprises to one-person startups trust StackPath to give their latency-sensitive workloads and applications the speed, security, and efficiency they require. For more information, visit stackpath.com and follow StackPath at www.fb.com/stackpathllc and www.twitter.com/stackpath.
About the Role
The NOC Team Lead will provide leadership for a distributed team of NOC Engineers responsible for monitoring the global cloud infrastructure behind several StackPath platforms. The NOC Team has end-to-end ownership of incident management and response: performing initial troubleshooting of production issues, escalating problems, performing testing, and communicating issues internally during and after incident resolution.
This role will report to our: Platform Operations Manager
Essential Duties and Responsibilities
- Provide team leadership for the 24x7 NOC Team spanning multiple continents and time zones.
- Work closely with Shift Supervisors and Platform Operations Management to ensure adherence to standard policies and procedures for alert handling and incident response.
- Ensure training on and communication of policies and procedures for the 24x7 NOC Team.
- Ensure all NOC team members have access to the tools needed to perform their duties and are trained on their use.
- Ensure shift handovers occur reliably and effectively according to industry best practices.
- Monitor and report on team KPIs and provide guidance to NOC team members to improve performance.
- Evaluate NOC procedures and policies on a continuous basis and provide recommendations for improvement to Platform Operations Management.
- Maintain documentation of technical procedures and playbooks used by the NOC.
- Track escalations to other teams and work with other team leads / managers to drive escalations to resolution.
- Track follow-up action items relating to platform incidents and work with other teams to drive these items to resolution.
- Participate in and organize projects involving the NOC team.
Desired Skills and Experience
- ITIL 4 Foundation certification or higher.
- 3+ years working in a 24x7x365 high-availability environment.
- 5+ years working with Linux in a distributed server environment.
- 3+ years of experience in a team leadership role.
- Some networking experience preferred.
- Exceptional written and verbal communication skills.
- Exceptional troubleshooting and problem-solving skills.
- Demonstrated ability to work remotely with a team.
- Demonstrated punctuality and reliability.
- Ability to work nights/weekends/holidays as needed and to participate in an on-call rotation for incident response.
This job description is not intended to be all-inclusive.
StackPath is an Equal Opportunity Employer. EOE/AA M/F/D/V
If your experience and qualifications match our current needs, a member of our human resources team will contact you. We look forward to hearing from you.