Cloud Incident Manager
Tungsten Automation
The Cloud Incident Manager role exists to lead incident response and resolution efforts for our cloud-delivered services. This role is critical to improving service reliability, minimizing downtime, and driving continuous improvement in the incident management processes operated by our service teams.
Key responsibilities
• Own and develop the incident management, major incident management and directly related service management processes
• Work with service team leads to ensure a consistent incident response across the Cloud Services organization
• Ensure appropriate and meaningful external communication to service subscribers through the Tungsten cloud status page
• Set standards for and support effective internal and external stakeholder communication during and in the aftermath of major incidents
• Ensure Post Incident Reviews (PIRs) are conducted and documented for major incidents to identify root cause and preventative measures
• Ensure PIR-identified preventative measures are linked to backlog tasks for Cloud Services and, where appropriate, that problem records are created to track changes by this and other teams
• Ensure hierarchical escalation and notification occur as necessary during incident resolution
• Support coordination of response efforts between Cloud Services teams and with teams outside of the Cloud Services organization
• Proactively identify opportunities to improve response times and reliability
• Work with service team leads and the Cloud ITSM System Administrator to refine and improve supporting workflows in our ITSM tools, such as Atlassian Jira
• Work with the Sr Director Cloud Service Delivery to produce high-quality, insightful, metrics-based reporting for service reviews and other purposes
Required Experience
Qualifications & Skills
• Proven experience at an international provider of SaaS offerings
• Proven experience of working with ITIL-derived service management processes
• Proven experience of managing and improving incident management processes
• Proven experience of working with and developing ITSM tools, ideally Atlassian Jira
• Expert at incident and major incident workflows
• Expert at root cause analysis and post-incident review/post-mortem exercises
• Good understanding of security incident and compliance/control frameworks
• ITIL certification to advanced levels highly desirable
• Excellent written and oral communication skills, including English language proficiency