Escalations and Maintenance Windows Are Critical to Downtime Response
Uptime.com includes several advanced check options that give organizations the flexibility they need to build a downtime response plan. Planned downtime for patches and updates doesn’t typically amount to a severe downtime event, and maintenance windows keep it from triggering unnecessary alerts. With escalations, teams have an automated alert system that contacts designated senior-level personnel with relevant technical data.
Configuring checks with escalations, maintenance windows, or both is fast and simple, and it is a critical part of any downtime response plan.
What Should Be in a Response Plan?
Speaking of response plans, do you have one?
How your organization responds to downtime depends on a variety of factors, including the size and location of your team.
But a basic response plan should include these four factors:
Monitoring crucial infrastructure includes pinging from external locations, and may utilize transaction or API monitoring. Synthetic monitoring systems provide a wealth of data on user experience. Monitoring also covers any existing internal monitoring software and tools your team uses in the process. For example, even if you use Uptime.com, your IT team may have alerts coming through integrations with popular software such as Slack or PagerDuty.
Once a problem is discovered, designated first responders use a predetermined process to address the issue. Uptime.com checks let you attach notes to the alerts for a particular check. You can find them on the Advanced tab when you create or edit a check. Use these notes to provide a checklist or instructions for next steps.
These first responders verify that the issue isn’t a false positive caused by incorrect check configuration or by work being performed on the system. For example, if your team deployed a patch, they could include instructions in the notes for what to do if the patch causes a site to crash.
Ideally, first responders are system admins with strong diagnostic skills who can use the information provided to immediately pinpoint the issue. If the problem is more serious, this is where escalation comes in.
The first responder or a senior-level staff member corrects the problem. Team members then evaluate what went wrong to determine whether anything can be done to prevent similar issues in the future.
Your response plan should be revised on a regular basis to ensure all possible scenarios are addressed.
In a perfect world, the first person to receive the alert is able to fix the issue. But this isn’t always the case.
What if diagnosis uncovers multiple issues? Or what if the downtime occurs on a portion of infrastructure the first responder has no permission to access?
That’s why Uptime.com includes escalations.
If a check remains down for an extended period of time, you can designate an additional contact to receive an automated escalation that includes alert data.
To create an escalation, edit your check and click on the Escalations tab. Add a time period for your escalation, choose the appropriate contact and save. It’s as simple as that.
You can create multiple escalations for each check. Before assigning contacts to an escalation, check the Contact screen to see whether they are always available. Contacts with an on-call schedule will not receive alerts that come in outside of their scheduled hours.
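To make the interaction between escalation delays and on-call schedules concrete, here is a minimal Python sketch of the behavior described above. It is an illustrative model only, not Uptime.com's implementation or API: the Contact and Escalation classes and the contacts_to_escalate function are hypothetical names, and the sketch assumes an on-call schedule is a single daily time range.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import List, Optional, Tuple


@dataclass
class Contact:
    """Hypothetical contact; on_call=None means always available."""
    name: str
    on_call: Optional[Tuple[time, time]] = None  # assumed daily (start, end) hours

    def is_available(self, at: datetime) -> bool:
        if self.on_call is None:
            return True
        start, end = self.on_call
        now = at.time()
        if start <= end:
            return start <= now < end
        # The schedule wraps past midnight, e.g. 22:00 to 06:00.
        return now >= start or now < end


@dataclass
class Escalation:
    """Hypothetical escalation: notify `contact` once the check is down for `after`."""
    after: timedelta
    contact: Contact


def contacts_to_escalate(
    escalations: List[Escalation], down_since: datetime, now: datetime
) -> List[str]:
    """Return names of contacts whose delay has elapsed and who are on call now."""
    elapsed = now - down_since
    return [
        e.contact.name
        for e in escalations
        if elapsed >= e.after and e.contact.is_available(now)
    ]
```

In this model, a contact on a 22:00 to 06:00 schedule is skipped for an escalation that comes due at midday, which is why it is worth confirming availability before assigning a contact to an escalation.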
Setting Maintenance Windows
As one of our customers stated:
“I don’t want to wake up in the middle of the night with a notification that our website is down when everything is working fine.”
Planned downtime and emergency maintenance are common causes of downtime alerts that require no additional intervention.
To avoid interrupting someone’s dinner for a false alarm, create maintenance windows for checks monitoring infrastructure with routine patches or maintenance. Any alerts during scheduled maintenance are suppressed until the period ends.
You can add maintenance windows by editing a check and clicking on the Maintenance tab.
But what about emergency situations or unplanned maintenance?
Uptime.com allows you to turn off alerts temporarily so you can take care of the work: select Under Maintenance Now on the check. Just remember to go back in and turn alerts back on when you’re finished.
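The suppression rules described in this section can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Uptime.com's actual logic: MaintenanceWindow, should_alert, and the under_maintenance_now flag are hypothetical names, and each window is assumed to be a single start and end time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class MaintenanceWindow:
    """Hypothetical scheduled window; alerts are suppressed from start up to end."""
    start: datetime
    end: datetime

    def covers(self, moment: datetime) -> bool:
        return self.start <= moment < self.end


def should_alert(
    check_is_down: bool,
    moment: datetime,
    windows: List[MaintenanceWindow],
    under_maintenance_now: bool = False,
) -> bool:
    """Alert only when the check is down and no maintenance rule applies."""
    if not check_is_down:
        return False
    if under_maintenance_now:
        # Manual toggle for emergency or unplanned maintenance.
        return False
    # Scheduled windows suppress alerts until the period ends.
    return not any(w.covers(moment) for w in windows)
```

Note that the manual flag in this sketch never expires on its own, which mirrors the reminder above: alerts stay off until someone turns them back on.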
Avoid False Positives and Unresolved Downtime
Uptime.com provides a variety of advanced features to account for problems that often occur with basic check configurations.
By creating a plan to account for false positives and problems that can’t be resolved quickly, you’ll be better prepared for real-world situations.
Your team won’t be unnecessarily interrupted, and the right people will receive alerts at the right time.