Top 10 Questions About Uptime Monitoring
Monitoring for uptime is becoming increasingly necessary as SaaS and Always-On services integrate deeper with our professional and personal lives. When bottom lines and infrastructure requirements are tied so closely to 24/7 accessibility, making sure your websites are UP becomes priority one.
We’ve scoured our support tickets, talked to our users, and kept an ear to the ground to compile the top 10 questions surrounding uptime monitoring and break down the answers.
1. What is uptime monitoring?
What kind of website monitoring provider would we be if we couldn’t answer this one? Monitoring for uptime means you’re monitoring how many minutes a year your business is online. But that’s the surface answer. When we look at uptime, we want a few key performance metrics:
- Total uptime percentages
- Target uptime SLA %
- Target SLA response times and overall response times
Essentially, we like to monitor the actual state of availability and accessibility of your site and combine those with metrics on your target performance. To do this effectively we look at performance, overall health (with easy-to-use tools like our Monitor Entire Site feature), and downtime – when an issue does arise, why did it happen and how can it be resolved and prevented in the future?
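To make the first two metrics concrete, here is a minimal sketch of how an uptime percentage can be derived from recorded outage minutes and compared against a target. The function names and the 99.9% target are assumptions chosen for illustration; this is not how Uptime.com computes its reports.

```python
def uptime_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Share of the period the site was reachable, as a percentage."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def meets_sla(actual_percent: float, target_percent: float) -> bool:
    """True when measured uptime is at or above the SLA target."""
    return actual_percent >= target_percent


# Example: a 30-day month (43,200 minutes) with 50 minutes of outages.
month_minutes = 30 * 24 * 60
actual = uptime_percent(month_minutes, 50)
print(f"uptime: {actual:.3f}%  meets 99.9% target: {meets_sla(actual, 99.9)}")
```

In this example, 50 minutes of downtime in a 30-day month works out to roughly 99.884% uptime, which falls just short of a 99.9% target.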
2. What is a check?
To answer your questions about what to do when you experience downtime, we first have to answer how we discover downtime in the first place. To monitor your website, Uptime.com has created a suite of basic and advanced checks.
A check tests for a specific response or function. If there is an error in the function, or the rule is broken, the check will fail. Most Uptime.com checks can be configured at customizable intervals (typically from 1-60 minutes) and with adjusted sensitivities to meet your individual website monitoring requirements.
By creating a variety of check types, Uptime.com is able to efficiently and continuously monitor every facet of your website’s health and performance.
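If it helps to picture it, a check can be thought of as a named rule that runs on an interval and fails when the rule is broken. The sketch below is a hypothetical model, not Uptime.com's implementation: the `Check` class, its fields, and the strict 1 to 60 minute validation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    """A named rule evaluated on a fixed interval (hypothetical model)."""
    name: str
    rule: Callable[[], bool]      # returns True when the expected response is seen
    interval_minutes: int = 5     # assumed default for this sketch

    def __post_init__(self) -> None:
        # Assumption: enforce the 1-60 minute range mentioned in the text.
        if not 1 <= self.interval_minutes <= 60:
            raise ValueError("interval_minutes must be between 1 and 60")

    def run(self) -> str:
        """Return 'UP' if the rule holds, 'DOWN' if it is broken or raises."""
        try:
            return "UP" if self.rule() else "DOWN"
        except Exception:
            # An error while running the function counts as a failed check.
            return "DOWN"


# Example rules standing in for real probes.
print(Check("homepage", rule=lambda: True).run())
print(Check("checkout", rule=lambda: 1 / 0 == 0, interval_minutes=1).run())
```

The second example raises an error inside the rule, so the check reports DOWN rather than crashing.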
3. How do I know my site is down?
Ideally, you don’t want the answer to be “my users are telling me.” What’s the point of monitoring if you’re not the first to know when an incident occurs? The easy answer is that you’ll know your site is down when you get an alert. But that too is a surface-level answer.
Remember when we said we had a whole suite of check types? Some checks, like our HTTP(S) check, monitor for status 200 OK. This is the “is my site alive?” check type. Other checks, like Real User Monitoring (RUM), monitor performance or a certain piece of your site or infrastructure, but in terms of general UP and DOWN, sometimes simple is better. Starting with HTTP(S) is a good way to ensure you’re getting alerted when site accessibility is the issue.
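For a feel of what an “is my site alive?” probe does, here is a rough sketch using only Python's standard library. It is an illustration of the idea of checking for status 200, not the Uptime.com HTTP(S) check itself; `fetch_status`, `is_alive`, and the 10-second timeout are names and values assumed for the example.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code          # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None              # DNS failure, refused connection, timeout, etc.


def is_alive(url: str, fetch: Callable[[str], Optional[int]] = fetch_status) -> bool:
    """True only when the site answers with status 200."""
    return fetch(url) == 200
```

Calling `is_alive("https://example.com")` would perform a real request. The `fetch` parameter exists so the decision logic can be exercised with a stand-in function instead of the network.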
4. I can access my website, why am I getting an alert?
Like we said, our suite of checks has the capacity to monitor multiple aspects of your site and its overarching function. A good way to illustrate this is with our Transaction Check. The Uptime.com Transaction Check mimics user action for a particular pathway, like a shopping cart checkout, login form, or subscription process. These transaction pathways are often linked to a third-party tool, or function separately from the page they live on.
If the transaction process you are monitoring fails, it doesn’t necessarily mean that your whole website is down. This is true for other checks as well. Perhaps you’re getting an alert for an HTTP(S) check (uh oh). It’s possible that your site is not accessible from some probe locations, but is still accessible from others, including yours.
5. When and how will I get alerts?
Uptime.com supports a wide range of integrations from Slack to PagerDuty to help you get alerts to the right place and in front of the right people. We also send alerts via email and support SMS and Phone alerts.
To answer the rest of the question we have to talk about sensitivity, timeout, and alert structure. Uptime.com will alert you when you want to be alerted because you control the sensitivity settings for your check.
Sensitivity is the number of probe server locations that need to go down for your check to fail. You can also specify the number of retries before failure, and control the timeout threshold.
We have best practices in place to prevent false alerts (we’re not in the habit of crying wolf), but ultimately this is in your control.
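The interplay between sensitivity and retries can be sketched in a few lines. This is a simplified model under stated assumptions: the function names and location labels are hypothetical, timeouts are not modeled, and the real decision logic may differ.

```python
from typing import Callable, Dict


def probe_with_retries(probe: Callable[[], bool], retries: int) -> bool:
    """Run probe up to 1 + retries times; True as soon as one attempt succeeds."""
    for _ in range(1 + retries):
        if probe():
            return True
    return False


def check_failed(results_by_location: Dict[str, bool], sensitivity: int) -> bool:
    """The check fails once at least `sensitivity` locations report DOWN."""
    down = sum(1 for is_up in results_by_location.values() if not is_up)
    return down >= sensitivity


# Hypothetical results: True means the location could reach the site.
results = {"us-east": True, "eu-west": False, "ap-south": True}
print(check_failed(results, sensitivity=1))  # one location down is enough
print(check_failed(results, sensitivity=2))  # needs two locations down
```

With one of three locations down, the check fails at a sensitivity of 1 but not at 2, which is why a higher sensitivity produces fewer alerts for localized blips.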
If a check goes down in the woods, does anyone get alerted?
Uptime.com automatically adds your default contact to each check you create to notify you of alerts. You have control over the contacts linked to each check, as well as the ability to create a schedule-based alerting structure among all of your available contacts. Plus, you can add escalations and integrations with third-party platforms. This is handy when your default contact isn’t an SRE. If your check is still down after X amount of time, you can start to notify the higher-ups.
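The “still down after X amount of time” idea boils down to a list of delays and contacts. Here is a minimal sketch; the delays, contact labels, and function name are illustrative assumptions, not Uptime.com settings.

```python
from typing import List, Tuple

# (minutes the check has been down, contact to notify); values are illustrative.
ESCALATION_STEPS: List[Tuple[int, str]] = [
    (0, "on-call engineer"),
    (15, "team lead"),
    (60, "engineering director"),
]


def contacts_to_notify(
    minutes_down: int,
    steps: List[Tuple[int, str]] = ESCALATION_STEPS,
) -> List[str]:
    """Return every contact whose escalation delay has already elapsed."""
    return [contact for delay, contact in steps if minutes_down >= delay]


print(contacts_to_notify(20))
```

At 20 minutes of downtime, the first two tiers have been notified and the third has not been reached yet.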
6. Can I monitor my site from different locations?
You sure can. We’re all about monitoring anywhere you need. Uptime.com has a growing list of international probe server locations covering over 30 countries and 6 continents. What we want to focus on in this answer is our 360° monitoring approach.
Public websites have backends that support them, and companies have sites nested behind firewalls that support their internal processes. These are frequently accessed by employees, making Private Location Monitoring not only a security requirement but a monitoring necessity, especially when you’re looking for the same checks, features, and support in your private location monitoring that you expect for your public-facing sites.
7. What does 99.9% uptime mean?
Otherwise known as the Three Nines, 99.9% uptime is a common and realistically attainable uptime target. Any company that guarantees 100% uptime is lying.
So what does 99.9% break down to in terms of time? It allows about 8.77 hours of downtime per year, or roughly 43.8 minutes per month. The next question is how to present that time to your clients and stakeholders.
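The arithmetic behind those figures is simple enough to check yourself. The sketch below assumes an average year of 365.25 days and twelve equal months; the function name is made up for the example.

```python
HOURS_PER_YEAR = 365.25 * 24  # average year, including leap years


def downtime_budget(uptime_percent: float) -> dict:
    """Allowed downtime for a given uptime percentage."""
    down_fraction = (100.0 - uptime_percent) / 100.0
    hours_per_year = HOURS_PER_YEAR * down_fraction
    return {
        "hours_per_year": hours_per_year,
        "minutes_per_month": hours_per_year * 60 / 12,
    }


for target in (99.0, 99.5, 99.9):
    budget = downtime_budget(target)
    print(f"{target}%: {budget['hours_per_year']:.2f} h/year, "
          f"{budget['minutes_per_month']:.2f} min/month")
```

For 99.9%, this gives about 8.77 hours per year and about 43.8 minutes per month. Each step down in the target multiplies the allowed downtime accordingly.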
8. How do I report on my uptime?
You report on uptime with customizable reporting that pulls data from your Uptime.com checks on a regular basis and sends it at scheduled intervals to your designated users and stakeholders.
Uptime.com reporting is about the freedom to report how and to whom you want, and being able to adjust the accuracy of the data and metrics you are reporting on. We take a four-pronged approach:
- Define what’s most important to you: control and customization are your call. Uptime.com SLA reports can be customized to show response time metrics, total outages, targets, and uptime for your chosen date range and checks.
- Decide who sees what, when, and how: Schedule reports that showcase the success of your critical sites, checks, and systems, and deliver them to the people who need them – when they need them – via automated report links or API.
- Drill (even) deeper with Real User Monitoring Reports that break down performance by first load time, browser and device type, user location and more to understand the impact of your traffic on your resources and gain insight into your user experience.
- Communicate downtime within the same tool you trust to monitor it in the first place. Reporting on uptime should stem from your monitoring and be adaptable to the data you need to convey.
9. How do I avoid downtime?
This is probably the most useful question and there are a few answers.
Start with a checkup. Don’t just monitor one thing. Start with a general domain health check via our Monitor Entire Site tool to generate a series of useful checks that monitor everything from status 200 OK to whether or not you’re on a blacklist.
Downtime will happen. Like we said, 100% uptime is a myth. Downtime will happen, so make your strategy about reducing it rather than trying to eliminate it. Making sure you have a strong alerting and escalation structure helps here, as does utilizing features like our notes field in check configuration to keep a running log of previous incident response.
Learn from the past. Use tools like our Real-Time Analysis feature, Real-Time Check Status, and Alert Log to drill down and discover root cause so you can stabilize your infrastructure and prevent future alerts of the same type.
10. How can uptime monitoring help me in an outage?
Transparency builds trust. Uptime.com has a lot of features in play that help you in an outage. It starts with the data we send you when you receive an alert. We include any notes or tags associated with your checks, plus links to real-time analysis and alert data.
Aside from drilling down to the root cause, we also offer our top-notch human support team to help you understand and resolve the alerts you are receiving.
One of our favorite ways of helping you in an outage (aside from resolution) is incident management. Downtime can be an opportunity to build trust and the easiest way to do this is with our Status Page solution. Status Pages provide a platform for you to connect with your end users and update them in real-time on the status of your outage and the steps you are taking to resolve it.
Status Pages can also display metrics and historic data on prior incidents, as well as convey upcoming maintenance and scheduled updates that may affect your accessibility.
Monitoring is a complex undertaking that covers everything from how much time your site is UP per year, to how you define what “UP” is, how you report on it, and how you communicate your uptime to build relationships with your users.
Curious if our monitoring solution will bring you peace of mind and help you manage downtime? Check out our free, no-obligation trial and test it for yourself!