Learning from the AWS Outage: Internal Monitoring Alone Isn’t Enough 

If you have set up your own monitoring services with Amazon CloudWatch, Azure Monitor or another internal tool, we suggest you consider looking beyond the horizon. 

These services often provide internal web monitoring only. Some of them validate HTTP availability from locations outside their networks, but HTTP checks alone won’t give you a 360° view into the state of your services.

 

Internal Web Monitoring: The Message of Dec. 7th, 2021

The following is taken from AWS’ own Summary of the AWS Service Event in the Northern Virginia (US-EAST-1) Region:

To explain this event, we need to share a little about the internals of the AWS network. While the majority of AWS services and all customer applications run within the main AWS network, AWS makes use of an internal network to host foundational services including monitoring, internal DNS, authorization services, and parts of the EC2 control plane.

We’re not going to break apart the specifics of the December 7th AWS outage, but we will sum it up by saying that if you’re hosting your own internal monitoring (as Amazon does on their own internal network), it could be an expensive lesson in “putting all your eggs in one basket.” 

After all, what good is performance monitoring if your internal system (which houses your monitoring) fails and leaves you searching through logs line by line looking for the latencies in your own system?

 

Internal Vs. External Monitoring

If you want to take a deep dive into the differences between internal and external monitoring, you can do that here.

There are two parts to internal monitoring. The first, internal website monitoring, checks servers behind a load balancer or firewall. If you’re using cloud infrastructure like Amazon Web Services (AWS) or Microsoft Azure, the monitoring provided may still only be checking from behind that cloud provider’s firewall. Internal web monitoring is critical, but it shouldn’t be your only coverage.

Internal site monitoring is a powerful tool for gaining insights into performance and diagnostics, as well as for capacity planning and upgrading. This information, though useful, will not give you metrics on the end user’s experience.

The other part is internal systems monitoring, which checks for:

  • CPU load
  • Memory usage
  • Processes
  • Disk space
  • Traffic or bandwidth consumption

 

If you’re using your own hardware, these stats alert you to when memory is low or servers are not performing as they should. This information is useful to plan for equipment replacement, or to justify increasing capacity with additional equipment and upgrades – and is not always supported by a third-party provider.
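As a rough illustration of what an internal systems check looks at, here is a minimal Python sketch using only the standard library. It covers just two items from the list above (disk space and CPU load). The function names and threshold values are hypothetical, chosen for illustration, and are not taken from any particular monitoring product.

```python
import os
import shutil


def disk_used_percent(path="/"):
    """Return the percentage of disk space used on the volume holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def collect_metrics(path="/"):
    """Gather a small snapshot of host metrics (hypothetical metric names)."""
    metrics = {"disk_used_percent": disk_used_percent(path)}
    # os.getloadavg() exists on Unix-like systems only, so guard the call.
    if hasattr(os, "getloadavg"):
        try:
            metrics["load_1m"] = os.getloadavg()[0]
        except OSError:
            pass  # load average could not be obtained on this host
    return metrics


def breached(metrics, thresholds):
    """Return the sorted names of metrics whose value exceeds its threshold."""
    return sorted(
        name
        for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    )
```

Calling `breached(collect_metrics(), {"disk_used_percent": 90.0, "load_1m": 4.0})` returns the names of any metrics over their limits. Note that a check like this runs on the host it measures, so it stops reporting exactly when that host (or the network around it) goes down.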

When it comes to infrastructure and website monitoring, there are great benefits to building redundancy into your monitoring and hedging your bets.

 

Internal Monitoring Gaps

If your internal monitoring sees no issues inside your network, downtime that only affects users on the outside can continue undetected until one of your users is nice enough to alert you to a problem. Monitoring externally supports domain health and lets you verify your sites, page elements, and payment processes (along with other user pathways), and collect real-time data on the user experience.

As we mentioned at the beginning, some internal monitoring services do include HTTP checks from external locations, but these checks often only tell you whether a site returns 200 OK. That partial information isn’t enough: your site may be reachable without being fully loaded or fully functional.
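To make the difference concrete, here is a minimal Python sketch of a check that looks past the status code and also confirms that an expected string appears in the page body. It uses only the standard library. The function names and the "up" / "degraded" / "down" labels are hypothetical, and this is not a description of how any specific monitoring product implements its checks.

```python
import urllib.error
import urllib.request


def evaluate_response(status, body, expected_text):
    """Classify a response using both the status code and the page content."""
    if status != 200:
        return "down"
    if expected_text not in body:
        # The server answered 200 OK, but the content we rely on is missing.
        return "degraded"
    return "up"


def check_url(url, expected_text, timeout=10.0):
    """Fetch `url` and classify the result; any network failure counts as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate_response(resp.status, body, expected_text)
    except (urllib.error.URLError, OSError):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx replies,
        # and URLError/OSError for DNS, connection, and timeout problems.
        return "down"
```

A status-only check would report the "degraded" case as healthy, which is exactly the gap described above: a page that returns 200 OK but renders an error message or an empty template.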

Pro Tip: Uptime.com’s HTTP(S) Check just got a major upgrade. We have our tried and true optional settings that let you search for a string to expect or send HTTP headers (so you can make sure your site isn’t just up but properly loaded). In addition, we based this upgrade on cURL and added SSL/TLS verification, support for chunked content, and a few other handy tools. Check it out.

 

Why Use a Monitoring Provider with External Capabilities?

To detect problems outside your network, external or third-party monitoring solutions test availability from a variety of probe server locations at configured intervals. This gives you geo-specific information on your performance and on user experience, with key metrics like uptime (of course) and response time.
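As a sketch of how results from several probe locations turn into those two metrics, the Python below groups check results by location and computes an uptime percentage and an average response time for each. The `ProbeResult` type, its field names, and the location labels in the usage note are hypothetical, used only to illustrate the arithmetic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ProbeResult:
    location: str       # where the check ran from (hypothetical label)
    ok: bool            # whether the check succeeded
    response_ms: float  # response time in milliseconds (ignored when ok is False)


def summarize(results):
    """Return uptime percentage and average response time per probe location."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result.location].append(result)

    summary = {}
    for location, items in grouped.items():
        successes = [r for r in items if r.ok]
        uptime = 100.0 * len(successes) / len(items)
        # Average only over successful checks; None if every check failed.
        avg_ms = (
            sum(r.response_ms for r in successes) / len(successes)
            if successes
            else None
        )
        summary[location] = {"uptime_percent": uptime, "avg_response_ms": avg_ms}
    return summary
```

For example, two checks from a location labeled "us-east" with one failure and one 100 ms success summarize to 50% uptime and a 100 ms average for that location, while other locations are reported separately. Keeping the numbers per location is what makes a regional problem visible instead of averaging it away.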

External monitoring is essentially a layer of protection around your revenue-generating sites and services. Downtime costs money, and you need to know: Are all systems working? Are they working where you need them? Are they fast enough?

From SSL certificate monitoring to Blacklist alerts and DNS, external monitoring provides peace of mind – not only that your site and all it contains is UP – but that when it goes DOWN, you get accurate alerting sent to the right people. 
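To show what an SSL certificate check boils down to, here is a minimal Python sketch that reads a certificate’s expiry date and converts it into days remaining. It relies on the standard library’s `ssl` and `socket` modules. The function names are hypothetical, and the alert threshold you apply to the result is up to you.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Convert a certificate 'notAfter' string into days remaining.

    `not_after` uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400.0


def fetch_not_after(hostname, port=443, timeout=10.0):
    """Connect to `hostname` over TLS and return its certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

A check could call `days_until_expiry(fetch_not_after("example.com"))` and alert when the result drops below a chosen number of days. Because the default context verifies the certificate, an already expired or otherwise invalid certificate raises an `ssl.SSLError` during the handshake, which a real check would want to catch and report as a failure.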

Uptime.com’s external monitoring checks test each facet of your front-facing infrastructure and convert the data into clear reporting for your clients and stakeholders, while Uptime reports and Real User Monitoring provide the detail you need to improve performance. Reporting completes the monitoring circle of Monitor, Alert, Report, Improve.

 

The Whole Truth

Using an internal web monitoring service makes sense. Secure sites and networks need monitoring, but keeping all of your monitoring internal could hurt you in the long run, or in the event of an outage, when it cuts you off at the knees and shrinks your view.

Visibility is everything. Whether you’re an e-commerce company, a university, an IT company, or a large-scale enterprise managing services for several clients, don’t overlook the power of redundancies and failsafes in terms of watching over your infrastructure.

Internal monitoring will tell you if your internal sites and assets are working properly. External monitoring will tell you if your users are able to properly use your website. Uptime.com will tell you if your sites are up, if your transaction pathways are healthy, and if your certificates are current, and it will give you metrics on real user experience.

 

You need both internal and external monitoring, and you might find that you need us too. Give us a try with our free trial, no credit card required.

Didn’t find what you needed here? Reach out to us directly with your questions.


Emily Blitstein is a technical content writer for Uptime.com. With a background in writing and editing, Emily is committed to delivering informative and relatable content to the Uptime.com user community.
