A Proactive Approach To Holiday Season Monitoring
Big sales make up a huge chunk of annual eCommerce business, with shoppers spending more than $10 billion during Black Friday 2020 alone. The right holiday can be a big deal for your operations. However, with those windfalls come failures that can cripple your devops pipeline.
In many ways, waves of traffic are what you’ve been building for, but sudden bursts are difficult to test for and anticipate. The situation changes with the tides.
The proactive approach to holiday uptime monitoring is all about using the off-peak time you have to gather the most valuable and actionable data you can. Here are some tips about how monitoring can help.
Response Time and Downtime are Linked
There are two types of response times we think are critical here: server and site response time, and time to respond in an outage.
With big sales come even bigger spikes in traffic, and potentially multiple outages. If your team is not confident and swift in its response, those outages can cost anywhere from tens of thousands to millions.
But downtime is never a static event, and we rarely wait for it to resolve itself. Outage diagnosis and response require delegation, analysis, and ultimately action.
Alerting Drives Incident Resolution
When downtime strikes, your team needs time to investigate the problem and a clear chain of command before any resolution can happen under pressure. If well executed, this process begins at the moment the alert is issued.
This is where Uptime.com performance monitoring can offer a variety of tools to diagnose and escalate alerts so the right person gets the alert the moment downtime is detected.
Did you know?
Overreliance on a single channel for alerting can hurt! Who will claim the issue? How long has it been going on? Did anyone catch it in the zillion other threads you have going? Sending to just one person or location is a vulnerability. What if this person is unreachable? What if your team messenger goes down?
Multi-channel alerting, like multi-layered monitoring, creates visibility and accountability. Every instance of downtime is caught, logged, and if done right also acted upon. Want to see what a practical alert structure looks like? Say no more:
- Initial alerts logged in team Slack channel designated for site downtime
- Alerts sent via email initially to tier one team members
- SMS messages escalate alerts to service owners and admins
- Phone calls keep the sense of urgency and alert higher tiers
Using the right combination of escalations and on-call scheduling, you can delegate alerts down to the minute.
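To make the alert structure above concrete, here is a minimal sketch of an escalation policy in Python. It is not Uptime.com's API or configuration format; the `EscalationStep` class, the `steps_due` helper, and the 5 and 15 minute delays are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # minutes of downtime before this step fires
    channel: str        # how the alert is delivered
    audience: str       # who receives it


# Hypothetical policy mirroring the four tiers listed above.
# The delays are illustrative assumptions, not recommended values.
POLICY = [
    EscalationStep(0, "slack", "site downtime channel"),
    EscalationStep(0, "email", "tier one team members"),
    EscalationStep(5, "sms", "service owners and admins"),
    EscalationStep(15, "phone", "higher tiers"),
]


def steps_due(
    policy: List[EscalationStep],
    minutes_down: int,
    acknowledged_at: Optional[int] = None,
) -> List[EscalationStep]:
    """Return the steps that should have fired so far.

    Escalation stops at the minute someone acknowledges the incident.
    """
    cutoff = minutes_down
    if acknowledged_at is not None:
        cutoff = min(minutes_down, acknowledged_at)
    return [step for step in policy if step.after_minutes <= cutoff]
```

Under these assumptions, an outage that is acknowledged three minutes in never reaches the SMS or phone tiers, while one that sits unclaimed for 20 minutes has touched every channel.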
Speed and Location Matter for Holiday Shoppers
Uptime expectations go hand in hand with real-time deals.
Sales tactics are growing in sophistication, but customers are ahead of the game. They will go somewhere else when you can’t provide them the chance to buy right now.
In fact, we have seen conversion rates drop by as much as 4% for each additional second of load time after those critical first 5 seconds.
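As a rough worked example of that figure, the sketch below estimates a worst-case conversion rate for a given load time. It assumes the 4% is a relative drop applied linearly to each second past the first 5, which is our simplification; the function name and the 3% baseline in the usage note are hypothetical.

```python
def estimated_conversion_rate(
    baseline_rate: float,
    load_seconds: float,
    drop_per_second: float = 0.04,  # "as much as 4%" per extra second
    grace_seconds: float = 5.0,     # the critical first 5 seconds
) -> float:
    """Worst-case conversion rate under a linear, relative drop (an assumption)."""
    extra_seconds = max(0.0, load_seconds - grace_seconds)
    remaining_share = max(0.0, 1.0 - drop_per_second * extra_seconds)
    return baseline_rate * remaining_share
```

With a hypothetical 3% baseline, an 8-second load time is 3 seconds over the threshold, so the estimate falls to roughly 2.64%.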
Third-party services like your load balancer and content delivery will also see unprecedented traffic during the holiday sales season. Everyone is hitting that refresh button and navigating from store to store.
The scary fact is, everything on your end can look green while your customers around the globe are seeing a very different story.
Most of our checks, and indeed the backbone of any monitoring system built with Uptime.com, use probe server locations from six continents across the globe. We use multiple locations to catch regional outages with near pinpoint precision. We have seen it all, from bad patches bringing down entire systems to shark attacks on underwater fiber lines (yes, it’s happened), and we’re here to help when the “it can’t possibly happen” comes true.
Uptime.com offers continuous uptime monitoring from each probe server you choose at the intervals you designate. That’s third-party confirmation of whether the site is up or down from nearly anywhere in the world.
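The value of checking from several locations can be sketched in a few lines of Python. This is only an illustration of the idea, not how Uptime.com evaluates checks; the `classify_outage` function and the location names in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple


def classify_outage(results: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Classify a check from per-location results.

    `results` maps a probe location name to True when the check passed there.
    Returns a status label and the sorted list of failing locations.
    """
    if not results:
        raise ValueError("at least one probe result is required")
    failing = sorted(loc for loc, passed in results.items() if not passed)
    if not failing:
        return "up", failing
    if len(failing) == len(results):
        return "global outage", failing
    return "regional outage", failing
```

For example, `classify_outage({"US-East": True, "Germany": False, "Singapore": True})` reports a regional outage affecting only Germany, which is exactly the case a single internal health check would miss.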
Optimize and Test to Decrease Latency
Uptime.com checks use a baseline SLA number that you can completely configure to meet your needs and obligations. This baseline also drives both reporting and alerting with color-coded graphs that tell you at a glance just how close you are to an SLA violation. But one of the most useful tools for user experience monitoring is powered by data from the user sessions themselves.
Real User Monitoring, or RUM, takes performance-centric monitoring to the next logical step. Uptime.com uses anonymized data from real user sessions to build reports that visualize the user experience, broken down by key data points.
Page by page, each RUM report contains performance data that gives insight into your users’ experience. You can also combine RUM checks with other checks to learn even more. One of our favorite use cases involves Transaction monitoring, where you can test a goal funnel for website performance and downtime.
Holiday consumer seasons are critical times for devops and marketing to come together, as both can gain lots of valuable data from testing. For example, RUM checks can help inform both teams as to how a sales event performs. Devops can see how users were impacted during peak traffic spikes, while marketing gets clear data on performance and can weigh the balance of assets and calls to action.
Setting thresholds on RUM checks allows your team to track how often performance lags behind expectations with actual user data backing up those assertions.
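A threshold of this kind boils down to a simple ratio. The sketch below, which assumes you have a list of page load times in milliseconds from real sessions, computes the share that exceeded a threshold; the function name and sample numbers are hypothetical and not tied to any Uptime.com export format.

```python
from typing import Sequence


def share_over_threshold(load_times_ms: Sequence[float], threshold_ms: float) -> float:
    """Return the fraction of sessions whose load time exceeded the threshold."""
    if not load_times_ms:
        raise ValueError("at least one session is required")
    slow_sessions = sum(1 for t in load_times_ms if t > threshold_ms)
    return slow_sessions / len(load_times_ms)
```

Given load times of 1200, 2500, 3100, 900, and 4000 ms and a 3000 ms threshold, two of the five sessions are slow, so the function returns 0.4.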
Reporting on Your Efforts and Building the Future
The final component of any well-executed sales season is reporting on your efforts and performance. Easily overlooked as just one more task to add to the stack during an already ludicrously busy season, reporting can function as a proactive tool to optimize your site and talk to customers.
When something goes wrong, everyone starts talking. Expect that your management, staff, and users are all asking the big question: when’s it coming back up? They might argue on social media or in private messengers, but if you don’t control the narrative you will just look incompetent and untrustworthy.
During the holiday sales season, everything is polished to the nines – from window displays to online ads. Why not take some time leading into the season to brand your own status page? Incident management tells users what’s going on and what you are doing to fix the problem, and presentation matters.
It shouldn’t be surprising that the biggest brands in e-commerce devote time to designing a website and a status page that integrate seamlessly with one another.
Benefits of Setting a Baseline
Setting an SLA baseline helps establish the guidelines that determine your team’s technical and operational success. It can also highlight areas for improvement. Reporting should be a collaborative process between development, operations, marketing, and even your support and executive teams.
That’s why Uptime.com scheduled reporting can send reports directly to these third parties. You don’t need to invite users or jump through hoops. Simply select the time interval (daily, weekly, monthly, quarterly, or annually), and Uptime.com sends a detailed report.
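To put numbers on an SLA baseline, it helps to translate the percentage into minutes of allowed downtime. The sketch below is plain arithmetic, not an Uptime.com feature; the function names and the 30-day default period are assumptions for illustration.

```python
def downtime_budget_minutes(sla_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime allowed in the period before the SLA is violated."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


def budget_remaining_minutes(
    sla_percent: float, downtime_so_far_minutes: float, period_days: int = 30
) -> float:
    """Budget left after the downtime already recorded (negative means violated)."""
    return downtime_budget_minutes(sla_percent, period_days) - downtime_so_far_minutes
```

A 99.9% target over 30 days leaves about 43.2 minutes of downtime in total, so a single 30-minute outage during a sale uses up most of the month's budget.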
Important Questions to Ask
We would like to leave you with some questions you can ask to test if you are prepared for this year’s eCommerce season:
- How many incidents in the past few months were generated versus acted upon?
The answer to this question can help teach your team which systems are integral to uptime, and how failing systems impact one another.
- What was your average incident response time? What was the response time of your team between alert receipt and action?
These questions help frame the impact of your team’s efforts, and could assist in building playbooks for future incidents.
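Both questions can be answered from a simple incident log. The following sketch assumes each incident records when the alert went out and, if anyone acted, when they did; the `Incident` type and `summarize` function are hypothetical and do not reflect any particular export format.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional, Tuple


class Incident(NamedTuple):
    alerted_at: datetime
    acted_at: Optional[datetime]  # None when nobody acted on the alert


def summarize(incidents: List[Incident]) -> Tuple[float, Optional[timedelta]]:
    """Return (share of incidents acted upon, mean time from alert to action)."""
    acted = [i for i in incidents if i.acted_at is not None]
    acted_ratio = len(acted) / len(incidents) if incidents else 0.0
    if not acted:
        return acted_ratio, None
    total = sum((i.acted_at - i.alerted_at for i in acted), timedelta())
    return acted_ratio, total / len(acted)
```

If three incidents were generated and two were acted upon after 4 and 10 minutes, the summary shows two thirds acted upon with a mean response time of 7 minutes.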
Data from Uptime.com can tell the story of the season through visual reports that you can deliver automatically or build yourself. Start this process now, while your team is gearing up for record-breaking sales. You’ve likely built toward these moments all year long, so make sure you follow through with comprehensive monitoring and visual reporting to communicate what went wrong and how your team worked to fix it.