IT and DevOps Resources for COVID-19

We’re all wrestling with less-than-ideal circumstances during the COVID-19 pandemic. Whether you’re sheltering in place or simply practicing social distancing, it’s safe to say we’re all adjusting to a temporary new normal. One commonality is the need for connectivity. If infrastructure fails, business will screech to a halt and we will find ourselves in a new kind of mess altogether.

Perhaps now more than ever, we are worried about Service Level Objectives and keeping alert response tight.

The goal of this post is to provide some ideas on how DevOps can remain responsive to the needs of the organization and its users. We’ve also included some resources specifically to help healthcare IT workers.

Monitoring During a Crisis

Our business is uptime, and we’ve got your back. We have observed multiple sectors, from health services to eCommerce, seeing massive surges in user requests. Downtime during this event is a headache for everyone, so let’s discuss what monitoring for this crisis looks like. Then we’ll explore some methods you and your team can use to stay sane and efficient.

Structuring Downtime Alerts and Escalations

For most of us, monitoring will be business as usual, but we may need some minor adjustments to accommodate the new circumstances. Some of us are already receiving alerts, so one of the first ways to optimize your monitoring system is to audit your contacts.

Balance time to respond with tiers of response. Equip lower tiers with the tools they need during an outage.

First: are alerts being sent to the proper contact? How have teams changed since you’ve configured your account? Are all the necessary team members scheduled with on-call hours, and do you have escalations configured?

Right now, we’re all in somewhat of a crisis mode, so set escalations on your most important customer-facing infrastructure if you have not already done so. You can worry about your entire account as you get settled. Escalation time limits depend on what makes sense for your business and your end users. Our best advice is to escalate with more visibility. Involve more team members and require ownership over problems quickly. Designate a temporary admin if you need someone to filter the noise. The point is to waste the least amount of time finding someone to fix the problem.
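To make "escalate with more visibility" concrete, here is a minimal Python sketch of a tiered escalation policy. It is not Uptime.com's configuration format; the tier timings and contact names are hypothetical placeholders, and the only idea it illustrates is that each tier adds people to the alert rather than replacing the ones already notified.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EscalationTier:
    after_minutes: int          # minutes an alert may sit unacknowledged before this tier is notified
    contacts: Tuple[str, ...]   # hypothetical contact names


# Hypothetical policy: timings and names are assumptions, tune them to your business.
POLICY = (
    EscalationTier(0, ("on-call-engineer",)),
    EscalationTier(10, ("team-lead",)),
    EscalationTier(20, ("temporary-admin", "engineering-manager")),
)


def contacts_to_notify(minutes_unacknowledged: int, policy=POLICY) -> List[str]:
    """Return everyone who should have been alerted by now.

    Tiers are cumulative: later tiers add visibility instead of
    handing the problem off and dropping the earlier contacts.
    """
    notified: List[str] = []
    for tier in sorted(policy, key=lambda t: t.after_minutes):
        if minutes_unacknowledged >= tier.after_minutes:
            notified.extend(tier.contacts)
    return notified
```

Calling contacts_to_notify(15) with this policy returns the on-call engineer and the team lead. Shortening the gaps between tiers is the simplest way to tighten response on your most important customer-facing checks.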

Monitoring Fundamentals for Crisis Management

Every business is experiencing its own unique kind of shift. So let’s first make sure our fundamentals are covered:

  • Home pages
  • Login pages
  • Major landing pages
  • Product pages
  • Infrastructure

Critical API endpoints should also have monitoring, as users shift to various ways of using your system. We all have more time on our hands, along with a trillion tasks to occupy that time. This dichotomy could play out in several ways: new ideas could emerge, general usage could increase alongside demand for services, or users may just fall back on temporary measures as a patchwork.
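As a rough illustration of covering these fundamentals, the sketch below checks a set of named URLs and reports which ones respond successfully. It uses only the Python standard library; the target names and example.com URLs are placeholders, and a hosted monitoring service does far more than this (multiple probe locations, retries, alert routing).

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code, or None if the request failed at the network level."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def check_targets(targets, fetch=fetch_status):
    """Map each target name to True (up) or False (down)."""
    report = {}
    for name, url in targets.items():
        status = fetch(url)
        report[name] = status is not None and 200 <= status < 400
    return report


# Placeholder targets mirroring the list above; swap in your own URLs.
TARGETS = {
    "home": "https://example.com/",
    "login": "https://example.com/login",
    "api": "https://example.com/api/health",
}
```

Running check_targets(TARGETS) on a schedule gives a crude up/down report. The fetch parameter exists so the logic can be exercised without touching the network.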

Expect a higher than average load on all of your systems. Transaction monitoring will be an important tool for measuring performance and outages in more sensitive parts of your app. A multi-step check can inspect multiple elements to ensure nothing critical breaks, mimicking real user actions along the way.
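The shape of a multi-step check can be sketched in a few lines of Python. This is an assumption-laden illustration, not how Uptime.com's transaction checks are implemented: each step is a hypothetical callable that returns True on success, and the runner stops at the first failure, because later steps in a user journey (checkout, for example) are meaningless if an earlier one (login) broke.

```python
import time
from typing import Callable, List, NamedTuple, Tuple


class StepResult(NamedTuple):
    name: str
    ok: bool
    seconds: float  # how long the step took, useful for spotting slowdowns under load


def run_transaction(steps: List[Tuple[str, Callable[[], bool]]]) -> List[StepResult]:
    """Run steps in order, timing each one, and stop at the first failure."""
    results: List[StepResult] = []
    for name, action in steps:
        started = time.monotonic()
        try:
            ok = bool(action())
        except Exception:
            # A step that raises counts as a failed step, not a crashed check.
            ok = False
        results.append(StepResult(name, ok, time.monotonic() - started))
        if not ok:
            break
    return results
```

In practice each step would wrap a real action, such as loading the login page, submitting credentials, or adding an item to a cart. The last entry in the returned list tells you where the journey broke.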

A Note on Status Pages and Downtime

For most of us, some downtime is inevitable. What will make it palatable to the end user is transparency.

An example of an Uptime.com status page showing the various stages of an incident.

It’s more important than ever to update your status pages with accurate analysis when outages do occur. Users are reliant on your services in a way no one is accustomed to, and high usage can cause many unexpected issues. Maintaining your status page provides the public with up-to-date information on what your business is doing and keeps you in control of public-facing communication.

Resources for Healthcare IT

Healthcare IT has one of the most strenuous jobs during this crisis. We wanted to gather as much as we could to provide a one-stop collection of free and extended services available to these users.

Basecamp Offering its Services for Free

If your organization is part of a first response effort, Basecamp is willing to comp your account. This offer is not extended to everyone: if your organization was simply impacted by the pandemic, you’re not eligible. If you’re in government or health IT, you may want to take advantage of this tool to help coordinate your response.

Microsoft Teams Available Across NHS

All NHS staff will be able to use Microsoft Teams for free so they can quickly communicate with colleagues during the Coronavirus outbreak. The idea is to reduce the amount of in-person face time required to diagnose or discuss patient outcomes. Doctors and key personnel can use Teams for instant messages, audio calls, and video, and to share advice and updates about patient care.

Uptime.com offers service for most major integration providers, with Microsoft Teams on the way. Push notifications where your team lives and works make a difference in resolution. They enforce accountability and lead to a more efficient process.

Other Resources

It seems everyone wants to help. Check with your local city. Boston is doing its part to track free internet offers, and Citrix is offering its meeting software for free.

Website and application monitoring will be critical in the near term, as sites outside the medical sector cope with unprecedented demand. We’re currently tracking outages in retail, food, and banking as everyone struggles to scale infrastructure and meet the challenges we face. We offer a 21-day free trial, and our services include SSO and multiple user seats for you and your team. Your customers shouldn’t be the first to know your site is down.

Working Remotely

We work remotely here at Uptime.com, and our team wanted to offer some tips that work for us in these times of isolation:

Maintain Expectations

This one sounds scary, but the truth is we’re all stretched thin by now. Whether you have kids or not, whether you work days or nights, we’re all feeling some form of strain. Maintain your professional expectations and allow for some leeway in response time and productivity. You’re not a bad worker because you didn’t tick every task off your list for today. You’ve got to give yourself time to adjust to these circumstances.

Develop a System

Find a comfortable space

Separate home life from work life by keeping the office in a specific space. Don’t do house chores when you take breaks if you can avoid it, and those of you with partners need to work together to manage a schedule that works for everyone. The best of us can take months to get this right, but it helps to use bullet journaling and apply the principles of GTD (Getting Things Done): less distraction and more productivity.

Reach Out

This is a good time for watercooler chatting. Ask your colleagues how they’re holding up, and have voice calls if you normally text or email. Try to shake things up a bit. It’s good to have face time, and you will get to a resolution much faster if you maintain good relationships with the team that you’re used to seeing around the office.

Disconnect

As important as it is to stay focused on the task at hand, it’s crucial to devote some time to disconnecting. Exercise is a great way to work out stress without screen time. You can also pick up a book, or try to learn an instrument. Take some time away from devices and your space will feel a little less small.

How are You Holding Up?

How is COVID-19 affecting your work life?


Richard Bashara is Uptime.com's lead content marketer, working on technical documentation, blog management, content editing and writing. His focus is on building engagement and community among Uptime.com users. Richard brings almost a decade of experience in technology, blogging, and project management to help Uptime.com remain the industry leading monitoring solution for both SMBs and enterprise brands. He resides in California, enjoys collecting and restoring arcade machines, and photography.
