Did you have a nice holiday? Hope you weren’t planning on working today…

Anyone who’s worked in the IT consulting industry knows that the stretch from the week before Christmas through the week after New Year’s tends to be slow.

 

Except for the one-off client making a purchase to spend their budget before it goes away, or the few random tickets we see, we generally expect a fairly quiet time of year.

 

So you can imagine our surprise here at UCRIGHT when the techs started fielding calls from our clients at 4 a.m. EST on December 27. The calls continued to snowball throughout the day, the next day and into the wee hours of December 29.

 

After the first few reports, and after opening numerous tickets with our customers’ WAN and UCaaS providers, our techs realized something unusual had to be going on and started to crawl the web for more info.

 

After reviewing the following link, https://downdetector.com/status/centurylink/news/234064-problems-at-centurylink, we (like others) discovered that there was a massive backbone outage and that it seemed to be growing worse as the morning went on.

 

More info on the overall outage and its cause is here:

https://threader.app/thread/1078419619436810240

 

Since this was one of those ‘disaster’ scenarios where nothing could be done except use existing resources to mitigate the fallout (or not), we at UCRIGHT had to do our best to triage for our clients.

 

What we did:

 

  • SLA clients – We worked 100% of the incoming tickets within the committed 45-minute response time.
  • Non-SLA clients – We set aside our normal 2-day response commitment and called in unscheduled resources to ensure that we could respond to all clients by close of business (COB) on 12/27.

 

How we did it:

 

  • Our SD-WAN/NaaS clients were relatively unimpaired.  As we say to clients almost every day, you never see the value of an extra connection and automated fail-over until you really need it… and this outage proved that to all of those who had ‘drunk the kool-aid’ previously.
  • For the majority of other impacted clients, whom we had already helped procure additional, inexpensive backup WAN options – resolution was as simple as changing the default route on a switch, router or firewall.
  • And finally, in the few cases where there were no alternative circuits, no hotspots or phones we could tether a firewall to, and no other readily available WAN routes – we could only explain that there was a major outage at one of the internet backbone providers and suggest that they ask users to tether their computers directly to their phones or work from home for the day.
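
For readers who want to see the fail-over idea in concrete terms, here is a minimal Python sketch of the decision only: pick the most preferred WAN link that is still healthy and treat it as the default route. It is an illustration, not the way any particular SD-WAN product or firewall works. The WanLink type, the pick_default_route function, the link names and the gateway addresses (taken from the reserved documentation ranges) are all made up for this example, and how a link gets marked healthy is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WanLink:
    """Hypothetical description of one WAN connection at a site."""
    name: str       # made-up label, e.g. "primary-fiber"
    gateway: str    # next-hop address the default route would point to
    priority: int   # lower number = more preferred
    healthy: bool   # outcome of whatever health check is in use (assumed)


def pick_default_route(links: Sequence[WanLink]) -> Optional[WanLink]:
    """Return the most preferred healthy link, or None if every link is down."""
    healthy = [link for link in links if link.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda link: link.priority)


# Example site: the primary circuit is down, two backups are available.
links = [
    WanLink("primary-fiber", "203.0.113.1", priority=1, healthy=False),
    WanLink("backup-cable", "198.51.100.1", priority=2, healthy=True),
    WanLink("lte-hotspot", "192.0.2.1", priority=3, healthy=True),
]

chosen = pick_default_route(links)
print(chosen.name if chosen else "no WAN path available")
```

With the primary circuit marked down, the sketch selects the cable backup. With no healthy links at all it returns nothing, which is the third situation in the list above, where tethering or working from home is all that is left.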

 

Takeaway from today:

 

No matter how reliable your internet or MPLS has been historically, or what your carrier promises you… it’s inevitable that at some point service interruptions will occur.  Sometimes an interruption lasts only a few minutes, sometimes a day or more.  With the expanding reliance on internet, WAN and cloud services, even a small amount of downtime can be extremely costly to a company.

If your company relies on cloud services or any internet-based applications, you should consider the costs that a 24+ hour service outage can create – when you view redundancy through this lens, it is generally quite simple to justify the cost of a backup circuit.
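
To make that comparison concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is a placeholder we invented for illustration (headcount, hourly cost, lost revenue, backup circuit price), and the function names are hypothetical, so plug in your own figures before drawing any conclusions.

```python
def outage_cost(hours_down: float,
                employees_affected: int,
                loaded_hourly_cost: float,
                lost_revenue_per_hour: float = 0.0) -> float:
    """Estimate the cost of an outage as idle staff time plus lost revenue."""
    hourly_loss = employees_affected * loaded_hourly_cost + lost_revenue_per_hour
    return hours_down * hourly_loss


def months_of_backup_covered(outage_total: float, backup_monthly_cost: float) -> float:
    """How many months of backup circuit a single outage would have paid for."""
    return outage_total / backup_monthly_cost


# Placeholder inputs only: one lost 8-hour business day, 25 staff at $40/hour,
# $500/hour in lost revenue, and a $100/month backup circuit.
total = outage_cost(hours_down=8, employees_affected=25,
                    loaded_hourly_cost=40.0, lost_revenue_per_hour=500.0)
months = months_of_backup_covered(total, backup_monthly_cost=100.0)

print(f"Estimated outage cost: ${total:,.0f}")
print(f"Equivalent months of backup circuit: {months:.0f}")
```

Under these made-up inputs, a single lost business day costs about as much as ten years of the backup circuit. Your own numbers will differ, but the shape of the comparison is usually similar.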

UCRIGHT works with master agents to source and secure redundant circuits to protect businesses from major interruptions. Reach out to us to get more information.

 

 

 
