In the wake of Level3’s massive outage, which had rippling effects across virtually all carriers in North America, questions persist about VoIP redundancy and alternate routing of calls. Even well-networked carriers experienced over an hour of downtime, while Level3 was completely unreachable; even their website was down. So how could a single carrier cause such a headache for so many?
Simply put, the answer lies in what can and cannot be controlled.
In the world of telephony, any service provider worth their salt will have more than one or two carriers with whom they interconnect. And having multiple routes means that you can control to which carriers you send traffic. This is usually a fairly simple task that can be accomplished in a few minutes. But did you catch the operative word?
It was subtle, but those two letters make a world of difference. So, here it is again: having multiple routes means that you can control to which carriers you send traffic.
Figure A: Normal Outbound Routing
A service provider, any service provider, can control their outbound routes, the “to”. When a call is made, the switch considers the list of possible routes, selects one, and sets up the outbound call, typically within a fraction of a second. Verifying that the source of a problem is Carrier X usually takes longer than removing Carrier X from your route tables. Depending on the nature of the problem, the phone switch may automatically re-route to the next best route choice without any manual intervention.
Figure B: Outbound Overflow To Alternate Route
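The outbound behavior described above, an ordered route list with overflow to the next choice, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular phone switch is implemented: the `Route` class, the carrier labels, and the `attempt` callback (which stands in for actually setting up the call) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Route:
    carrier: str          # hypothetical carrier label, e.g. "Carrier X"
    priority: int         # lower number = more preferred route
    healthy: bool = True  # set to False when monitoring flags the route


def select_routes(routes, excluded=frozenset()):
    """Return usable routes in priority order, skipping unhealthy
    routes and any carrier manually removed from routing."""
    usable = [r for r in routes if r.healthy and r.carrier not in excluded]
    return sorted(usable, key=lambda r: r.priority)


def place_call(routes, attempt, excluded=frozenset()):
    """Try each usable route in order. `attempt` is a stand-in for call
    setup and returns True on success. Returns the carrier that completed
    the call, or None if every route failed."""
    for route in select_routes(routes, excluded):
        if attempt(route.carrier):
            return route.carrier
    return None


routes = [Route("Carrier X", 1), Route("Carrier Y", 2), Route("Carrier Z", 3)]

# Simulate an outage at Carrier X: every attempt through it fails.
carrier_x_down = lambda carrier: carrier != "Carrier X"

# Automatic overflow: the call fails on X and completes on Y.
completed_via = place_call(routes, carrier_x_down)

# Manual removal: X is excluded, so it is never even attempted.
completed_after_removal = place_call(routes, carrier_x_down, excluded={"Carrier X"})
```

In both cases the call completes through Carrier Y. The difference is that manual removal avoids the failed attempt on Carrier X altogether, which is why pulling a troubled carrier out of the route tables helps even when automatic overflow is already working.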
So, this leaves us with three lingering questions:
- If it is so easy to route around a problem carrier, why can’t it be done for inbound calls?
- If you’ve routed around the problem carrier, why am I still having problems completing some calls?
- If your carrier is dependent upon other carriers, why shouldn’t you just use one of them instead?
All fair questions. Let’s look at them individually.
First, why can’t we control our inbound routes? The short answer is that the call hasn’t reached us yet, so we have no control over it. The longer answer is that, just as we control our outbound routes to Carrier X, an inbound call to us is an outbound call from another carrier, and that carrier controls how to get the call to us. If you had only a single route (trunk group) with that carrier and they could not complete the call for some reason, the call would fail. If you have multiple trunk groups, they may attempt other routes to get to you. However, if their network is having issues, they may fail the call altogether. The next thing to consider is that the Public Switched Telephone Network was designed so that every local number can belong to only a single provider, and therefore that single provider alone can complete the call (we’ll discuss toll free calls in a moment). What that means is that virtually every local number that is “owned” by Level3 likely had issues during the outage.
Figure C: Inbound Failure Never Reaching The Terminating Carrier
Second, if Level3 was removed from routing, why were some outbound calls still having problems? This typically happens because the PSTN is a mesh of interconnected carriers, and the overflow carrier may itself attempt to hand the call to the carrier having the problem. This probably wasn’t much of an issue for most carriers on October 4, 2016, because most, if not all, were aware of the magnitude of Level3’s issue.
Figure D: Impacted Overflow Routes
Third, while Tier 1 carriers like Verizon and Level3 and Competitive Local Exchange Carriers (CLECs) may provide excellent products over their massive networks, they are their own single point of failure and cannot offer multiple carriers to serve their customers’ needs.
So, what can be done to minimize the impact when carrier outages occur? First, ensure that your service provider has ample connectivity to the PSTN. If they don’t then you’re probably not using the right provider. Next, if you have multiple locations, ask that the numbers of the locations be split among different carriers. While this doesn’t eliminate such outages, it can help by minimizing the impact when they occur. Last, consider using a Toll Free number instead of a local number for your inbound calls. Toll Free numbers are unlike local numbers and do not have a single “owner” but instead can be handled more like outbound calls where routing and multiple carriers can be controlled by the Responsible Organization (“RespOrg”).
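To make the inbound recommendations concrete, here is a small Python sketch contrasting local numbers, which are tied to a single owning carrier, with a toll free number whose carrier can be re-pointed. It is a simplified model under stated assumptions: the phone numbers, carrier labels, and both function names are hypothetical, and real RespOrg routing involves far more than an ordered preference list.

```python
def unreachable_numbers(number_owner, down_carrier):
    """Local numbers can only be completed by their owning carrier, so every
    number owned by the carrier that is down is unreachable for inbound calls."""
    return sorted(n for n, owner in number_owner.items() if owner == down_carrier)


def toll_free_route(carrier_preference, down_carriers):
    """A toll free number has no single owner: pick the first preferred
    carrier that is still up, or None if all of them are down."""
    for carrier in carrier_preference:
        if carrier not in down_carriers:
            return carrier
    return None


# All three locations on one carrier: an outage takes out every number.
single_carrier = {
    "+1-215-555-0101": "Carrier A",
    "+1-215-555-0102": "Carrier A",
    "+1-610-555-0103": "Carrier A",
}

# Same locations split across two carriers: one location stays reachable.
split_carriers = {
    "+1-215-555-0101": "Carrier A",
    "+1-215-555-0102": "Carrier A",
    "+1-610-555-0103": "Carrier B",
}

lost_single = unreachable_numbers(single_carrier, "Carrier A")
lost_split = unreachable_numbers(split_carriers, "Carrier A")

# The toll free number is simply re-pointed to the next carrier.
toll_free_carrier = toll_free_route(["Carrier A", "Carrier B"], {"Carrier A"})
```

With a single carrier, all three numbers are lost during the outage. With the split, only the two numbers on Carrier A are affected, and the toll free number keeps working through Carrier B.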
Evolve IP stands out among VoIP providers by having scores of routes to over 10 different PSTN carriers enabling us to adjust our routes in the ever-changing network landscape, minimizing the effects of a mammoth outage. For more information about how Evolve IP can help your business achieve maximum availability please contact firstname.lastname@example.org.