Businesses that consider hosting their telephony in the cloud often think about survivability as a critical requirement 🔥
It is therefore no surprise that vendors are building their own solutions for “survivable” cloud telephony (i.e., telephony that keeps providing service when the cloud is unreachable):
- I’ve already discussed Site Survivability for Webex Calling and its architecture (you can check the article here);
- then there’s Microsoft, which is actively improving their Survivable Branch Appliance (SBA) solution;
- or Zoom, leveraging their “node” architecture to provide survivable telephony for Zoom Phone.
But this is not a market review 😁. Today I want to focus on a special scenario that is very relevant to many customers, describe the problem and, most importantly, offer some ideas for a possible solution.
Survivable Queuing
The Scenario
Imagine you have configured some sort of queueing or automated attendant in the cloud, and calls are successfully routed to the agents (or knowledge workers) on their internal extensions. Calls come in over an on-premises PSTN link, typically a PRI circuit, when customers, suppliers and others dial a DID number from outside the company.
All is flowing perfectly, until disaster strikes and the connection to the cloud is severed.
Although phones (or soft-phones) might re-register to the survivable node, the inbound DID service is now orphaned from its call control. And all phones previously attending the queue, albeit capable of receiving new calls, fall terribly silent. Customers calling from outside would hear some sort of busy tone, or even worse, an error message saying that the number they are calling is not in service 😧❗
The Solution, a local IVR service
Since calls are coming in from an ISDN PRI, there is an on-premises component that routes these calls towards the cloud PBX: the Local Gateway (LGW in Cisco lingo). In most cases it should be possible to converge the LGW function and the Survivability Gateway (SGW) function into a single router.
By adding new functionality to the SGW, we can close the gap and provide a full replacement for queuing and intelligent call routing. The technology used to add this service is Voice Tcl IVR, which is part of Cisco IOS-XE. Let’s see how we can make this work.
Baseline Setup
I’m going to assume the following:
- you already have a working setup for LGW and SGW;
- the inbound number is mapped to a queue; we’ll set this DID to be +1-565-555-1234;
- there are agents with extensions 2001, 2002 and 2003 (these are extension-only, but they could also have their own DIDs, which is inconsequential for the purposes of this setup).
B-ACD or other Tcl IVR application
The first thing you need to do is create a new IVR service on the router and assign it to a dedicated dial-peer. For simplicity I have reused the B-ACD service that is part of the CallManager Express (CME) solution. You can find this software on the Cisco website and follow the guide to configure it. Make this dial-peer match on a string that contains letters, so that ordinary dialed digits can never route to it by accident. In my case I used “A1A99999”.
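As a rough illustration of what such an entry point might look like, here is a minimal sketch. It assumes the B-ACD auto-attendant script has already been loaded as an application service named `aa` by following the CME guide; that name, the dial-peer tag 999990 and the choice of `incoming called-number` as the matching method are all assumptions made for illustration, not the exact configuration behind this post.

```
! Hypothetical sketch: inbound dial-peer that hands calls to the B-ACD script.
! Assumes an application service named "aa" is already defined on the router.
dial-peer voice 999990 voip
 description LOCAL B-ACD ENTRY POINT
 service aa
 session protocol sipv2
 incoming called-number A1A99999
 dtmf-relay rtp-nte
 codec g711ulaw
 no vad
```

Because the match string contains letters, a call only lands here when something on the router deliberately rewrites the called number to it.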
Hunt Groups
Next, you want to set up a hunt group for the agents:
voice hunt-group 1 sequential
 list +02001,+02002,+02003
 timeout 8
 pilot <pilot # used in B-ACD>
Notice the weird +0 prefix in front of our agents’ extensions. I’ll explain it later.
For the pool of devices that are used by agents, it is recommended to set the busy trigger to “1”, so that the ACD service can hunt correctly when agents are already busy on a call:
voice register pool 1
 busy-trigger-per-button 1
Fallback dial-peer
We’re almost done. We need to configure the router so that it forwards the incoming ISDN call to the IVR service when the Webex Calling (WxC) proxies are unreachable. To do that, we define a dial-peer with a lower preference (the higher the number, the lower the preference) that will match after the attempt to contact the cloud fails.
dial-peer voice 999991 voip
 description FALLBACK TO AA DURING SURVIVABILITY
 preference 10
 session protocol sipv2
 session target ipv4:<local_IP>
 session transport tcp
 destination e164-pattern-map <ID>
 dtmf-relay rtp-nte
 codec g711ulaw
 no vad
!
dial-peer voice 999992 voip
 description DISTRIBUTE CALLS TO BACD AGENTS
 destination-pattern +0....$
 session protocol sipv2
 session target ipv4:<local_IP>
 session transport tcp
 dtmf-relay rtp-nte
 codec g711ulaw
 no vad
The key thing is making sure the preference value on the fallback dial-peer is numerically higher (that is, less preferred) than the one used in the dial-peer that forwards calls to WxC during normal operations.
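To make the ordering concrete, the sketch below shows a pattern map for the DID and the lines that matter on each dial-peer. The tags 100 and 200201 and the preference value 1 are hypothetical, the primary dial-peer is shown only as a fragment, and the DID format assumes the number reaches the router in E.164 form; adapt it to whatever your PRI actually delivers.

```
! Hypothetical sketch: both dial-peers match the same pattern map,
! and the router tries the lowest preference value first.
voice class e164-pattern-map 100
 e164 +15655551234
!
dial-peer voice 200201 voip
 ! existing primary path towards Webex Calling (other settings omitted)
 preference 1
 destination e164-pattern-map 100
!
dial-peer voice 999991 voip
 ! fallback dial-peer from this article, tried only after the primary fails
 preference 10
 destination e164-pattern-map 100
```

The `preference` command accepts values from 0 to 10, with 0 (the default) tried first, so 10 places the fallback last among all dial-peers that match the same number.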
Final touch: translations
We must add one final ingredient to the recipe: the router must create a new leg when it tries to reach either the IVR service or the agent. For this reason we need some “dummy” translations (remember the +0?) for both the IVR service and the hunt-group targets:
voice translation-rule 999991
 rule 1 /^.*/ /A1A99999/
!
voice translation-rule 999992
 rule 1 /\+0\(....\)/ /\1/
!
voice translation-profile TO-BACD
 translate called 999991
!
voice translation-profile TO-AGENTS
 translate called 999992
Then we apply them to the relevant dial-peers:
dial-peer voice 999991 voip
 translation-profile outgoing TO-BACD
!
dial-peer voice 999992 voip
 translation-profile outgoing TO-AGENTS
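Before placing a live call, you can sanity-check both rules from the exec prompt with `test voice translation-rule`, which reports whether a rule matched and what the translated number would be. The input numbers below are only examples based on the baseline setup above.

```
! Any called number should come back as A1A99999
test voice translation-rule 999991 +15655551234
! The +0 prefix should be stripped, leaving the bare extension 2001
test voice translation-rule 999992 +02001
```

If the second test returns anything other than the four-digit extension, the hunt group will not be able to reach the registered agent phones.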
See it in action
You can see a demo of the final result in this video. There is no IVR menu, to keep things (reasonably) paced for the demo, but it’s possible to add it.
Note
This video is not sponsored, funded or endorsed by Cisco Systems. This is for educational purposes only. Under no circumstances does this post represent a design recommendation, and it cannot substitute for specific IT architecture and engineering work for a given use case. In no event shall the author be liable for any loss or special, indirect or consequential damage of any kind resulting from the use of, access to, or reliance on the information contained within this website. No claim of official support can be made based on the sole content of this blog post.