If you’re like me, then getting your hands on video conferencing scenarios is like your nan’s birthday: it comes around once a year and you spend two months trying to figure out what to get for a present. I am not ashamed to say that video is not my strong point, especially where interoperability is concerned. It’s just that the opportunities that have come my way have been more centered on enterprise voice than video interop.
Recently, I tried to fill that gap a bit and went on some Pexip training to see how this stuff is done. I don’t know why, but I always thought it was more difficult than it turns out to be. Really, the main difference between voice and video is how the codecs behave; the architecture and protocols are fundamentally similar to voice. Somehow I knew that, but I still expected it to be harder.
So at a basic level, we have a device in a room that does video conferencing. We need to join this device to another device on another platform that also does video, but differently, and we need some glue in the middle to make that happen. Enter Video Interop.
Why can’t these devices just connect? I mean video is video right?
Well, the answer to that is pretty much no. Remember Blu-ray and HD-DVD? Both are capable of digitally storing a movie, but you need a special player to play either Blu-ray or HD-DVD. The same can be said for video conferencing. Each system generally offers the same functionality and delivers the same output, but invariably you need dedicated, vendor-specific hardware to use it. What if you wanted to loan your Blu-ray copy of The Fifth Element to your friend who had an HD-DVD player? You can, but they would have to go to the store and buy a Blu-ray player…
Of course, I am speaking generally, and HD-DVD never took off, but the point is valid. This is the problem in the video conferencing world, where you have some gear in your meeting rooms that you want to use inside a Microsoft Teams meeting. How do you get this stuff to work?
The first problem we have to understand is the Microsoft Teams meeting architecture. Fundamentally, this is an H.264 video meeting space, which means that any endpoint wanting to use video inside a Teams meeting has to be capable of sending and receiving H.264 video streams.
The second problem is that the Microsoft Teams client is dedicated to the Microsoft Teams meeting ecosystem, meaning that unlike Skype for Business, it is incapable of joining a meeting space hosted by another platform.
The third problem is that your video conferencing endpoint is probably using either H.323 or a variant of H.264 that Teams doesn’t understand.
The fourth problem is that Microsoft Teams doesn’t use SIP in the meeting context, so even if your video conferencing endpoint uses SIP, you still have an interop problem to solve.
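To make those four problems a bit more concrete, here is a minimal sketch of the kind of check you could run against a room system. Everything in it is an assumption for illustration: the `Endpoint` class, the `interop_gaps` function and the labels `teams-native` and `h264-teams` are made up and are not real Teams identifiers.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels standing in for whatever Teams uses natively.
TEAMS_SIGNALING = "teams-native"
TEAMS_VIDEO_CODEC = "h264-teams"


@dataclass(frozen=True)
class Endpoint:
    name: str
    signaling: str    # e.g. "sip" or "h323"
    video_codec: str  # e.g. "h264-avc" or "h263"


def interop_gaps(endpoint: Endpoint) -> List[str]:
    """List what an interop service would have to translate for this endpoint."""
    gaps = []
    if endpoint.signaling != TEAMS_SIGNALING:
        gaps.append(f"signaling: {endpoint.signaling} must be translated")
    if endpoint.video_codec != TEAMS_VIDEO_CODEC:
        gaps.append(f"video: {endpoint.video_codec} must be transcoded")
    return gaps


room = Endpoint(name="Boardroom", signaling="sip", video_codec="h264-avc")
for gap in interop_gaps(room):
    print(gap)
```

Even a SIP endpoint that speaks a flavor of H.264 shows two gaps here, which is the point of the third and fourth problems above.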
So your organization is moving its meeting space to Microsoft Teams. How do we solve this problem? Cloud Video Interop (CVI).
Firstly, you need a product to act as a middleman: one that can ingest the signaling protocol and video codec supported by your video conferencing endpoint, convert them into a Teams-compatible signal and codec, send the result to the Teams MCU, and do the same in reverse. There are three products on the market for this right now: Pexip, Polycom Real Connect and BlueJeans.
Whichever product you choose, one thing is consistent across all three. The Microsoft Teams connector (the server that connects the system’s transcoding servers to Microsoft Teams) must be installed in Microsoft Azure. The rest can be anywhere, but this connector cannot live anywhere else due to Microsoft certification requirements.
The question now becomes: are you going for a SaaS CVI or an On-Prem / Private Cloud CVI?
This post is not going to argue which one you should choose, but consider the impacts of the decision you’re about to make.
If you’re going with a SaaS solution, this is of course a faster route to delivery, and the benefits of OPEX subscriptions mean that the high-level objective is achieved within a short period of time. However, one thing to be very conscious of is understanding the architecture and limitations of the SaaS product you have bought into.
The biggest consideration is understanding how interop works, and that comes down to how meetings are organized. If you’re using Microsoft Teams, the meeting space will always be held within Microsoft Teams and an Office 365 datacenter. This could be in the same datacenter as your tenant, or it could be in some other. However, that generally doesn’t matter too much, because the Microsoft internal network for Office 365 and Azure is so efficient that it almost becomes a moot point. The most important note is that it is Microsoft hosting the meeting.
So now your video conference endpoints need to join a Teams meeting. They do not actually join directly. In any interop scenario, they will register to your interop service, and that will spin up its own conference of a kind that the video endpoint will join. The interop service will then connect its conference to the Microsoft Teams meeting via the Teams connector in Azure. Transcoding happens on the interop service, not the connector.
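The join path described above can be sketched as a simple list of hops. This is only a model of the description in this post, not any vendor’s API; the `Hop` class, the `join_path` function and the location strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hop:
    name: str
    location: str
    transcodes: bool  # True only where media is actually converted


def join_path(endpoint_name: str, interop_location: str) -> List[Hop]:
    """Return the hops a room system traverses to reach a Teams meeting."""
    return [
        Hop(endpoint_name, "meeting room", False),
        # The endpoint joins a conference on the interop service, not Teams itself.
        Hop("interop service conference", interop_location, True),
        # The connector bridges into Teams and, per the post, lives in Azure.
        Hop("Teams connector", "Microsoft Azure", False),
        Hop("Teams meeting", "Office 365 datacenter", False),
    ]


for hop in join_path("Boardroom", "on-prem"):
    marker = " (transcoding)" if hop.transcodes else ""
    print(f"{hop.name} @ {hop.location}{marker}")
```

The only hop that changes between SaaS and on-prem is the second one, which is why where that service sits matters so much for the media path.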
The next point when considering SaaS is the datacenter location of the provider’s transcoding and connector services. There is little point in signing up to a service that has one global point of presence on the other side of the world from your video endpoint or tenant, because that would introduce significant latency, packet loss and jitter, all of which are bad for video and voice.
If the SaaS solution has sufficient global coverage, then it may still be a viable option, provided your internet links are optimized with this service in mind.
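If you can measure round-trip times from a meeting room to each of a provider’s points of presence, a rough coverage check might look like the sketch below. The `pick_pop` function, the region names, the RTT values and the 150 ms budget are all assumptions for illustration, not measurements or vendor guidance.

```python
from typing import Dict, Optional


def pick_pop(rtt_ms: Dict[str, float], budget_ms: float = 150.0) -> Optional[str]:
    """Return the lowest-latency point of presence within budget, else None."""
    if not rtt_ms:
        return None
    best = min(rtt_ms, key=rtt_ms.get)
    # A provider whose best PoP is still over budget has a coverage problem.
    return best if rtt_ms[best] <= budget_ms else None


# Hypothetical measurements from one meeting room, in milliseconds.
measured = {"eu-west": 28.0, "us-east": 95.0, "ap-southeast": 310.0}
print(pick_pop(measured))
```

Running the same check from every office gives a quick picture of where a single-region service would hurt.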
The other option is to use an on-prem solution, or a private cloud, where you can control the media path more optimally. Generally speaking, it is better to perform transcoding locally than in the cloud, and sending H.264 SVC streams over the internet is preferred as it’s considered more tolerant of network impairments. That said, with data connections being better than they used to be, the performance gap is narrowing.
With solutions that use distributed interop, on-prem can be really efficient in scenarios where you have multiple video conferencing endpoints all over the world wanting to join the same Teams meeting. In this scenario you can have internal transcoding servers located geographically close to each of the video endpoints. Each endpoint would connect to its closest transcoding server, and media between endpoints would then switch locally across your LAN and WAN, with only the required media sent to the Teams conference so that users joined via the Teams client can interact and participate.
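As a rough sketch of that distributed idea, the snippet below groups endpoints onto the transcoding node in their own region. The `assign_endpoints` function, the node names and the regions are hypothetical, and real products will pick nodes using their own logic rather than a simple region lookup.

```python
from collections import defaultdict
from typing import Dict, List


def assign_endpoints(
    endpoint_regions: Dict[str, str],
    node_by_region: Dict[str, str],
    fallback_node: str,
) -> Dict[str, List[str]]:
    """Group endpoints by the transcoding node that serves their region."""
    assignment: Dict[str, List[str]] = defaultdict(list)
    for endpoint, region in endpoint_regions.items():
        # Endpoints in a region without a local node use the fallback node.
        node = node_by_region.get(region, fallback_node)
        assignment[node].append(endpoint)
    return dict(assignment)


endpoints = {"London-1": "emea", "London-2": "emea", "Sydney-1": "apac"}
nodes = {"emea": "transcode-lon", "apac": "transcode-syd"}
print(assign_endpoints(endpoints, nodes, fallback_node="transcode-lon"))
```

In this example the two London rooms share one local node, so their media stays close to home and only what the Teams side needs has to travel further.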
So from an architectural perspective, on-prem can be more appealing in multi-endpoint scenarios and where SaaS doesn’t have the coverage it needs. It does, however, come with the cost of hardware, which for video isn’t going to be a cheap entry-level server but a mid- to high-performance server costing at least five figures each.
In summary, there is no real definitive outcome as to what you should do. Financially it makes sense to look towards SaaS interop with Teams, and as we know, these days financial incentives tend to win the race. But this does come at a cost that is usually paid in reduced user experience if poorly implemented. For out-and-out performance, on-prem still wins the race and is the most optimal solution money can buy, but you have to have the money in the first place.
I’ll finish with a suggested approach for you to consider. If your company are light users of video conferencing suites and your persona investigation has proved that device usage will reduce even further with the implementation of Microsoft Teams, then you’d probably want to consider SaaS as your primary solution for CVI.
If, however, your organization are heavy video conferencing users and this is to continue or increase with the implementation of Microsoft Teams, then you’d probably want to consider an On-Prem solution first, ahead of SaaS.
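That rule of thumb can be written down as a tiny function. It is only a restatement of the suggestion above, not a sizing tool; the `suggest_cvi` name and the `usage_trend` values are made up for illustration, and anything that falls between the two cases is deliberately left as a judgment call.

```python
def suggest_cvi(heavy_usage: bool, usage_trend: str) -> str:
    """Suggest a starting point for CVI based on usage level and trend.

    usage_trend is one of "decreasing", "steady" or "increasing".
    """
    if usage_trend not in ("decreasing", "steady", "increasing"):
        raise ValueError(f"unknown usage trend: {usage_trend}")
    if not heavy_usage and usage_trend == "decreasing":
        return "SaaS"
    if heavy_usage and usage_trend in ("steady", "increasing"):
        return "On-Prem"
    # The post gives no firm steer for the in-between cases.
    return "evaluate both"


print(suggest_cvi(heavy_usage=False, usage_trend="decreasing"))
print(suggest_cvi(heavy_usage=True, usage_trend="increasing"))
```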