Microsoft Teams – Quality of Service

Being a Skype and Teams consultant I seem to spend my life talking about why it is important to implement Quality of Service even for cloud systems routed over the internet.

My mantra is simple: if you can do it and it’s going to make an improvement, no matter how small the perceived gain, then it’s probably better to do it.

Specifically with cloud there is always an argument that QoS implementation is not worth the effort because of the middle carrying network that is the internet. As we all know, the internet cannot perform QoS. So what is the point?

Firstly, the assumption that there is no point is wrong. The second incorrect assumption is that Quality of Service is only possible once you’ve implemented ExpressRoute for Office 365.

My stance on this comes from looking at the different types of communication between different peers. I look at the intended media route over your network and calculate that a certain percentage of media will always be local to your network. With the implementation of Direct Routing, this percentage increases quite significantly.

Because these media paths remain on your controlled network, you can apply traffic treatment policies to prioritise important data packets end to end and in both directions. Your network may be geographically large with different types of inter-site connections such as MPLS, point to point, or managed ethernet. As a result, implementing QoS is no small feat, and this is where I get the most push back on why QoS cannot be implemented at a customer. It’s not the technology; it’s either the effort involved, or the fact that the networks team have historically used other treatment methods and don’t want to pivot away from them.

Once we get over the first hurdle of agreeing that it makes sense to deploy QoS for Microsoft Teams because 70% of the media is going to go end to end over the customer LAN, we then start talking about how this can be implemented.

First off, LAN and WAN QoS are fundamentally the same, just on different networks, and the endpoints on each of those networks may treat QoS differently. The most important thing from my perspective is that the traffic is treated in exactly the same way over each network regardless. Some WAN networks use accelerators and application inspection to classify data packets based on what the device has determined the application to be, e.g. Microsoft Teams. The problem is that in order for these devices to determine the application type, they must inspect the data packet. As Teams transmits media using the Secure Real-time Transport Protocol (SRTP), the payload is encrypted. This means the device has to decrypt the data packet, inspect it, decide what to do with it and then re-encrypt it and send it on. This requires CPU and memory but, more importantly for us, it increases latency, packet reordering and jitter. All bad where media quality is concerned. It is for this reason Microsoft do not support deep packet inspection for Microsoft Teams payloads.

The other challenge we have is WAN acceleration and packet reshaping. Network engineers will want to do this because it means they can squash more data through the available bandwidth than would otherwise be possible. WAN accelerators basically compress the data packet and then send it over the network. The problem with compression is that the data packet has already been compressed by the voice codec agreed in the SDP negotiation between endpoints, so when the WAN compresses the packet again you have double compression. This leads to data bits, and even entire packets, being lost, resulting in poor media quality. Again, Microsoft do not support or recommend this for Microsoft Teams.

This leaves us with what needs to be done. Microsoft support policy based QoS using DSCP. Nothing new there. The LAN needs to be configured to transmit packets based on their DSCP classification, as does the WAN. Do not try to re-mark data packets between networks, for instance configuring EF for audio on the LAN but AF34 over the WAN. If you do that you are not gaining anything and contradicting the purpose of Quality of Service. Pick the classification and trust the packet end to end.

Microsoft publish their recommended QoS classifications for each media type. For Teams, this is EF (46) for audio, AF41 (34) for video and AF21 (18) for app sharing. It is an incorrect assumption that you must assign these values to Teams traffic for Quality of Service.

Yes, it would be nice to have, but the reality is often very different. Most enterprises have very strict controls over what type of application can use EF. For instance, the most common entry criterion is that the application must have call admission control. Microsoft Teams does not have this ability.

EF is an expensive classification for them, both in the way that it operates and because, if they have a managed WAN, they are probably paying for a preset static amount of EF bandwidth. This bandwidth is precious to them as other business critical applications could be using this prioritisation. If you go ahead and deploy Teams on EF then you could bring down several systems as a result.

The reality is that you are aiming for the best classification you can get in the AF band. You want the top classification that no other application is using, so the data packets you are transmitting get the best treatment and prioritisation possible. The net result is almost the same experience as EF. In some ways it is better: as EF is expensive, classifying in AF means you can use the general bandwidth available, which will be much higher, to your heart’s content and get full prioritisation over it at no additional cost to your customer. It’s a win-win compromise.
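To make the classification names above concrete, here is a minimal Python sketch of how the decimal DSCP values relate to the AF class names, and to the DS byte (the old ToS byte) you see in a packet capture. The function names `af_dscp` and `ds_byte` are my own for illustration; the arithmetic is the standard AF layout, where AFxy is 8x + 2y and the DSCP sits in the top six bits of the byte.

```python
# Illustrative helpers only: names are hypothetical, the arithmetic is standard DiffServ.

EF = 46  # Expedited Forwarding, the value quoted above for audio


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Return the decimal DSCP for AF<class><drop>, e.g. AF41 -> 34."""
    if af_class not in (1, 2, 3, 4) or drop_precedence not in (1, 2, 3):
        raise ValueError("AF classes are 1-4 and drop precedences are 1-3")
    return 8 * af_class + 2 * drop_precedence


def ds_byte(dscp: int) -> int:
    """Return the DS byte value: DSCP occupies the top six of its eight bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a six-bit value (0-63)")
    return dscp << 2


if __name__ == "__main__":
    for name, dscp in [("EF", EF), ("AF41", af_dscp(4, 1)), ("AF21", af_dscp(2, 1))]:
        print(f"{name}: DSCP {dscp}, DS byte 0x{ds_byte(dscp):02X}")
```

Whichever AF class your network team agrees to, the same helper gives you the decimal value to enter in the policy and the byte value to look for in a capture.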

Once you have this implemented at the network level, you need some way to mark the data packets accordingly. You do this today using group policy.
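As a rough model of what a policy based QoS rule expresses, the sketch below matches an application name and a source port range and returns the DSCP value to stamp. It is not the GPO itself, just the logic written out in Python. The port ranges (50000-50019 audio, 50020-50039 video, 50040-50059 sharing) are an assumption based on Microsoft’s commonly recommended Teams client ranges, the DSCP values are the recommended ones quoted earlier, and the names `QosRule`, `RULES` and `dscp_for` are hypothetical. Swap in whatever ranges and classifications your own design uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosRule:
    name: str
    app: str          # executable the rule applies to
    first_port: int   # start of the source port range (inclusive)
    last_port: int    # end of the source port range (inclusive)
    dscp: int         # decimal DSCP value to mark with


# Assumed example values, not taken from any particular tenant or GPO.
RULES = [
    QosRule("Teams audio", "Teams.exe", 50000, 50019, 46),
    QosRule("Teams video", "Teams.exe", 50020, 50039, 34),
    QosRule("Teams sharing", "Teams.exe", 50040, 50059, 18),
]


def dscp_for(app: str, source_port: int, rules=RULES) -> int:
    """Return the DSCP the first matching rule would apply, or 0 (best effort)."""
    for rule in rules:
        same_app = rule.app.lower() == app.lower()
        if same_app and rule.first_port <= source_port <= rule.last_port:
            return rule.dscp
    return 0


if __name__ == "__main__":
    print(dscp_for("Teams.exe", 50005))   # falls in the audio range
    print(dscp_for("Teams.exe", 50060))   # outside every range
```

Keeping the ranges non-overlapping matters: if two modalities share a port, the first matching rule wins and the traffic may be marked for the wrong modality.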

Don’t forget that if you want to implement QoS then you should do it properly. It doesn’t just end with this GPO for Teams.exe. Where will these users be calling? Desk phones? Direct Routing SBCs?

You’ll need to ensure that these devices are configured themselves for QoS otherwise you are only getting QoS on the sending stream from the Teams client and potentially none on the receiving stream. The end result is 50% of the possible experience to each of the users.

You can test whether data packets are being correctly marked by using Wireshark to capture them. You are looking for a UDP stream to the target endpoint with a source port within the media range.
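If you want to check the same thing outside the Wireshark UI, the marking lives in the second byte of the IPv4 header. The sketch below is a minimal Python illustration, not a capture tool: it takes the raw bytes of an IPv4/UDP packet and returns the DSCP value and UDP ports. The function names and the sample packet builder are hypothetical, and the sample uses documentation addresses and zeroed checksums purely for demonstration.

```python
import struct


def dscp_and_udp_ports(packet: bytes):
    """Return (dscp, src_port, dst_port) for a raw IPv4/UDP packet, else None."""
    if len(packet) < 20:
        return None
    version_ihl, ds_field = packet[0], packet[1]
    if version_ihl >> 4 != 4:
        return None                       # not IPv4
    header_len = (version_ihl & 0x0F) * 4
    if header_len < 20 or packet[9] != 17:
        return None                       # malformed header or not UDP
    if len(packet) < header_len + 8:
        return None                       # UDP header missing
    src_port, dst_port = struct.unpack_from("!HH", packet, header_len)
    return ds_field >> 2, src_port, dst_port


def build_sample(dscp: int, src_port: int, dst_port: int) -> bytes:
    """Build a minimal IPv4/UDP packet for demonstration (checksums left at zero)."""
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, dscp << 2, 28, 0, 0, 64, 17, 0,
        bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),
    )
    udp_header = struct.pack("!HHHH", src_port, dst_port, 8, 0)
    return ip_header + udp_header


if __name__ == "__main__":
    print(dscp_and_udp_ports(build_sample(46, 50010, 3480)))
```

In Wireshark itself, a display filter along the lines of `ip.dsfield.dscp == 46 && udp.srcport >= 50000 && udp.srcport <= 50019` narrows the capture to marked audio, assuming EF and that audio port range are what your policy uses.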

But remember, any packet that is destined to be transmitted over the internet will only be prioritised on your network, up to your boundary. After that QoS does not come into play and the packet is sent like any other data packet.

The same is true of any inbound data packets from the internet. For instance, you receive a PSTN call from Microsoft Phone System. The packet is transmitted from Microsoft via the internet to your network. It is not prioritised, and any DSCP markings that were stamped by Microsoft’s media network are stripped by the internet routers. This means the inbound stream has a DSCP value of 0.

Therefore, you are effectively only getting 25% of the total streams treated for Quality of Service i.e. Outbound stream client -> your boundary.

Network inefficiencies cannot be hidden from media traffic

If you’re going 100% cloud for calling and meetings, then you really should consider your internet breakout design, capacity and performance. It may be more cost effective to implement local breakouts at sites, rather than purchasing ExpressRoute. But one thing is for sure: in an enterprise organisation, if you want enterprise grade voice quality then you need to guarantee your media quality end to end. Otherwise, there will be times when the experience is degraded.

Lately, Microsoft have been rolling out meeting settings to the Teams admin portal, and one of those settings enables Quality of Service markings for real time media.

You would assume that this setting would mark the traffic coming out of the Microsoft network and replace the need for group policy based QoS?

At the time of writing this appears not to be the case. Perhaps this feature has not yet made it to the client. The setting certainly suggests it should.

However, this setting will presumably apply the recommended DSCP markings to data packets and that could be in breach of your design. In this case, you would still rely on the GPO method.

From the testing I have done at the moment this setting does not actually mark any inbound or outbound data packet to the client.

In any case, when this feature is fully working it still is not going to solve your problems without you putting the effort in to support it. While you can be pretty consistent and controlled for LAN to LAN communication, you need to remember that anything going to the cloud or coming from the cloud is not going to benefit from QoS, unless you have ExpressRoute.

Deciding whether you need that or not depends on your usage prediction.

In summary, Quality of Service is still an important element of deploying a cloud voice solution, but you must understand your usage profile to weigh up the reward against the effort to implement. If you’re going 100% cloud then QoS will only play a meaningful role in internal P2P comms. You must focus your efforts on ensuring there is sufficient capacity and performance in your external network to support a good standard of quality, albeit an uncontrolled one. If you want the best experience in this scenario, then ExpressRoute may be a consideration for you, but it is not necessarily mandatory.

3 thoughts on “Microsoft Teams – Quality of Service”

  1. Good article. Seeing that the 365 portal doesn’t include any DSCP values to choose along with the port range choices, it is most likely just an inband provision setting to make sure all teams clients are using the custom ports for the different modalities, allowing us to stamp our QoS values per those port ranges (which you’ve done via GPO). We had the same inband provisioning requirement for lync/skype onprem and now glad to see MS has made this setting available for teams. Without this portal port setting, it’s possible a/v/sharing could have port overlap and we could be marking traffic for the wrong modality.

  2. Hey Mark,

    Thanks for the informative article. Our network team is saying that the current QoS for VoIP phone use will be retired when we implement SfB/Teams, as the per-site WAN bandwidth now has massive capacity, eliminating the threat of congestion. Also, they now have private peering with Microsoft, addressing latency concerns. This doesn’t feel right but I can’t come up with a solid reason to dispute their intention to remove QoS. Are they right?

    Regards,
    Matt

    1. Hi Matt

      There are many arguments between networks teams and SfB implementers over this topic. Basically it boils down to networks not wanting to put the effort in to configuring hundreds of switches and routers to support the recommendations. Unfortunately, simply increasing capacity does not remove or fix quality issues. This is a false assumption. In fact bigger “pipes” provide more scope for quality issues. By the nature of converged voice and data you’re simply going to amplify the issue. For instance, I was diagnosing a poor conference the other day and it was a participant on Virgin Media 1 Gbps broadband that was the problem stream. Switches and routers need to understand how to prioritise voice, and this can only be done using DSCP. Without it you’re going to get jitter, delay and packet loss because switches will queue the traffic when the network gets congested. OK, congestion may be a small factor, but what you need to remember is that data packets are not sent at a constant rate; traffic is nearly always bursty. During these burst periods, your voice packets are going to be affected. QoS protects voice packets from these bursts and makes sure they are never queued, lost or affected by other data transmissions. Also, I’m not sure what they are on about with private peering. Have they deployed ExpressRoute for Office 365?
