Many businesses rely on SD-WAN services to deliver optimal Internet, cloud, and UC performance. But like any network, SD-WAN can experience network issues that affect user experience. So you need to be ready. Keep reading to learn about the most common SD-WAN issues.
This article is part of a series of articles about monitoring and troubleshooting SD-WAN networks before, during, and after migrations. The articles include:
- How to Monitor SD-WAN Migrations
- How to Monitor MPLS to SD-WAN Migrations
- How to Monitor SD-WAN Networks
- SD-WAN Troubleshooting
- Most Common SD-WAN Issues (this article)
Essentially, the most common SD-WAN issues are caused by network bandwidth congestion (a bottleneck) or high resource usage on network devices (high CPU). These usually occur on the Local Loop or the customer Edge Router, which are both prone to network congestion.
In addition, most of the problems in an ISP's backbone that can cause SD-WAN issues are related to congestion on their peering and transit paths with other networks or Service providers.
Although ISP backbones are more reliable and robust than other network infrastructures, performance issues can still happen.
The same goes for SD-WAN networks in general. SD-WAN vendors promise their solution is magic, but the user experience is not always magical.
Keep reading for more details about where SD-WAN issues occur, and concrete examples of the 3 most common issues and how to identify them.

The weakest link in a network has always been the last mile. The last mile is the final segment of the network, which generally has the lowest speeds, the least route diversity, and the most single points of failure.
SD-WAN networks are no exception to this rule.
So it wasn't a surprise for our team of network pros to discover that 75% of our customer base experiences SD-WAN issues located on the last mile of their networks.
This is why most SD-WAN networks rely on more than one link.
The assumption is that, if a problem occurs, it should not affect all the links at the same time and the SD-WAN Edge Router should be able to load-balance the network sessions on the best link available. But link diversity on its own is not enough to avoid all issues that can happen in SD-WAN Networks.
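To make the load-balancing assumption concrete, here is a minimal sketch of how an edge device might rank links by measured performance. The `LinkStats` fields, the scoring weights, and the sample values are hypothetical and chosen for illustration only; real SD-WAN vendors use their own selection logic.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float


def link_score(stats: LinkStats) -> float:
    # Lower is better. The weights are illustrative assumptions:
    # packet loss is penalized much more heavily than latency or jitter.
    return stats.latency_ms + 2 * stats.jitter_ms + 50 * stats.loss_pct


def pick_best_link(links):
    # Choose the link with the lowest (best) score.
    return min(links, key=link_score)


links = [
    LinkStats("ISP 1", latency_ms=20.0, jitter_ms=2.0, loss_pct=0.0),
    LinkStats("ISP 2", latency_ms=15.0, jitter_ms=3.0, loss_pct=4.0),
]
best = pick_best_link(links)
print(best.name)  # ISP 1
```

Note that this kind of selection only helps when the links fail independently. If the problem sits on a segment shared by every link, such as the edge device itself, the "best" link is still a degraded one.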
To truly understand what SD-WAN issues can happen, you need to first understand what SD-WAN networks look like and where the SD-WAN issues can happen in the network.
The image below is a diagram of an SD-WAN network site communicating with a Data Center, Head Office or IaaS.
A. The Underlay
- The Internet
- Internet Local Loop
- Internet Provider’s Edge Router
- ISP Backbone
- ISP Peering Point
B. The Overlay
- IPsec Tunnel from one site to another
C. The LAN
- SD-WAN Edge
- Core & Distribution Switches
- Access Switches
Before we dive deeper into the 3 most common SD-WAN issues, it’s important to understand that there are a variety of network problems that can affect your SD-WAN network.
Some common SD-WAN issues include:
- Defective cables or connectors
- Bandwidth congestion (bottleneck)
- Device misconfigurations
- Device software issues
- High CPU usage
- Physical/hardware issues
- Human errors
- DNS issues
To identify and troubleshoot SD-WAN issues that severely impact the user experience, you need end-to-end SD-WAN visibility.
To get it, we recommend a modern, decentralized network monitoring tool like Obkio Network Performance Monitoring software, which continuously monitors end-to-end network performance with synthetic traffic using Network Monitoring Agents.
Get started with Obkio’s free trial! Or check out our blog post on How to Monitor SD-WAN Networks.

Once you’ve identified any SD-WAN issues in your network, Obkio’s network monitoring solution will also allow you to collect the data you need to troubleshoot these network problems.
We talk about Obkio’s SD-WAN troubleshooting steps in our article on SD-WAN Troubleshooting, but here is a summary of the steps.
- Analyze live data or alerts received from your monitoring solution to look at what network locations are currently experiencing poor performance
- Isolate the issue and focus on the location with the worst performance
- Look at past historical data
- Isolate when the issue first happened and its pattern
- Once you know what happened, look at your historical traceroutes to pinpoint where the issue happened
- Identify if the issue is internal or external (in the ISP network)
- If the issue is on your ISP’s side, open a support ticket with information from Obkio
- If the problem is internal, resolve internally
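The first few steps above (find the worst location, then find when the issue started) can be sketched as follows. This is only an illustration: the data layout, the site names, the timestamps, and the 2% loss threshold are assumptions, not Obkio's data model or API.

```python
def worst_location(loss_by_site):
    # Return the site with the highest average packet loss.
    return max(
        loss_by_site,
        key=lambda site: sum(loss_by_site[site]) / len(loss_by_site[site]),
    )


def first_onset(samples, threshold_pct=2.0):
    # samples: list of (timestamp, loss_pct) tuples in chronological order.
    # Return the timestamp of the first sample at or above the threshold.
    for timestamp, loss_pct in samples:
        if loss_pct >= threshold_pct:
            return timestamp
    return None


# Hypothetical live data: recent packet loss (%) per location.
loss_by_site = {
    "Branch 1": [0.0, 0.5, 0.0],
    "Branch 2": [0.0, 6.0, 9.0],
    "Branch 3": [1.0, 0.0, 1.0],
}
site = worst_location(loss_by_site)

# Hypothetical historical data for that location.
history = [("09:00", 0.0), ("09:05", 0.0), ("09:10", 6.0), ("09:15", 9.0)]
print(site, first_onset(history))  # Branch 2 09:10
```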
Learn how to troubleshoot SD-WAN issues using Obkio Network Monitoring software and key SD-WAN troubleshooting steps.
As we said above, there are a variety of SD-WAN issues that can occur, but some happen more often than others.
The majority of SD-WAN issues happen in the last mile, generally in the Local Loop or the customer Edge Router.
So we’re going to show you the 3 most common SD-WAN issues using concrete examples, and show you what they look like using Obkio’s Network Monitoring Software. We’re going to focus on SD-WAN issues happening in Branches #1, #2, and #3, which you can see in the Chord Diagram below.
With a tool like Obkio, you can identify and visualize SD-WAN issues, and be alerted as soon as they happen.
The first SD-WAN issue is high CPU usage on SD-WAN Devices affecting all sessions. This generally occurs when a network device does not have enough available resources to manage the throughput.

In the screenshot above, we can see an Obkio Dashboard for Branch #3 with various Obkio performance graphs. The selected view shows performance over the last 8 hours.
Column 1 shows the UDP monitoring session performance from the Branch #3 Monitoring Agent towards the SD-WAN user experience Monitoring Agents.
- The first graph shows the Internet SD-WAN user experience
- The 2 bottom graphs show the experience of the Internet connections (ISP 1 & ISP 2)
After reviewing the information from the dashboard, we can see that:
- There is poor performance caused by high packet loss sequences, affecting all the traffic going through the SD-WAN network
- Both ISP #1 and ISP #2 are being affected
When analyzing the historical data on the dashboard to find a trigger or a pattern, we see that this is an intermittent problem (happens on and off) and doesn’t follow a specific pattern.
For ISP #1 and ISP #2 to be affected, this means that the network problem is happening on a network segment that is common to both ISPs.
Column 2 shows SNMP Polling (Device Monitoring) on the SD-WAN Edge Equipment and metrics for CPU Usage and Bandwidth Usage.
- Let’s focus on the CPU usage on the Firewall
- At the same time that ISP #1 and ISP #2 are experiencing performance issues, we can see that the CPU usage is at 100%
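One way to check that packet loss and CPU saturation really coincide is to measure how often the lossy samples line up with saturated CPU samples. The sketch below assumes both series are sampled at the same timestamps; the thresholds and sample values are hypothetical.

```python
def overlap_ratio(loss_pct, cpu_pct, loss_threshold=2.0, cpu_threshold=95.0):
    # Fraction of lossy samples during which the CPU was also saturated.
    # Both lists must be aligned sample-for-sample (an assumption here).
    lossy = [i for i, loss in enumerate(loss_pct) if loss >= loss_threshold]
    if not lossy:
        return 0.0
    saturated = [i for i in lossy if cpu_pct[i] >= cpu_threshold]
    return len(saturated) / len(lossy)


# Hypothetical aligned samples for one link and the edge device CPU.
loss = [0.0, 0.0, 8.0, 12.0, 0.0, 9.0]
cpu = [35.0, 40.0, 100.0, 100.0, 45.0, 100.0]
print(overlap_ratio(loss, cpu))  # 1.0
```

A ratio close to 1.0 on both ISP links at once points to the shared device rather than to either provider. A low ratio suggests the CPU is not the trigger.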
This is not a local loop issue, and you don’t need to call your ISP. Because both ISPs are affected at the same moment the CPU is saturated, this is a local problem.
This could be in the LAN, or directly on the SD-WAN Edge Router.
Problems on Edge Routers are very common, because they are usually security devices with lots of features and software. The software and features are very resource intensive and can affect your CPU usage.
In Column 3, we can see performance for Zoom and Microsoft Teams call quality.
- When ISP #1 and ISP #2 are experiencing performance issues, it also affects Zoom and Teams call quality.
That’s because, if the CPU of the network device doesn’t have the power to process the packets in real time, you’ll experience high packet loss.
Packet loss can then affect the performance of network devices, as well as UC applications like Zoom and Microsoft Teams.
In this situation, the SD-WAN problem is happening on a local network device, and not in your ISP’s network. So it’s up to you to troubleshoot.
When high CPU usage starts:
- Look at the device logs to understand what process started at this time.
- Identify software bugs in your device.
- Check whether a software update was recently applied and, if so, consider rolling back to the previous software version.
- Update your device’s firmware
- Look at Network Device Monitoring to understand if high CPU usage is happening simultaneously with high bandwidth usage (not in this example).
- If high bandwidth usage is the cause, look at the firewall logs to understand if your traffic is legitimate or not.
- Manage priorities in your Firewall to prioritize certain traffic.
- Upgrade to a bigger device.
After deciding on a resolution, look into the real-time data from Obkio's monitoring tool to see if your chosen course of action solved the issue.
The second SD-WAN issue is on the underlay of ISP #2 and is caused by high bandwidth usage.
In the screenshot above, we can see an Obkio Dashboard for Branch #1 with various performance graphs. The selected view shows performance over the last 8 hours.
Column 1 shows the UDP monitoring session performance from the Branch #1 Monitoring Agent towards the SD-WAN user experience Monitoring Agents.
- The first graph shows the Internet SD-WAN user experience
- The 2 bottom graphs show the experience of the Internet connections (ISP 1 & ISP 2)
ISP #1 doesn’t show any performance issues:
- The solid blue line tells us that the latency is stable
- The different shades of blue suggest there is low jitter
- We don't see any yellow or red bars, which means that no packet loss is detected.
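For readers who want to see how these three metrics relate to raw probe results, here is a minimal sketch. It assumes a list of round-trip times in milliseconds, with `None` marking a lost probe, and uses the mean absolute difference between consecutive replies as a simplified jitter definition. Obkio's own calculations may differ.

```python
from statistics import mean


def summarize(rtts_ms):
    # rtts_ms: round-trip times in ms, with None for a lost probe.
    received = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    latency_ms = mean(received) if received else None
    # Simplified jitter: mean absolute change between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter_ms = mean(diffs) if diffs else 0.0
    return {"latency_ms": latency_ms, "jitter_ms": jitter_ms, "loss_pct": loss_pct}


# Hypothetical samples: a healthy link and a lossy link.
isp1 = [20.0, 21.0, 20.0, 21.0, 20.0]
isp2 = [20.0, None, 22.0, None, None]
print(summarize(isp1))
print(summarize(isp2))
```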
ISP #2 shows a clear performance issue caused by high packet loss measurements.
We need to focus on the top graph in the first column, which is the user experience.
When ISP #2 started experiencing issues, the users were on that link and experienced the issue too. At some point, the SD-WAN service switched from ISP #2 to ISP #1.
From a user standpoint the issue stopped after the switch to ISP #1, but ISP #2 continued to experience issues even though it was no longer being used.
Later, the issue seems to stop and the SD-WAN service switches back to ISP #2. Then the issue returns on ISP #2 and the users start experiencing it again.
Column 2 shows SNMP Polling (Device Monitoring) on the SD-WAN Edge Equipment and metrics for CPU Usage and Bandwidth Usage.
- Let’s focus on the Bandwidth usage on WAN Port #2
- At the same time that ISP #2 is experiencing high packet loss, we can see that the bandwidth usage exceeds the 500 Mbps of bandwidth available on the service.
From here we can see that the bandwidth usage is over the limit and determine that the high bandwidth usage is causing the packet loss. Obkio’s tool would have alerted you about the high packet loss with a Smart Notification.
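As a rough illustration of how bandwidth usage is derived from SNMP polling, the sketch below converts two readings of an interface octet counter (for example `ifHCInOctets` from the standard IF-MIB) into bits per second and compares the result to the subscribed capacity. The counter values, the 60-second polling interval, and the 500 Mbps capacity are assumptions for the example.

```python
def throughput_bps(prev_octets, curr_octets, interval_s, counter_bits=64):
    # The modulo handles a counter that wrapped around between two polls.
    delta_octets = (curr_octets - prev_octets) % (2 ** counter_bits)
    return delta_octets * 8 / interval_s


def utilization_pct(bps, capacity_bps):
    return 100.0 * bps / capacity_bps


CAPACITY_BPS = 500_000_000  # assumed 500 Mbps service

# Two hypothetical counter readings taken 60 seconds apart.
bps = throughput_bps(1_000_000_000, 4_900_000_000, interval_s=60)
print(bps)                                 # 520000000.0
print(utilization_pct(bps, CAPACITY_BPS))  # 104.0
```

Usage above 100% of the subscribed rate means the link is being offered more traffic than it can carry, and the excess is dropped, which shows up as packet loss.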
In Column 3, we can see performance for Zoom and Microsoft Teams call quality.
- When ISP #2 is being used and experiences high packet loss, it also affects Zoom and Teams call quality.
This is not a local loop issue, and you don’t need to call your ISP. As with the CPU usage issue in the first example, this is a local problem. Significant traffic is being sent to that port, perhaps from a different application.
Since the SD-WAN problem is happening on a local network device, your ISP can’t help you here.
- Look at the firewall logs to understand if your traffic is legitimate or not.
- Manage priorities in your Firewall to prioritize certain traffic.
- Change the backup schedule
- Rate limit the flow of traffic
- Upgrade your Internet connection bandwidth with your ISP if you’re out of bandwidth.
You can then use Obkio’s Live View to see the effect of the changes you made on ISP #2 in real-time.

The 3rd most common SD-WAN issue is an ISP Local Loop issue on the underlay.
In the screenshot above, we can see an Obkio Dashboard for Branch #2 with various performance graphs. The selected view shows performance over the last 8 hours.
Column 1 shows the UDP monitoring session performance from the Branch #2 Monitoring Agent towards the SD-WAN user experience Monitoring Agents.
- The first graph shows the Internet SD-WAN user experience
- The 2 bottom graphs show the experience of the Internet connections (ISP 1 & ISP 2)
ISP #1 doesn’t show any performance issues. The solid blue line tells us that the latency is stable, and the different shades of blue suggest there is low jitter. We don't see any yellow or red bars, which means that no packet loss is detected.
ISP #2 shows a clear performance issue.
Column 2 shows SNMP Polling (Device Monitoring) on the SD-WAN Edge Equipment and metrics for CPU Usage and Bandwidth Usage.
- Unlike in the previous example, there is no high bandwidth usage being shown.
- This is not a bandwidth issue, nor is it related to a lack of resources on the SD-WAN Edge Router.
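The elimination reasoning used across the three examples can be summarized as a small decision function. This is a simplified sketch of the logic described in this article, not an Obkio feature; the function name, its inputs, and the returned labels are hypothetical.

```python
def triage(isp1_lossy, isp2_lossy, cpu_high, bandwidth_over_capacity):
    # Collect the links currently showing packet loss.
    affected = [
        name
        for name, lossy in (("ISP 1", isp1_lossy), ("ISP 2", isp2_lossy))
        if lossy
    ]
    if not affected:
        return "no issue detected"
    # Rule out local causes first: they are yours to fix.
    if cpu_high:
        return "local: high CPU on the SD-WAN Edge"
    if bandwidth_over_capacity:
        return "local: bandwidth congestion"
    # No local resource problem and a single link affected.
    if len(affected) == 1:
        return f"external: suspect {affected[0]} path, check traceroutes"
    return "unclear: check the segment shared by both ISPs"


print(triage(True, True, True, False))    # first example
print(triage(False, True, False, True))   # second example
print(triage(False, True, False, False))  # third example
```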
In Column 3, we can see HTTP performance for Zoom and Microsoft Teams, which is the same as the Network Response Time of the load-balanced session (top left corner).
- When ISP #2 experiences performance issues, it also affects Zoom and Teams call quality.
- The issues on Zoom and Teams happen around the same time as they occur on ISP #2.
For more information, we’ll use Obkio Vision, Obkio’s free Visual Traceroute tool. It runs continuously and interprets traceroute results to identify network problems in your WAN and over the Internet.
By looking at the traceroute below, the issue seems to be introduced right from the 1st hop, and we can see that only ISP #2 is affected.
The SD-WAN problem is happening on the Local Loop, between the ISP Edge and SD-WAN Edge Equipment.
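The way a traceroute localizes a problem can be sketched in a few lines. The idea is to find the first hop from which packet loss persists all the way to the destination, because loss that appears at one intermediate hop and then disappears is usually just that router rate-limiting its replies. The per-hop loss values and the 1% threshold below are hypothetical, and this is not how Obkio Vision is implemented.

```python
def first_lossy_hop(hop_loss_pct, threshold_pct=1.0):
    # Return the 1-based index of the first hop from which loss persists
    # through to the last hop, or None if no such hop exists.
    first = None
    for hop, loss in enumerate(hop_loss_pct, start=1):
        if loss >= threshold_pct:
            if first is None:
                first = hop
        else:
            # Loss did not carry on to this hop, so the earlier reading
            # is treated as reply rate limiting and discarded.
            first = None
    return first


# Hypothetical per-hop packet loss (%) for each link.
isp1_hops = [0.0, 0.0, 0.0, 0.0]
isp2_hops = [12.0, 11.0, 13.0, 12.0]
print(first_lossy_hop(isp1_hops))  # None
print(first_lossy_hop(isp2_hops))  # 1
```

A result of hop 1 on a single link, as in this example, matches a problem on the Local Loop of that ISP.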
In this case, the problem is related to your ISP, so they are responsible for solving the problem.
Obkio’s Visual Traceroutes are able to identify problems anywhere in your network (ISP and AWS, ISP and Peering etc.), detect that they are a performance issue, and validate that the issue is not on your end.
Learn how to use Obkio Vision’s Visual Traceroute tool to troubleshoot network problems with traceroutes both inside & outside your local network.
Firstly, you want to make sure that you’re not using ISP #2 while you’re waiting for the issue to be resolved.
Secondly, you need to contact your ISP using the information you’ve acquired from Obkio’s app.
- Open a support ticket with your ISP using the screenshots of Monitoring Sessions, Dashboards or Traceroutes in Vision.
- Use Live Monitoring mode for real-time updates and share results of Live Traceroutes with your ISP using a public link.
- If your ISP wants to analyze your data further, you can create a temporary Read-Only User in your Obkio account for them.
Now you’ve just seen some of the most common SD-WAN issues that your network can experience, so you're ready to fight them off!
Remember that SD-WAN issues are inevitable. It’s not about if they happen, it’s about when, how, and where they happen.

To be able to identify and troubleshoot any SD-WAN issues, whether they happen in your network or your ISP’s network, continuously monitor your SD-WAN network using Obkio’s SD-WAN Monitoring tool.
- Monitor your SD-WAN migration
- Continuously monitor SD-WAN performance
- Proactively identify SD-WAN issues anywhere in your network
- Collect the information you need to troubleshoot internally or externally
Get started with Obkio's Free Trial!
