★★☆ Intermediate · WAN / Service Provider
Tags: SD-WAN, Viptela, Cisco, WAN, Best Practices, Troubleshooting

Cisco SD-WAN Best Practices: Design, Policy, and Troubleshooting

March 10, 2026 · 11 min read

Overview

Cisco Catalyst SD-WAN (formerly Viptela) separates the WAN into distinct planes: a centralized control plane run by vSmart controllers, an orchestration plane handled by vBond, and a management plane provided by vManage, with the data plane living on the edge routers. Edge routers (WAN Edges / vEdges) form encrypted, BFD-monitored tunnels across any transport (MPLS, broadband, LTE, or satellite) and apply application-aware routing policies that are defined in vManage and distributed by vSmart. When it works, SD-WAN dramatically simplifies WAN operations. When it breaks, the layered architecture means failures can be hard to isolate without a structured approach.


[Figure: Cisco SD-WAN control, data, and orchestration planes. vManage (NMS / management plane), the vBond orchestrator, and the vSmart controller (OMP) connect to the WAN Edges over NETCONF and DTLS/TLS. WAN Edge Site A (MPLS + LTE) and WAN Edge Site B (MPLS + broadband) are joined across Internet / MPLS by an IPsec tunnel monitored by BFD. The control plane (OMP) is separate from the data plane (IPsec), so failures affect each independently.]

Part 1 - Architecture and Design Principles

1.1 - Understand the Four Components

Every Cisco SD-WAN deployment has four roles. Understanding what each does is essential for troubleshooting:

  • vManage: the single pane of glass. Pushes config templates, policies, and software. All GUI and REST API access goes here. If vManage is down, the overlay continues to run; existing tunnels and policies are unaffected.
  • vBond: the orchestrator. Every WAN Edge contacts vBond first during onboarding to discover the vSmart and vManage addresses. vBond must be publicly reachable. After initial onboarding, WAN Edges do not rely on vBond for data plane operation.
  • vSmart: the controller. Runs OMP (Overlay Management Protocol) and distributes routes and policies to all WAN Edges. A vSmart failure stops route and policy updates but does not drop existing tunnels.
  • WAN Edge (vEdge/cEdge): the data plane. Builds IPsec tunnels to all other edges, runs BFD for tunnel health monitoring, and applies the policies pushed by vSmart.
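
The failure behavior described in the list above can be captured as a small lookup table, which is handy when deciding where to look first. This is a minimal sketch: the dictionary layout and the `first_checks` helper are illustrative names, not part of any Cisco API, and the WAN Edge entry is simply a reading of "the data plane lives on the edge".

```python
# Impact of losing each SD-WAN role, summarized from the component list.
# Keys and helper names are hypothetical, chosen for illustration only.
FAILURE_IMPACT = {
    "vmanage": {
        "data_plane_affected": False,
        "lost": ["GUI and REST API access", "template, policy, and software pushes"],
    },
    "vbond": {
        "data_plane_affected": False,
        "lost": ["onboarding of new WAN Edges", "controller discovery"],
    },
    "vsmart": {
        "data_plane_affected": False,
        "lost": ["OMP route updates", "policy updates"],
    },
    "wan_edge": {
        "data_plane_affected": True,
        "lost": ["IPsec tunnels terminating on that edge"],
    },
}


def first_checks(symptom_is_traffic_loss: bool) -> list[str]:
    """Return the roles whose failure matches the observed kind of symptom."""
    return sorted(
        role
        for role, impact in FAILURE_IMPACT.items()
        if impact["data_plane_affected"] == symptom_is_traffic_loss
    )


print(first_checks(True))
print(first_checks(False))
```

If users report dropped traffic while the controllers are healthy, the table points at the edge and its tunnels first; if only configuration or policy changes are failing, it points at the controllers.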

1.2 - Transport Independence

Design your underlay transports as completely independent failure domains:

  • Use at least two transport types (e.g., MPLS + broadband, or MPLS + LTE), and never two circuits from the same provider
  • Assign each transport to a separate color in SD-WAN (e.g., mpls, biz-internet, lte)
  • Use restrict on colors only when you need strict transport separation (e.g., never route voice over LTE)
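
These rules are easy to check mechanically against a site inventory. The sketch below assumes a hypothetical `Transport` record with `color`, `kind`, and `provider` fields; none of these names come from vManage, they only illustrate the checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transport:
    color: str     # SD-WAN color, e.g. "mpls", "biz-internet", "lte"
    kind: str      # underlay type, e.g. "mpls", "broadband", "lte"
    provider: str  # carrier name (hypothetical inventory field)


def transport_design_issues(transports: list[Transport]) -> list[str]:
    """Return the design rules a site violates (empty list means it passes)."""
    issues = []
    if len({t.kind for t in transports}) < 2:
        issues.append("fewer than two transport types")
    providers = [t.provider for t in transports]
    if len(set(providers)) < len(providers):
        issues.append("two circuits share a provider")
    colors = [t.color for t in transports]
    if len(set(colors)) < len(colors):
        issues.append("two transports share a color")
    return issues


good_site = [
    Transport("mpls", "mpls", "CarrierA"),
    Transport("biz-internet", "broadband", "CarrierB"),
]
bad_site = [
    Transport("biz-internet", "broadband", "CarrierB"),
    Transport("biz-internet", "broadband", "CarrierB"),
]

print(transport_design_issues(good_site))
print(transport_design_issues(bad_site))
```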

1.3 - Template-Driven Configuration

Never push ad-hoc CLI to WAN Edges in vManage-managed deployments. Always use feature templates:

  • One Device Template per device type (ISR 1100, ISR 4K, C8000v, etc.)
  • Feature templates for: System, VPN 0 (transport), VPN 512 (management), Service VPNs
  • Use variables ({{hostname}}, {{system-ip}}) for per-device values, which keeps templates reusable
  • Attach the same template to all devices of the same role, which enforces consistency and makes auditing straightforward
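
To make the variable idea concrete, here is a minimal sketch of how `{{name}}` placeholders separate a shared template from per-device values. It is not how vManage renders templates internally; the `render` function and the small `SYSTEM_TEMPLATE` fragment are assumptions made for illustration.

```python
import re

# Matches {{hostname}}, {{system-ip}}, and similar placeholders.
VARIABLE = re.compile(r"\{\{\s*([\w-]+)\s*\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Substitute per-device values, refusing to render if any are missing."""
    missing = sorted({name for name in VARIABLE.findall(template) if name not in values})
    if missing:
        raise KeyError(f"missing template variables: {', '.join(missing)}")
    return VARIABLE.sub(lambda match: values[match.group(1)], template)


# Illustrative fragment only; a real System feature template has more fields.
SYSTEM_TEMPLATE = "system\n host-name {{hostname}}\n system-ip {{system-ip}}\n"

print(render(SYSTEM_TEMPLATE, {"hostname": "branch-01", "system-ip": "10.1.0.1"}))
```

Failing loudly on a missing variable mirrors the reason for using templates in the first place: every device of a role gets the same structure, and only the declared per-device values differ.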

Part 2 - OMP and Routing Best Practices

OMP (Overlay Management Protocol) is the SD-WAN control plane. It runs between WAN Edges and vSmart over a DTLS/TLS connection and carries three route types: OMP routes (learned prefixes), TLOC routes (transport location endpoints), and service routes.

# Verify OMP sessions on WAN Edge
WAN-Edge# show sdwan omp summary
WAN-Edge# show sdwan omp peers

# Check OMP routes received from vSmart
WAN-Edge# show sdwan omp routes
WAN-Edge# show sdwan omp routes vpn 1

# Check TLOC routes (transport endpoints)
WAN-Edge# show sdwan omp tlocs
WAN-Edge# show sdwan omp tlocs detail

# Verify service-side routes being advertised into OMP
WAN-Edge# show sdwan omp advertised-routes

2.1 - Route Policy Best Practices

  • Always use centralized data policies for application-aware routing; do not use local policies except for edge cases
  • Apply policies to site lists rather than individual devices, so policy changes apply consistently across all sites in the list
  • Use SLA classes to define acceptable loss/latency/jitter thresholds per application class, then reference them in AAR policies
  • Keep route policies simple: prefer hub-and-spoke topologies for branch sites, full-mesh only for DC-to-DC
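
Because policies attach to site lists, a common troubleshooting question is "which policies should this site actually receive?". The sketch below answers it offline, assuming site lists are written as comma-separated site IDs and ranges such as "100-199". The list names, policy names, and function names are hypothetical.

```python
def expand_site_list(spec: str) -> set[int]:
    """Expand a spec such as "10,20,100-102" into a set of site IDs."""
    sites: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(value) for value in part.split("-", 1))
            sites.update(range(low, high + 1))
        else:
            sites.add(int(part))
    return sites


def policies_for_site(
    site_id: int,
    site_lists: dict[str, str],
    bindings: dict[str, list[str]],
) -> list[str]:
    """Return the policies whose site lists contain the given site ID."""
    matched = []
    for policy, list_names in bindings.items():
        if any(site_id in expand_site_list(site_lists[name]) for name in list_names):
            matched.append(policy)
    return sorted(matched)


# Hypothetical example data.
SITE_LISTS = {"BRANCHES": "100-199", "DATACENTERS": "10,20"}
BINDINGS = {"hub-and-spoke": ["BRANCHES"], "dc-full-mesh": ["DATACENTERS"]}

print(policies_for_site(150, SITE_LISTS, BINDINGS))
print(policies_for_site(300, SITE_LISTS, BINDINGS))
```

A site that returns an empty list here is a likely candidate for the "policy not applied" symptom covered later.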

Part 3 - Application-Aware Routing (AAR)

AAR is SD-WAN's primary value proposition: automatically shifting traffic to the best transport based on real-time BFD measurements.

# Check BFD tunnel status and metrics per color
WAN-Edge# show sdwan bfd sessions
WAN-Edge# show sdwan bfd sessions detail

# BFD output shows per-tunnel loss/latency/jitter
# State: up = healthy | down = path failed | NA = not applicable

# Check which path a specific application is using
WAN-Edge# show sdwan policy service-path vpn 1 interface ge0/0 source-ip 10.1.0.10 dest-ip 10.2.0.10 protocol 6 dest-port 443

# View active AAR decisions
WAN-Edge# show sdwan app-route stats
WAN-Edge# show sdwan app-route sla-class
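
The decision AAR makes can be pictured with a deliberately simplified model: keep the tunnels whose measured loss, latency, and jitter satisfy the SLA, and fall back to any tunnel that is up if none qualify. This is a sketch of the idea only, not Cisco's actual algorithm, and every class and function name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TunnelStats:
    color: str
    up: bool
    loss_pct: float
    latency_ms: float
    jitter_ms: float


@dataclass
class Sla:
    max_loss_pct: float
    max_latency_ms: float
    max_jitter_ms: float


def meets(tunnel: TunnelStats, sla: Sla) -> bool:
    """True when the tunnel is up and every metric is under its threshold."""
    return (
        tunnel.up
        and tunnel.loss_pct < sla.max_loss_pct
        and tunnel.latency_ms < sla.max_latency_ms
        and tunnel.jitter_ms < sla.max_jitter_ms
    )


def choose_colors(tunnels: list[TunnelStats], sla: Sla, fallback: bool = True) -> list[str]:
    """Return SLA-compliant colors, or any live color when falling back."""
    compliant = [t.color for t in tunnels if meets(t, sla)]
    if compliant:
        return compliant
    if fallback:
        return [t.color for t in tunnels if t.up]
    return []


VOICE = Sla(max_loss_pct=1.0, max_latency_ms=150.0, max_jitter_ms=30.0)
TUNNELS = [
    TunnelStats("mpls", True, 0.1, 20.0, 2.0),
    TunnelStats("lte", True, 3.0, 180.0, 40.0),
    TunnelStats("biz-internet", False, 0.0, 0.0, 0.0),
]

print(choose_colors(TUNNELS, VOICE))
```

With `fallback=False` and no compliant tunnel, the function returns an empty list, which corresponds to traffic having nowhere to go when no backup path is allowed.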

3.1 - SLA Class Design

# Good SLA class design, tiered by application sensitivity:
# Voice/Video: loss < 1%, latency < 150ms, jitter < 30ms
# Critical apps: loss < 2%, latency < 300ms
# Best effort: no SLA, any available path

# Verify SLA class hits
WAN-Edge# show sdwan app-route sla-class name VOICE-SLA
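
The three tiers above translate directly into a threshold table. The following sketch checks one set of measurements against each tier; the class names and dictionary keys are illustrative, and only the numeric thresholds come from the design above.

```python
# Thresholds from the tiered design; a missing metric means "no limit".
SLA_CLASSES = {
    "voice-video": {"loss_pct": 1.0, "latency_ms": 150.0, "jitter_ms": 30.0},
    "critical-apps": {"loss_pct": 2.0, "latency_ms": 300.0},
    "best-effort": {},
}


def classes_met(measured: dict[str, float]) -> list[str]:
    """Return the SLA classes whose every threshold the measurements satisfy."""
    return [
        name
        for name, limits in SLA_CLASSES.items()
        if all(measured[metric] < limit for metric, limit in limits.items())
    ]


print(classes_met({"loss_pct": 0.2, "latency_ms": 40.0, "jitter_ms": 5.0}))
print(classes_met({"loss_pct": 1.5, "latency_ms": 200.0, "jitter_ms": 50.0}))
```

A path that only qualifies for best effort is still usable, which is why the lowest tier carries no thresholds at all.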

Part 4 - Troubleshooting SD-WAN

Step 1 - Control Plane: vBond Reachability

# Check if WAN Edge can reach vBond (first step for any onboarding issue)
WAN-Edge# show sdwan control connections
WAN-Edge# show sdwan control connection-history

# State should be: vbond=up, vsmart=up, vmanage=up
# If vbond=connecting: check DNS, NAT, firewall (UDP 12346 must be open)

# Verify certificate and organization name
WAN-Edge# show sdwan certificate serial
WAN-Edge# show sdwan certificate validity
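
The triage logic in this step can be written down as a tiny function that turns observed controller states into next actions. It is a sketch under the assumption that you have already collected the per-controller state by hand or with your own tooling; `triage_control` and its input format are hypothetical.

```python
EXPECTED_PEERS = ("vbond", "vsmart", "vmanage")


def triage_control(states: dict[str, str]) -> list[str]:
    """Map controller states to the checks suggested in this step."""
    findings = []
    for peer in EXPECTED_PEERS:
        state = states.get(peer, "missing")
        if state == "up":
            continue
        if peer == "vbond":
            findings.append(f"vbond is {state}: check DNS, NAT, and firewall (UDP 12346)")
        else:
            findings.append(
                f"{peer} is {state}: review 'show sdwan control connection-history'"
            )
    return findings


print(triage_control({"vbond": "connecting", "vsmart": "up", "vmanage": "up"}))
```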

Step 2 - Data Plane: BFD and Tunnel Health

WAN-Edge# show sdwan control connections
PEER     PEER  PEER       SITE  DOMAIN  PEER           PEER   CONTROLLER
TYPE     PROT  SYSTEM IP  ID    ID      PRIVATE IP     PORT   GROUP ID    STATE
vsmart   dtls  10.0.0.1   1     0       198.51.100.10  12346  default     up
vsmart   dtls  10.0.0.1   1     0       198.51.100.11  12346  default     up
vmanage  dtls  10.0.0.2   1     0       203.0.113.20   12346  default     up
vbond    udp   10.0.0.3   0     0       203.0.113.30   12346  0           up

WAN-Edge# show sdwan bfd sessions
                           SOURCE TLOC             REMOTE TLOC
SYSTEM IP  SITE ID  STATE  COLOR        IP         COLOR        ENCAP  TRANSITIONS
10.1.0.1   10       up     mpls         10.10.1.1  mpls         ipsec  0
10.1.0.1   10       up     lte          10.10.1.1  lte          ipsec  2
10.2.0.1   20       up     mpls         10.20.1.1  mpls         ipsec  0
10.2.0.1   20       up     biz-int      10.20.1.1  biz-int      ipsec  7

# Check all BFD sessions; down sessions prevent data plane traffic
WAN-Edge# show sdwan bfd sessions
WAN-Edge# show sdwan bfd summary

# Check tunnel interface status
WAN-Edge# show sdwan interface
WAN-Edge# show sdwan tunnel statistics

# Ping over a specific color/transport
WAN-Edge# ping sdwan 10.2.0.1 vpn 0 source ge0/0
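
A session that is up but has a high TRANSITIONS count is flapping, which is easy to miss by eye. The sketch below parses the data rows of the sample `show sdwan bfd sessions` output from this step and flags such sessions. The eight-column layout follows that abridged sample (real device output carries more columns), and the threshold of 5 transitions is an arbitrary assumption.

```python
# Data rows copied from the sample BFD output in this step.
SAMPLE = """\
10.1.0.1 10 up mpls 10.10.1.1 mpls ipsec 0
10.1.0.1 10 up lte 10.10.1.1 lte ipsec 2
10.2.0.1 20 up mpls 10.20.1.1 mpls ipsec 0
10.2.0.1 20 up biz-int 10.20.1.1 biz-int ipsec 7
"""


def parse_bfd_rows(text: str) -> list[dict]:
    """Parse whitespace-separated BFD rows, skipping headers and blank lines."""
    sessions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or not fields[1].isdigit() or not fields[7].isdigit():
            continue
        system_ip, site_id, state, local_color, ip, remote_color, encap, transitions = fields
        sessions.append(
            {
                "system_ip": system_ip,
                "site_id": int(site_id),
                "state": state,
                "local_color": local_color,
                "ip": ip,
                "remote_color": remote_color,
                "encap": encap,
                "transitions": int(transitions),
            }
        )
    return sessions


def unstable_sessions(sessions: list[dict], max_transitions: int = 5) -> list[tuple]:
    """Return (system_ip, color, transitions) for down or flapping sessions."""
    return [
        (s["system_ip"], s["local_color"], s["transitions"])
        for s in sessions
        if s["state"] != "up" or s["transitions"] > max_transitions
    ]


print(unstable_sessions(parse_bfd_rows(SAMPLE)))
```

In the sample, only the biz-int tunnel to 10.2.0.1 exceeds the threshold, so that underlay is the one worth investigating even though every session currently shows up.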

Step 3 - Policy Not Applied

# Check active policies on WAN Edge
WAN-Edge# show sdwan policy from-vsmart
WAN-Edge# show sdwan policy data-policy-filter

# Verify policy counters (shows if traffic is matching policy)
WAN-Edge# show sdwan policy data-policy-filter detail

# On vManage: check policy push status
# Monitor > Devices > [device] > Real-Time > Policy

Step 4 - Service VPN Routing Issues

# Check service VPN routing table
WAN-Edge# show ip route vrf 1
WAN-Edge# show sdwan omp routes vpn 1 detail

# Ping from service VPN
WAN-Edge# ping vrf 1 10.2.0.10 source ge0/2

# Check NAT for DIA (direct internet access) traffic
WAN-Edge# show ip nat translations vrf 1

Quick Reference - Common SD-WAN Issues

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| WAN Edge stuck in connecting to vBond | UDP 12346 blocked, wrong vBond IP, cert mismatch | Open UDP 12346, verify vBond IP in system config, check certificate validity |
| OMP session up but no routes received | Site list / VPN list not matching policy, vSmart policy error | Verify site-list and vpn-list assignments in vManage policy |
| BFD sessions down on one color | Underlay transport issue, firewall blocking UDP 12346 | Verify underlay connectivity, check provider ACLs on that transport |
| AAR not shifting traffic despite SLA violation | SLA class thresholds too tight, no fallback path, polling interval | Review SLA thresholds, ensure backup color is available and BFD is up |
| Traffic not matching data policy | Wrong site-list, DSCP not set, app-list not matching | Use `show sdwan policy data-policy-filter` to verify match counters |
| vManage cannot push template | Device unreachable via vManage tunnel, cert expired | Check NETCONF connectivity (TCP 830), verify certificate status |
| DIA traffic not working | NAT not configured in service VPN, default route missing | Add NAT in VPN 1, verify default route via transport VPN 0 |

SD-WAN Hardening Checklist

  • All WAN Edges use signed certificates from vManage; never use self-signed in production
  • vBond is deployed in a DMZ and reachable from all transport IPs (UDP 12346)
  • Two vSmart controllers deployed for HA; never rely on a single controller
  • vManage is backed up daily (request nms configuration-db backup)
  • All device templates use variables, with no hardcoded values except role-specific config
  • BFD timers are tuned per transport (MPLS: 1s × 6, LTE: 3s × 5)
  • AAR SLA classes defined for voice, critical apps, and best-effort tiers
  • restrict keyword used on colors that should never carry specific traffic types
  • Zero Trust: WAN Edges only accept control connections from known controller IPs
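
The BFD timer item in the checklist is worth a quick calculation. Assuming the notation means hello interval × multiplier (an interpretation, not something the checklist states), the time to declare a tunnel down is roughly their product. The function name below is hypothetical.

```python
# (hello interval in seconds, multiplier) per transport, from the checklist.
BFD_TIMERS = {"mpls": (1.0, 6), "lte": (3.0, 5)}


def detection_time_s(hello_interval_s: float, multiplier: int) -> float:
    """Approximate time before consecutive missed hellos mark a tunnel down."""
    return hello_interval_s * multiplier


for color, (hello, multiplier) in BFD_TIMERS.items():
    seconds = detection_time_s(hello, multiplier)
    print(f"{color}: about {seconds:.0f} s to declare a tunnel down")
```

Under that reading, MPLS failures are detected in about 6 seconds and LTE failures in about 15 seconds, trading slower detection on LTE for less probe traffic on a metered link.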