Virtual Router Redundancy Protocol (VRRP) (Junos-Specific)

In this post, we’ll look at how VRRP on Juniper MX routers keeps your network running smoothly by automatically switching to a backup router if the main one goes down. It’s a simple way to add high availability and avoid unexpected outages.

Definitions

Routing Platform

Any Layer-3 capable device that can route packets, for example the SRX series of firewalls, the MX series of routers, or the QFX and EX lines of switches.

Virtual IP Address

A single, shared IP address assigned to a group of devices (in VRRP, the routing platforms in a group) to provide a consistent access point for services. It does not correspond to a single physical network interface.

What is VRRP?

Virtual Router Redundancy Protocol (VRRP) is a first-hop (default gateway) redundancy protocol that allows a routing platform to experience a failure event with minimal disruption to services.

VRRP enables clients on a LAN to make use of redundant routing platforms on that same LAN without requiring more than the configuration of a single default gateway on the clients themselves.

At any time, one of the VRRP routing platforms is the primary and the others are backups. If the primary fails, one of the backup routers will become the new primary automatically.

The order in which the primary and backup routers are set is dictated by an assigned priority number. In VRRP the concepts of groups and priorities are important to understand.

Configuration Preparation

I have my two vMXs directly connected and configured for MC-LAG. I have a separate blog post here that you can use to get started for that. They are connected to a downstream virtual EX which has a single virtual server connected to an access port so we can confirm connectivity once complete.

Groups and Priorities

VRRP groups multiple routing platforms into a virtual router. In our lab we have two Juniper vMX routers offering that redundancy.

Because VRRP is configured at the interface level, our vMXs can be members of multiple groups. It is just not recommended to assign a single interface to multiple groups.

What I mean by that is that different interfaces on the same router can belong to groups 1, 2, and 3, rather than the router as a whole being a member of group 1, 2, or 3. It is not the platform itself that is assigned the group value, just the interface.

Each interface participating in VRRP is assigned a priority number from 1 through 255 within its group. The router whose interface has the highest priority in the group becomes the primary for that group.

To begin, we will create group 1 and our vMXs will be configured as such:

  • vMX-01
    • Group-id: 1
    • Priority: 254
  • vMX-02
    • Group-id: 1
    • Priority: 253

Let’s set some basic VRRP config for our IRB interfaces and assign them to the group using the commands below:

vMX-01

set interfaces irb unit 10 family inet address 10.0.100.2/24 vrrp-group 1 virtual-address 10.0.100.1
set interfaces irb unit 10 family inet address 10.0.100.2/24 vrrp-group 1 priority 254
set interfaces irb unit 10 family inet address 10.0.100.2/24 vrrp-group 1 accept-data

vMX-02

set interfaces irb unit 10 family inet address 10.0.100.3/24 vrrp-group 1 virtual-address 10.0.100.1
set interfaces irb unit 10 family inet address 10.0.100.3/24 vrrp-group 1 priority 253
set interfaces irb unit 10 family inet address 10.0.100.3/24 vrrp-group 1 accept-data

Take note of what stays consistent and what changes:

  1. The virtual-address remains the same on vMX-01 and vMX-02. This IP address is what client devices will use as their gateway.
  2. The IP address configured on the irb.10 interface for each router will be different. We have given both logical interfaces under the IRB an IP different to that of the VIP.
  3. For participating interfaces, the vrrp-group ID number will remain the same on each platform.
  4. The priority number will change depending on your preference for active and backup devices. If the platforms are the same then this will come down to your own individual preference but you usually want the better platform as the primary.
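  5. The accept-data statement allows whichever router is currently the primary to accept packets addressed to the virtual IP address itself (a ping to the gateway, for example). Without it, a primary that does not own the address still forwards traffic sent through the VIP but does not respond to traffic sent to it.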
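To make the "what stays the same, what changes" split concrete, here is a small Python sketch that renders the same set commands we used above from two dictionaries. The names SHARED, ROUTERS, and vrrp_set_commands are made up for illustration; only the rendered commands come from this lab.

```python
# Values that must match on every router in the group.
SHARED = {"unit": 10, "group": 1, "vip": "10.0.100.1"}

# Values that differ per router (taken from the lab above).
ROUTERS = {
    "vMX-01": {"address": "10.0.100.2/24", "priority": 254},
    "vMX-02": {"address": "10.0.100.3/24", "priority": 253},
}


def vrrp_set_commands(address, priority, unit, group, vip):
    """Render the three set commands used in this post for one router."""
    base = (
        f"set interfaces irb unit {unit} family inet "
        f"address {address} vrrp-group {group}"
    )
    return [
        f"{base} virtual-address {vip}",
        f"{base} priority {priority}",
        f"{base} accept-data",
    ]


for name, local in ROUTERS.items():
    print(name)
    for line in vrrp_set_commands(**local, **SHARED):
        print(" ", line)
```

Keeping the shared values in one place is simply a way to avoid typing a different virtual-address or group ID on one of the two routers by accident.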

If no priority is specified, both routers fall back to the same default priority (100 in Junos), so the primary and backup roles are decided by a tie-breaker rather than by your design. It is strongly recommended to force the assignment of primary and backup devices using priorities 1 through 255, where the higher number wins.

Our first group uses 254 for the primary member (vMX-01) and 253 for the backup (vMX-02). If you created a second group, you could reverse the priorities so that vMX-02 is the primary for that group, for example. Priority 255 is reserved for a router whose interface address is the virtual address itself, so if the IP set on the interface is different from the virtual address, you will need to start at 254 like we have.
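The election itself can be sketched in a few lines of Python. This is a simplified model, not how vrrpd is implemented: it assumes the highest priority wins and, as the VRRP RFCs describe, that the higher interface address breaks a tie. The Member class and elect_primary function are hypothetical names.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Member:
    name: str
    address: IPv4Address  # interface address, not the virtual address
    priority: int         # 1-255, higher wins


def elect_primary(members):
    """Highest priority wins; the higher interface address breaks a tie."""
    return max(members, key=lambda m: (m.priority, m.address))


# Our lab: explicit priorities, so the result is predictable.
group_1 = [
    Member("vMX-01", IPv4Address("10.0.100.2"), 254),
    Member("vMX-02", IPv4Address("10.0.100.3"), 253),
]
print(elect_primary(group_1).name)

# Same routers with no priority configured (both at the default of 100):
# the tie-breaker decides, which may not be the router you wanted.
defaults = [
    Member("vMX-01", IPv4Address("10.0.100.2"), 100),
    Member("vMX-02", IPv4Address("10.0.100.3"), 100),
]
print(elect_primary(defaults).name)
```

With explicit priorities the model picks vMX-01. With equal priorities it picks vMX-02 purely because of its higher address, which is why we set priorities by hand.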

Within a VRRP group, the primary and the backup virtual router must be configured on different routing platforms.

It is possible, though not recommended, for an interface to be a member of multiple VRRP groups. You should assign each interface to only one group. For example, irb.10 would belong to group 1 only. Otherwise, a failover event in one of the other groups would also cause this interface to swap ownership to the backup device. We want to avoid as many unnecessary failovers as possible, as flow sessions may be interrupted under certain conditions.

Tracking and Priority Cost Values

In order for the routing platforms participating in VRRP to detect interface changes related to a failover event, we will need to configure tracking and a priority cost for each tracked interface:

vMX-01

set interfaces irb unit 10 family inet address 10.0.100.2/24 vrrp-group 1 track-interface ae0 priority-cost 10

vMX-02

set interfaces irb unit 10 family inet address 10.0.100.3/24 vrrp-group 1 track-interface ae0 priority-cost 10

Now what is tracking, and why did we choose those interfaces?

When you “track” an interface, you are telling the platform that if that interface fails, it should trigger a VRRP event and transfer ownership of the group.

I have chosen the ae0 interfaces rather than the member interfaces for both vMXs because ae0 is configured with the command below:

set interfaces ae0 aggregated-ether-options minimum-links 1

This means that one interface in our bundle of two can fail and ae0 will remain up, which is the point of the configuration. A failure of both would mean a failure of ae0 itself, and that would trigger the VRRP failover event that we desire.
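As a rough mental model of minimum-links, the bundle stays up as long as the number of working member links is at least the configured minimum. The function below is a hypothetical illustration of that rule, not Junos code:

```python
def bundle_is_up(member_links_up, minimum_links=1):
    """Return True while enough member links are up to keep the bundle up.

    member_links_up maps a member interface name to True (up) or False (down).
    """
    return sum(member_links_up.values()) >= minimum_links


# One of two members down: ae0 stays up, so VRRP tracking sees no failure.
print(bundle_is_up({"ge-0/0/0": False, "ge-0/0/3": True}))

# Both members down: ae0 goes down, which is what our tracking reacts to.
print(bundle_is_up({"ge-0/0/0": False, "ge-0/0/3": False}))
```

This is why tracking ae0 rather than the individual ge- interfaces gives us a failover only when the whole bundle is lost.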

You’ll notice when configuring the interfaces for VRRP that we set a priority cost, but we just configured the priority for irb.10 earlier. Why do we need to configure another priority?

The “cost” value that we assign here is the value that is deducted from the VRRP group’s originally configured priority if a tracked interface suffers a failure event. So in the event of a failure, the router will dynamically lower its priority.

If it didn’t, the original priority we set earlier would remain in place and ownership would not transfer: the tracked interface could fail, but vMX-01 would stay primary because its priority would still be the higher one.

So originally we have vMX-01 configured with a priority of 254, and vMX-02 configured with a priority of 253.

If the tracked interface on vMX-01 fails, the priority will be reduced by 10 according to our configuration stanza above, giving it a new value of 244. That is now lower than vMX-02’s 253, so vMX-02 takes over as the primary.

You are not able to configure a value that would reduce the originally configured priority to less than 1. So, if you configure the priority as 254 and make the priority cost a higher number, you will see a failure to commit.
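The arithmetic is simple enough to sketch in Python. The two functions below are hypothetical helpers that model the behavior described here: the cost of every tracked interface that is down is subtracted from the configured priority, and a configuration whose costs would take the priority below 1 is rejected.

```python
def validate_costs(configured_priority, tracked):
    """Reject a config whose combined costs would drop the priority below 1.

    tracked maps an interface name to a (is_up, priority_cost) tuple.
    """
    total_cost = sum(cost for _is_up, cost in tracked.values())
    if configured_priority - total_cost < 1:
        raise ValueError("priority costs would push the priority below 1")


def current_priority(configured_priority, tracked):
    """Subtract the cost of every tracked interface that is currently down."""
    incurred = sum(cost for is_up, cost in tracked.values() if not is_up)
    return configured_priority - incurred


# vMX-01 from this lab with ae0 down.
tracked = {"ae0": (False, 10)}
validate_costs(254, tracked)
vmx01 = current_priority(254, tracked)
print(vmx01)        # 254 - 10
print(vmx01 < 253)  # vMX-02 (253) now has the higher priority
```

When you pick a cost, make sure it is large enough to drop the primary below the backup. A cost of 1 here would only bring vMX-01 level with vMX-02 at 253, not below it.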

We’ll take a look at the priority cost after some preliminary testing. Now let’s get a baseline with everything up and operational.

VRRP Failover Testing

Before conducting any kind of testing, failover or not, we want to confirm a baseline for what is expected and what is not. For this test we want to confirm:

  • All participating interfaces are up
  • LACP is healthy
  • Bridge Domains/VLANs are configured on all participating interfaces
  • VRRP ownership belongs to vMX-01

vMX-01

root@vmx-01> show interfaces terse ge-0/0/0
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                up    up
ge-0/0/0.0              up    up   aenet    --> ae0.0
ge-0/0/0.32767          up    up   aenet    --> ae0.32767

root@vmx-01> show interfaces terse ge-0/0/3
Interface               Admin Link Proto    Local                 Remote
ge-0/0/3                up    up
ge-0/0/3.0              up    up   aenet    --> ae0.0
ge-0/0/3.32767          up    up   aenet    --> ae0.32767

root@vmx-01> show interfaces terse ae0
Interface               Admin Link Proto    Local                 Remote
ae0                     up    up
ae0.0                   up    up   bridge
ae0.32767               up    up   multiservice
root@vmx-01> show lacp interfaces
Aggregated interface: ae0
    LACP state:           Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-0/0/0           Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/0         Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/3           Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/3         Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
    LACP protocol:        Receive State  Transmit State          Mux State
      ge-0/0/0                  Current   Fast periodic Collecting distributing
      ge-0/0/3                  Current   Fast periodic Collecting distributing
root@vmx-01> show bridge domain
Routing instance        Bridge domain            VLAN ID     Interfaces
default-switch          bd10                     10
                                                             ae0.0
                                                             ae2.0
root@vmx-01> show vrrp
Jun 05 13:15:04
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.743 lcl    10.0.100.2
                                                                vip    10.0.100.1

vMX-02

root@vmx-02> show interfaces terse ge-0/0/0
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                up    up
ge-0/0/0.0              up    up   aenet    --> ae0.0
ge-0/0/0.32767          up    up   aenet    --> ae0.32767

root@vmx-02> show interfaces terse ge-0/0/4
Interface               Admin Link Proto    Local                 Remote
ge-0/0/4                up    up
ge-0/0/4.0              up    up   aenet    --> ae0.0
ge-0/0/4.32767          up    up   aenet    --> ae0.32767

root@vmx-02> show interfaces terse ae0
Interface               Admin Link Proto    Local                 Remote
ae0                     up    up
ae0.0                   up    up   bridge
ae0.32767               up    up   multiservice
root@vmx-02> show lacp interfaces
Aggregated interface: ae0
    LACP state:           Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-0/0/4           Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/4         Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/0           Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/0         Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
    LACP protocol:        Receive State  Transmit State          Mux State
      ge-0/0/4                  Current   Fast periodic Collecting distributing
      ge-0/0/0                  Current   Fast periodic Collecting distributing
root@vmx-02> show bridge domain

Routing instance        Bridge domain            VLAN ID     Interfaces
default-switch          bd10                     10
                                                             ae0.0
                                                             ae2.0
root@vmx-02> show vrrp
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   backup   Active      D  2.454 lcl    10.0.100.3
                                                                vip    10.0.100.1
                                                                mas    10.0.100.2

Everything looks good, let’s move on to the next step.

VRRP Failover Testing Step 1:

First we’re going to disable ge-0/0/0 on vMX-01 to see how LACP and VRRP react:

vMX-01

root@vmx-01> edit
Jun 05 12:52:56
Entering configuration mode

[edit]
root@vmx-01# edit interfaces ge-0/0/0
Jun 05 12:53:02

[edit interfaces ge-0/0/0]
root@vmx-01# set disable
Jun 05 12:53:04

[edit interfaces ge-0/0/0]
root@vmx-01# commit
Jun 05 12:53:08
commit complete
root@vmx-01# run show interfaces terse ge-0/0/0
Jun 05 12:53:34
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                down  down
ge-0/0/0.0              up    down aenet    --> ae0.0
ge-0/0/0.32767          up    down aenet    --> ae0.32767

[edit]
root@vmx-01# run show interfaces terse ae0
Jun 05 12:53:46
Interface               Admin Link Proto    Local                 Remote
ae0                     up    up
ae0.0                   up    up   bridge
ae0.32767               up    up   multiservice
root@vmx-01# run show lacp interfaces
Jun 05 12:54:21
Aggregated interface: ae0
    LACP state:           Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-0/0/0           Actor    No   Yes    No   No   No   Yes     Fast    Active
      ge-0/0/0         Partner    No   Yes    No   No   No   Yes     Fast   Passive
      ge-0/0/3           Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/3         Partner    No    No   Yes  Yes  Yes   Yes     Fast    Active
    LACP protocol:        Receive State  Transmit State          Mux State
      ge-0/0/0            Port disabled     No periodic           Detached
      ge-0/0/3                  Current   Fast periodic Collecting distributing
root@vmx-01# run show vrrp
Jun 05 12:54:25
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.685 lcl    10.0.100.2
                                                                vip    10.0.100.1

Everything looks as we had expected. The physical interface is down, LACP shows the interface as detached, but vMX-01 is still the VRRP master for group 1.

Now let’s confirm everything is appropriate on vMX-02 also before moving on:

vMX-02

root@vmx-02# run show vrrp
Jun 05 12:55:25
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   backup   Active      D  2.922 lcl    10.0.100.3
                                                                vip    10.0.100.1
                                                                mas    10.0.100.2

Let’s reenable the interface on vMX-01 now and move on to the next stage of testing:

vMX-01

root@vmx-01# rollback 1
Jun 05 12:57:06
load complete

[edit]
root@vmx-01# show | compare
Jun 05 12:57:13
[edit interfaces ge-0/0/0]
-   disable;

[edit]
root@vmx-01# commit and-quit
Jun 05 12:57:16
commit complete

VRRP Failover Testing Step 2:

We’re now going to do two things and make sure the VRRP mastership for group 1 is transferred to vMX-02 from vMX-01:

  • Disable and reenable both interfaces in the ae0 bundle
  • Disable and reenable the ae0 bundle itself

Disabling the physical interfaces first on vMX-01:

vMX-01

root@vmx-01> edit
Jun 05 12:59:43
Entering configuration mode

[edit]
root@vmx-01# edit interfaces
Jun 05 12:59:45

[edit interfaces]
root@vmx-01# set ge-0/0/0 disable
Jun 05 12:59:49

[edit interfaces]
root@vmx-01# set ge-0/0/3 disable
Jun 05 12:59:53

[edit interfaces]
root@vmx-01# commit
Jun 05 12:59:56
commit complete
root@vmx-01# run show interfaces terse ge-0/0/0
Jun 05 13:01:12
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                down  down
ge-0/0/0.0              up    down aenet    --> ae0.0
ge-0/0/0.32767          up    down aenet    --> ae0.32767

[edit interfaces]
root@vmx-01# run show interfaces terse ge-0/0/3
Jun 05 13:01:14
Interface               Admin Link Proto    Local                 Remote
ge-0/0/3                down  down
ge-0/0/3.0              up    down aenet    --> ae0.0
ge-0/0/3.32767          up    down aenet    --> ae0.32767

[edit interfaces]
root@vmx-01# run show interfaces terse ae0
Jun 05 13:01:17
Interface               Admin Link Proto    Local                 Remote
ae0                     up    down
ae0.0                   up    down bridge
ae0.32767               up    down multiservice
root@vmx-01# run show lacp interfaces
Jun 05 13:01:38
Aggregated interface: ae0
    LACP state:           Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-0/0/0           Actor    No   Yes    No   No   No   Yes     Fast    Active
      ge-0/0/0         Partner    No   Yes    No   No   No   Yes     Fast   Passive
      ge-0/0/3           Actor    No   Yes    No   No   No   Yes     Fast    Active
      ge-0/0/3         Partner    No   Yes    No   No   No   Yes     Fast   Passive
    LACP protocol:        Receive State  Transmit State          Mux State
      ge-0/0/0            Port disabled     No periodic           Detached
      ge-0/0/3            Port disabled     No periodic           Detached
root@vmx-01# run show vrrp
Jun 05 13:01:49
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   backup   Active      D  2.927 lcl    10.0.100.2
                                                                vip    10.0.100.1
                                                                mas    10.0.100.3
root@vmx-01# run show log messages | match vrrpd
Jun 05 13:02:01
Jun  5 12:59:59  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1

As hoped, VRRP mastership has changed to vMX-02, but let’s confirm on the device itself:

vMX-02

root@vmx-02> show vrrp
Jun 05 13:02:42
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.516 lcl    10.0.100.3
                                                                vip    10.0.100.1
root@vmx-02> show log messages | match vrrpd
Jun 05 13:02:58
Jun  5 12:59:59  vmx-02 vrrpd[6642]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.3) became VRRP master for group 1 with master reason notMaster

OK let’s reenable the physical interfaces and confirm that vMX-01 becomes the master once again:

vMX-01

root@vmx-01# rollback 1
Jun 05 13:03:29
load complete

[edit]
root@vmx-01# show | compare
Jun 05 13:03:31
[edit interfaces ge-0/0/0]
-   disable;
[edit interfaces ge-0/0/3]
-   disable;

[edit]
root@vmx-01# commit
Jun 05 13:03:32
commit complete
root@vmx-01# run show vrrp
Jun 05 13:03:44
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.721 lcl    10.0.100.2
                                                                vip    10.0.100.1
root@vmx-01# run show log messages | match vrrpd
Jun 05 13:04:02
Jun  5 12:59:59  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1
Jun  5 13:03:37  vmx-01 vrrpd[6626]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.2) became VRRP master for group 1 with master reason notMaster

Everything looks appropriate. Now let’s disable ae0:

root@vmx-01> edit
Jun 05 13:04:32
Entering configuration mode

[edit]
root@vmx-01# edit interfaces
Jun 05 13:04:33

[edit interfaces]
root@vmx-01# set ae0 disable
Jun 05 13:04:36

[edit interfaces]
root@vmx-01# commit
Jun 05 13:04:40
commit complete
root@vmx-01# run show interfaces terse ge-0/0/0
Jun 05 13:05:04
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                down  down
ge-0/0/0.0              up    down aenet    --> ae0.0
ge-0/0/0.32767          up    down aenet    --> ae0.32767

[edit]
root@vmx-01# run show interfaces terse ge-0/0/3
Jun 05 13:05:06
Interface               Admin Link Proto    Local                 Remote
ge-0/0/3                down  down
ge-0/0/3.0              up    down aenet    --> ae0.0
ge-0/0/3.32767          up    down aenet    --> ae0.32767

[edit]
root@vmx-01# run show interfaces terse ae0
Jun 05 13:05:09
Interface               Admin Link Proto    Local                 Remote
ae0                     down  down
ae0.0                   up    down bridge
ae0.32767               up    down multiservice
root@vmx-01# run show lacp interfaces
Jun 05 13:05:22
Aggregated interface: ae0
    LACP state:           Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-0/0/0           Actor    No   Yes    No   No   No   Yes     Fast    Active
      ge-0/0/0         Partner    No   Yes    No   No   No   Yes     Fast   Passive
      ge-0/0/3           Actor    No   Yes    No   No   No   Yes     Fast    Active
      ge-0/0/3         Partner    No   Yes    No   No   No   Yes     Fast   Passive
    LACP protocol:        Receive State  Transmit State          Mux State
      ge-0/0/0            Port disabled     No periodic           Detached
      ge-0/0/3            Port disabled     No periodic           Detached
root@vmx-01# run show vrrp
Jun 05 13:05:33
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   backup   Active      D  2.495 lcl    10.0.100.2
                                                                vip    10.0.100.1
                                                                mas    10.0.100.3
root@vmx-01# run show log messages | match vrrpd
Jun 05 13:05:48
Jun  5 12:59:59  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1
Jun  5 13:03:37  vmx-01 vrrpd[6626]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.2) became VRRP master for group 1 with master reason notMaster
Jun  5 13:04:42  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1

Now let’s reenable and confirm mastership for group 1 transfers back to vMX-01:

root@vmx-01# rollback 1
Jun 05 13:06:12
load complete

[edit]
root@vmx-01# show | compare
Jun 05 13:06:17
[edit interfaces ae0]
-   disable;

[edit]
root@vmx-01# commit
Jun 05 13:06:22
commit complete
root@vmx-01# run show vrrp
Jun 05 13:06:46
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.567 lcl    10.0.100.2
                                                                vip    10.0.100.1
root@vmx-01# run show log messages | match vrrpd
Jun 05 13:07:04
Jun  5 12:59:59  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1
Jun  5 13:03:37  vmx-01 vrrpd[6626]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.2) became VRRP master for group 1 with master reason notMaster
Jun  5 13:04:42  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1
Jun  5 13:06:27  vmx-01 vrrpd[6626]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.2) became VRRP master for group 1 with master reason notMaster

So to recap the failover behavior:

  • For vMX-01 we have configured ge-0/0/0 and ge-0/0/3 to be members of ae0
  • In the ae0 configuration we have configured the stanza minimum-links 1
  • Disabling only one link in the ae bundle will not cause a failover because we are only tracking the ae0 interface in the VRRP configuration
  • Disabling both interfaces causes a failover event
  • Disabling the ae0 interface causes a failover event
  • We confirmed with show commands and log messages
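If you want to turn those vrrpd log messages into a quick timeline, a short Python script can do it. This is only a sketch: the regular expression is written against the exact message format shown in the outputs above, and other Junos releases may word the messages differently.

```python
import re

# Matches the VRRPD_NEW_MASTER / VRRPD_NEW_BACKUP lines shown in this post.
EVENT = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+vrrpd\[\d+\]:\s+"
    r"VRRPD_NEW_(?P<role>MASTER|BACKUP):\s+Interface\s+(?P<interface>\S+)"
    r".*\bfor group (?P<group>\d+)"
)


def mastership_timeline(log_text):
    """Return (time, host, interface, group, role) for each role change."""
    events = []
    for line in log_text.splitlines():
        match = EVENT.search(line)
        if match:
            events.append((
                match["time"],
                match["host"],
                match["interface"],
                int(match["group"]),
                match["role"].lower(),
            ))
    return events


# Log lines copied from the vMX-01 output above.
LOG = """\
Jun  5 12:59:59  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1
Jun  5 13:03:37  vmx-01 vrrpd[6626]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.2) became VRRP master for group 1 with master reason notMaster
Jun  5 13:04:42  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1
Jun  5 13:06:27  vmx-01 vrrpd[6626]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.2) became VRRP master for group 1 with master reason notMaster
"""

for event in mastership_timeline(LOG):
    print(event)
```

Each backup/master pair in the result corresponds to one of our two tests: disabling both member links, then disabling ae0 itself.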

Now that we have tested our VRRP configuration under different conditions and understand what triggers a failover event, we can look at ways to fine-tune it further, what the additional VRRP commands do, and when we may want to use them.

As an important note to consider, VRRP does not support session synchronization between members so any existing sessions will be dropped on the backup device as out-of-state. This is dependent on the timers we’ll configure later.

For now, let’s take a closer look at the priority-cost stanza we configured earlier to see exactly how the priority, and with it mastership, dynamically changes once a failover event occurs.

First, let’s go back to our baseline:

vMX-01

root@vmx-01> show vrrp detail
Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 330, SNMP ifIndex: 548, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1
  Advertisement Timer: 0.565s, Master router: 10.0.100.2
  Virtual router uptime: 12:13:40, Master router uptime: 00:28:18
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes
  Tracking: enabled
    Current priority: 254, Configured priority: 254
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         up                 2g                       0
    Route tracking: disabled
    BFD tracking: disabled

The fields we need to take note of for now are the priority and tracking values. Specifically, take a look at the below:

  • Current priority: 254
  • Configured priority: 254
  • Incurred priority cost: 0

Let’s disable ae0 once more and re-run the command to see what changes:

root@vmx-01> configure
Jun 05 13:41:37
Entering configuration mode

[edit]
root@vmx-01# edit interfaces ae0
Jun 05 13:41:42

[edit interfaces ae0]
root@vmx-01# set disable
Jun 05 13:41:44

[edit interfaces ae0]
root@vmx-01# commit and-quit
Jun 05 13:41:48
commit complete
root@vmx-01> show vrrp detail
Jun 05 13:42:07
Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 330, SNMP ifIndex: 548, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: backup, VRRP Mode: Active
  Priority: 244, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1
  Dead timer: 2.569s, Master priority: 253, Master router: 10.0.100.3
  Virtual router uptime: 12:21:02
  Preferred: yes
  Tracking: enabled
    Current priority: 244, Configured priority: 254
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         down                0                      10
    Route tracking: disabled
    BFD tracking: disabled

Let’s look at the same key fields in this output and see what is different:

  • Current priority: 244
  • Configured priority: 254
  • Incurred priority cost: 10

The difference now is that the failure of the tracked interface has incurred a cost of 10, which is automatically deducted from our configured priority, exactly as we defined earlier with track-interface and priority-cost.

This is how the ownership is transferred and the VRRP group members decide on the primary/backup relationship.

Let’s roll back our changes and take a look now:

root@vmx-01> configure
Jun 05 13:46:12
Entering configuration mode

[edit]
root@vmx-01# rollback 1
Jun 05 13:46:15
load complete
root@vmx-01> show vrrp detail
Jun 05 13:46:39
Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 330, SNMP ifIndex: 548, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1
  Advertisement Timer: 0.316s, Master router: 10.0.100.2
  Virtual router uptime: 12:25:34, Master router uptime: 00:00:16
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes
  Tracking: enabled
    Current priority: 254, Configured priority: 254
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         up                 2g                       0
    Route tracking: disabled
    BFD tracking: disabled

Everything has returned to our initial baseline defined earlier.
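One other field in the detail output is worth a quick note: the Virtual Mac of 00:00:5e:00:01:01. For IPv4, the VRRP RFCs define the virtual MAC as the fixed prefix 00:00:5e:00:01 followed by the group ID as the last byte. A tiny Python sketch (the function name is made up for illustration) shows the mapping:

```python
def vrrp_virtual_mac(group_id):
    """Build the IPv4 VRRP virtual MAC: fixed prefix plus the group ID byte."""
    if not 0 <= group_id <= 255:
        raise ValueError("group ID must fit in a single byte")
    return f"00:00:5e:00:01:{group_id:02x}"


print(vrrp_virtual_mac(1))   # group 1, as used in this lab
print(vrrp_virtual_mac(10))
```

This is why clients do not need to relearn their gateway’s MAC address after a failover: whichever router is primary answers for the same virtual MAC.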

Advertising Intervals, Thresholds, and Failover Delays

Advertising Intervals

In VRRP, only the device that is acting as the primary sends out VRRP advertisements. The backup device does not send any advertisements unless and until it takes over the primary role.

In a working VRRP configuration, let’s take a look at those VRRP messages for irb.10:

vMX-01

root@vmx-01> show vrrp interface irb.10
Jun 08 17:01:26
Interface: irb.10, Interface index :332, Groups: 1, Active :1
  Interface VRRP PDU statistics
    Advertisement sent                       :20
    Advertisement received                   :0
    Packets received                         :0
    No group match received                  :0
  Interface VRRP PDU error statistics
    Invalid IPAH next type received          :0
    Invalid VRRP TTL value received          :0
    Invalid VRRP version received            :0
    Invalid VRRP PDU type received           :0
    Invalid VRRP authentication type received:0
    Invalid VRRP IP count received           :0
    Invalid VRRP checksum received           :0
root@vmx-01> monitor traffic interface irb.10 no-resolve
Jun 08 17:06:09
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Address resolution is OFF.
Listening on irb.10, capture size 96 bytes

17:06:10.924848 Out IP truncated-ip - 2 bytes missing! 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:11.894885 Out IP truncated-ip - 2 bytes missing! 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:12.879860 Out IP truncated-ip - 2 bytes missing! 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:13.769895 Out IP truncated-ip - 2 bytes missing! 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:14.630233 Out IP truncated-ip - 2 bytes missing! 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1

vMX-02

root@vmx-02> show vrrp interface irb.10
Jun 08 17:01:22
Interface: irb.10, Interface index :332, Groups: 1, Active :1
  Interface VRRP PDU statistics
    Advertisement sent                       :0
    Advertisement received                   :17
    Packets received                         :17
    No group match received                  :0
  Interface VRRP PDU error statistics
    Invalid IPAH next type received          :0
    Invalid VRRP TTL value received          :0
    Invalid VRRP version received            :0
    Invalid VRRP PDU type received           :0
    Invalid VRRP authentication type received:0
    Invalid VRRP IP count received           :0
    Invalid VRRP checksum received           :0
root@vmx-02> monitor traffic interface irb.10 no-resolve
Jun 08 17:06:37
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Address resolution is OFF.
Listening on irb.10, capture size 96 bytes

17:06:37.375229  In IP 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:38.219602  In IP 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:39.015535  In IP 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:39.785005  In IP 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:40.543012  In IP 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1

The timestamps differ because the commands were run at different times, but the output above shows that vMX-01 is the router sending the advertisements and vMX-02 is receiving them. In the monitor traffic output, the advertisements on vMX-01 are shown as Out, whereas vMX-02 shows them as In. Notice that neither router does the other's job: vMX-01 does not receive, and vMX-02 does not send. If you see both routers sending, there is an issue with the mastership process.

The default advertisement interval is 1 second, meaning the primary sends an advertisement message every second, which the backup router receives.
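
If you would rather not eyeball the capture, here is a minimal Python sketch that counts advertisements per direction from monitor traffic lines like the ones above. The function name count_directions and the regular expression are my own and only assume the line layout seen in this lab, so treat it as a starting point rather than a general parser.

```python
import re
from collections import Counter

# Matches the direction and VRRP fields of a `monitor traffic` line as seen in this lab.
ADVERT_RE = re.compile(
    r"^\S+\s+(?P<direction>In|Out)\s+IP\b.*?"
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > 224\.0\.0\.18: VRRPv2-advertisement"
    r".*?vrid=(?P<vrid>\d+) prio=(?P<prio>\d+)"
)

def count_directions(capture_text):
    """Count VRRP advertisements per (direction, source address, group)."""
    counts = Counter()
    for line in capture_text.splitlines():
        match = ADVERT_RE.match(line.strip())
        if match:
            key = (match["direction"], match["src"], int(match["vrid"]))
            counts[key] += 1
    return counts

# One line from the vMX-01 capture and two from the vMX-02 capture.
sample = """\
17:06:10.924848 Out IP truncated-ip - 2 bytes missing! 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:37.375229  In IP 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
17:06:38.219602  In IP 10.0.100.2 > 224.0.0.18: VRRPv2-advertisement 20: vrid=1 prio=254 authtype=none intvl=1
"""
for key, total in sorted(count_directions(sample).items()):
    print(key, total)
```

On a healthy pair, the primary's capture should only produce Out entries and the backup's capture only In entries for the same source address.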

vMX-01

root@vmx-01> show vrrp detail
Jun 05 13:46:39
Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 330, SNMP ifIndex: 548, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1
  Advertisement Timer: 0.316s, Master router: 10.0.100.2
  Virtual router uptime: 12:25:34, Master router uptime: 00:00:16
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes
  Tracking: enabled
    Current priority: 254, Configured priority: 254
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         up                 2g                       0
    Route tracking: disabled
    BFD tracking: disabled

Let's run the same command a couple of times and watch the advertisement timer change to confirm:

root@vmx-01> show vrrp detail | match "Advertisement"
Jun 05 13:58:06
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Advertisement Timer: 0.064s, Master router: 10.0.100.2

root@vmx-01> show vrrp detail | match "Advertisement"
Jun 05 13:58:07
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Advertisement Timer: 0.652s, Master router: 10.0.100.2

Advertisement Threshold

The advertisement threshold is the number of consecutive advertisements from the master router that the backup router can miss before it takes over. Together with the advertisement interval, it determines the failover time. The default threshold is three, so with the default advertisement interval of 1 second, it will be about 3 seconds before the backup router becomes the primary.

vMX-01

root@vmx-01> show vrrp detail
Jun 05 13:46:39
Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 330, SNMP ifIndex: 548, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1
  Advertisement Timer: 0.316s, Master router: 10.0.100.2
  Virtual router uptime: 12:25:34, Master router uptime: 00:00:16
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes
  Tracking: enabled
    Current priority: 254, Configured priority: 254
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         up                 2g                       0
    Route tracking: disabled
    BFD tracking: disabled
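
To put a number on the interval and threshold fields shown above, here is a small Python sketch based on the VRRPv2 specification (RFC 3768), which defines the backup's wait as 3 × Advertisement_Interval + Skew_Time, with Skew_Time = (256 - priority) / 256. The function name and the threshold parameter are my own. Generalising the fixed 3 into a parameter is an assumption made to mirror the Junos field, and Junos may compute its internal timer differently.

```python
def master_down_interval(advert_interval_s, priority, threshold=3):
    """Seconds a backup waits without advertisements before taking over.

    RFC 3768 uses a fixed multiplier of 3; `threshold` is an assumed
    generalisation that mirrors the Junos "Advertisement threshold" field.
    """
    skew_time = (256 - priority) / 256  # higher priority means a shorter skew
    return threshold * advert_interval_s + skew_time

# vMX-02 is the backup in this lab: priority 253, 1-second interval.
print(round(master_down_interval(1, 253), 3))  # 3.012
```

The skew is tiny at these priorities, which is why the failover time is usually quoted as simply 3 seconds.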

Taking a step back to our earlier failover testing, let’s take a look and make sure that is the case.

  • We disabled ge-0/0/0 and ge-0/0/3 on vMX-01 at 12:59:56
root@vmx-01> edit
Jun 05 12:59:43
Entering configuration mode

[edit]
root@vmx-01# edit interfaces
Jun 05 12:59:45

[edit interfaces]
root@vmx-01# set ge-0/0/0 disable
Jun 05 12:59:49

[edit interfaces]
root@vmx-01# set ge-0/0/3 disable
Jun 05 12:59:53

[edit interfaces]
root@vmx-01# commit
Jun 05 12:59:56
commit complete
  • We then checked the logs on vMX-01 to confirm that it gave up group ownership and became the backup at 12:59:59
root@vmx-01# run show log messages | match vrrpd
Jun 05 13:02:01
Jun  5 12:59:59  vmx-01 vrrpd[6626]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1

Comparing the time the commit completed (12:59:56) with the log message recording the role change (12:59:59) gives a failover time of 3 seconds, which is what we would expect from an advertisement interval of 1 second and an advertisement threshold of 3.
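
The same arithmetic as a quick Python sketch, using the two timestamps from the transcript above. The helper name seconds_between is my own, and the CLI timestamps only have one-second resolution, so this is a rough check rather than a precise measurement.

```python
from datetime import datetime

def seconds_between(start, end, fmt="%H:%M:%S"):
    """Whole seconds elapsed between two same-day clock times."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

advert_interval_s = 1  # "Advertisement interval: 1" in show vrrp detail
threshold = 3          # "Advertisement threshold: 3" in show vrrp detail

observed = seconds_between("12:59:56", "12:59:59")  # commit complete -> log entry
expected = advert_interval_s * threshold
print(observed, expected)  # 3 3
```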

Failover Delay

For the next section I’m going to lift some information directly from Juniper, because their step-by-step explanation of the failover delay process is better left as written. I’ll put the link at the bottom.

A fast failover requires a short delay. Thus, the failover-delay stanza delays the new primary sending gratuitous ARP replies for the period set. This allows the new primary to send the ARP replies for all of the VRRP groups simultaneously instead of one-by-one.

Junos supports a range of 50 through 100000 milliseconds for delay in failover time.

Because we have configured only the one group for testing this does not matter too much to us. In production, you’re going to have multiple interfaces and groups configured for VRRP.

Depending on the size of your environment and its fault tolerance, you may want to set a failover-delay timer.

Let’s set our failover-delay timer to the maximum of 100000 ms (100 s):

vMX-01

root@vmx-01# set protocols vrrp failover-delay ?
Jun 05 16:09:13
Possible completions:
  <failover-delay>     Additional failover delay timer (50..100000 milliseconds)
[edit]
root@vmx-01# set protocols vrrp failover-delay 100000
Jun 05 16:09:25

[edit]
root@vmx-01# commit

Because the new primary has to send gratuitous ARP replies on behalf of each group, if you increase the number of groups and change the advertisement timer, traffic will take longer to fail over cleanly.

This is because the ARP tables of other network devices that route through these routers may not be updated in time.

If a device that does not receive the gratuitous ARP reply attempts to forward traffic, that traffic will be dropped until the device receives the new ARP reply and learns the MAC of the new primary platform.

The VRRP process (vrrpd) running on the Routing Engine communicates a VRRP primary role change to the Packet Forwarding Engine for every VRRP session.

Each VRRP group can trigger such communication to update the Packet Forwarding Engine with its own state or the state inherited from an active VRRP group.

To avoid overloading the Packet Forwarding Engine with such messages, you configure the failover-delay to specify the time between subsequent Routing Engine to Packet Forwarding Engine communications.

The following sections elaborate the Routing Engine to Packet Forwarding Engine communication in two scenarios:

  • When failover-delay is not configured
  • When failover-delay is configured

When Failover Delay is not Configured:

Without failover-delay configured, the sequence of events for VRRP sessions operated from the Routing Engine is as follows:

  1. When the first VRRP group detected by the Routing Engine changes state, and the new state is primary, the Routing Engine generates appropriate VRRP announcement messages. The Packet Forwarding Engine is informed about the state change, so that hardware filters for that group are reprogrammed without delay. The new primary then sends gratuitous ARP message to the VRRP groups.
  2. The delay in failover timer starts. By default, failover-delay timer is:
    • 500 milliseconds when the configured VRRP announcement interval is less than 1 second.
    • 2 seconds when the configured VRRP announcement interval is 1 second or more, and the total number of VRRP groups on the router is 255.
    • 10 seconds when the configured VRRP announcement interval is 1 second or more, and the number of VRRP groups on the router is more than 255.
  3. The Routing Engine performs one-by-one state change for subsequent VRRP groups. Every time there is a state change, and the new state for a particular VRRP group is primary, the Routing Engine generates appropriate VRRP announcement messages. However, communication toward the Packet Forwarding Engine is suppressed until the failover-delay timer expires.
  4. After failover-delay timer expires, the Routing Engine sends messages to the Packet Forwarding Engine about all VRRP groups that managed to change the state. As a consequence, hardware filters for those groups are reprogrammed, and for those groups whose new state is primary, gratuitous ARP messages are sent.

This process repeats until state transition for all VRRP groups is complete.

Thus, without configuring failover-delay, the full state transition (including states on the Routing Engine and the Packet Forwarding Engine) for the first VRRP group is performed immediately, while state transition on the Packet Forwarding Engine for remaining VRRP groups is delayed by at least 0.5-10 seconds, depending on the configured VRRP announcement timers and the number of VRRP groups.
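
The three default cases listed above can be sketched as a small Python function. The function name is my own, and I am reading the "is 255" case as "255 or fewer" groups, which is an assumption on my part rather than something the Juniper text states.

```python
def default_failover_delay_ms(advert_interval_s, group_count):
    """Default failover-delay in milliseconds for the three documented cases.

    Assumption: the "total number of VRRP groups is 255" case is treated
    as "255 or fewer" groups.
    """
    if advert_interval_s < 1:
        return 500
    if group_count <= 255:
        return 2_000
    return 10_000

print(default_failover_delay_ms(1, 2))    # 2000 (this lab: 1 s interval, 2 groups)
print(default_failover_delay_ms(0.5, 2))  # 500
print(default_failover_delay_ms(1, 300))  # 10000
```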

During this intermediate state, receiving traffic for VRRP groups for state changes that were not yet completed on the Packet Forwarding Engine might be dropped at the Packet Forwarding Engine level due to deferred reconfiguration of hardware filters.

When Failover Delay is Configured:

When failover-delay is configured, the sequence of events for VRRP sessions operated from the Routing Engine is modified as follows:

  1. The Routing Engine detects that some VRRP groups require a state change.
  2. The failover-delay starts for the period configured. The allowed failover-delay timer range is 50 through 100000 milliseconds.
  3. The Routing Engine performs one-by-one state change for the VRRP groups. Every time there is a state change, and the new state for a particular VRRP group is primary, the Routing Engine generates appropriate VRRP announcement messages. However, communication toward the Packet Forwarding Engine is suppressed until the failover-delay timer expires.
  4. After failover-delay timer expires, the Routing Engine sends message to the Packet Forwarding Engine about all VRRP groups that managed to change the state. As a consequence, hardware filters for those groups are reprogrammed, and for those groups whose new state is primary, gratuitous ARP messages are sent.

This process repeats until state transition for all VRRP groups is complete.

Thus, when failover-delay is configured even the Packet Forwarding Engine state for the first VRRP group is deferred. However, the network operator has the advantage of configuring a failover-delay value that best suits the need of the network deployment to ensure minimal outage during VRRP state change.

The failover-delay stanza influences only VRRP sessions operated by the VRRP process (vrrpd) running on the Routing Engine. For VRRP sessions distributed to the Packet Forwarding Engine, failover-delay configuration has no effect.

For the most part, those are the basics of manipulating VRRP. There are a few other configuration options that you can choose depending on preference, such as preempt, authentication, and route-tracking.

Virtual MAC Address and ARP Table

What does the ARP table on our downstream switch look like for the two routers participating in VRRP? The switch is configured with a single ae interface connected to each virtual MX.

In VRRP, the MAC addresses representing the VIP range from 00:00:5e:00:01:00 through 00:00:5e:00:01:ff. The VRRP group number must be the decimal equivalent of the last hexadecimal byte of the virtual MAC address.

What does that mean?

Well we configured our VRRP group 1 earlier, and if you take a look at the output below, you’ll see the following Virtual MAC entry: 00:00:5e:00:01:01.

vMX-01

root@vmx-01> show vrrp detail 
Jun 07 03:51:02
Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 330, SNMP ifIndex: 524, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1         
  Advertisement Timer: 0.683s, Master router: 10.0.100.2
  Virtual router uptime: 00:32:32, Master router uptime: 00:10:26
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes 
  Tracking: enabled 
    Current priority: 254, Configured priority: 254 
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1  
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         up                 2g                       0
    Route tracking: disabled
    BFD tracking: disabled

Now if we were to configure a second group with a group ID of 2, we should see the following virtual MAC: 00:00:5e:00:01:02.

vMX-01

set interfaces irb unit 20 family inet address 10.0.200.2/24 arp 10.0.200.3 l2-interface ge-0/0/2.0
set interfaces irb unit 20 family inet address 10.0.200.2/24 vrrp-group 2 virtual-address 10.0.200.1
set interfaces irb unit 20 family inet address 10.0.200.2/24 vrrp-group 2 priority 254
set interfaces irb unit 20 family inet address 10.0.200.2/24 vrrp-group 2 accept-data
set interfaces irb unit 20 family inet address 10.0.200.2/24 vrrp-group 2 track interface ae0 priority-cost 10
set bridge-domains bd20 domain-type bridge
set bridge-domains bd20 vlan-id 20
set bridge-domains bd20 service-id 20
set bridge-domains bd20 routing-interface irb.20

vMX-02

set interfaces irb unit 20 family inet address 10.0.200.3/24 arp 10.0.200.2 l2-interface ge-0/0/2.0
set interfaces irb unit 20 family inet address 10.0.200.3/24 vrrp-group 2 virtual-address 10.0.200.1
set interfaces irb unit 20 family inet address 10.0.200.3/24 vrrp-group 2 priority 253
set interfaces irb unit 20 family inet address 10.0.200.3/24 vrrp-group 2 accept-data
set interfaces irb unit 20 family inet address 10.0.200.3/24 vrrp-group 2 track interface ae0 priority-cost 10
set bridge-domains bd20 domain-type bridge
set bridge-domains bd20 vlan-id 20
set bridge-domains bd20 service-id 20
set bridge-domains bd20 routing-interface irb.20

Here’s our first command again and we can see this time there is a new entry in the output for the second group:

vMX-01

root@vmx-01> show vrrp detail 
Jun 07 03:52:57
Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 330, SNMP ifIndex: 524, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1         
  Advertisement Timer: 0.683s, Master router: 10.0.100.2
  Virtual router uptime: 00:32:32, Master router uptime: 00:10:26
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes 
  Tracking: enabled 
    Current priority: 254, Configured priority: 254 
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1  
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         up                 2g                       0
    Route tracking: disabled
    BFD tracking: disabled

Physical interface: irb, Unit: 20, Address: 10.0.200.2/24
  Index: 328, SNMP ifIndex: 556, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 2, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.200.1         
  Advertisement Timer: 0.155s, Master router: 10.0.200.2
  Virtual router uptime: 00:16:53, Master router uptime: 00:10:26
  Virtual Mac: 00:00:5e:00:01:02
  Preferred: yes 
  Tracking: enabled 
    Current priority: 254, Configured priority: 254 
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1  
      Interface     Int state   Int speed   Incurred priority cost
      ae0.0         up                 2g                       0
    Route tracking: disabled
    BFD tracking: disabled
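
The group-to-MAC mapping seen in the two Virtual Mac lines above can be sketched in a few lines of Python. The function name vrrp_virtual_mac is my own, and the 0 through 255 bounds simply follow the MAC range quoted earlier in this section.

```python
def vrrp_virtual_mac(group_id):
    """Return the IPv4 VRRP virtual MAC address for a group number."""
    if not 0 <= group_id <= 255:
        raise ValueError("VRRP group must fit in one byte (0-255)")
    # The last byte of the MAC is the group number written in hexadecimal.
    return f"00:00:5e:00:01:{group_id:02x}"

print(vrrp_virtual_mac(1))    # 00:00:5e:00:01:01
print(vrrp_virtual_mac(2))    # 00:00:5e:00:01:02
print(vrrp_virtual_mac(255))  # 00:00:5e:00:01:ff
```

Groups 1 and 2 map to the same two addresses that show vrrp detail reports, and group 255 shows where decimal and hexadecimal diverge.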

We haven’t yet taken a look at our downstream switch vEX-01 for what the MAC table looks like there. Since a virtual MAC address is used in VRRP, we should see both of those in the switching table originating from vMX-01. The first output is when we only configured the one VRRP group:

vEX-01

root@vEX-01> show ethernet-switching table    
Jun 07 07:37:35

MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static, C - Control MAC
           SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC
           GBP - group based policy, B - Blocked MAC)


Ethernet switching table : 2 entries, 2 learned
Routing instance : default-switch
    Vlan                MAC                 MAC         Age   GBP     Logical                NH        MAC        RTR
    name                address             flags             Tag     interface              Index     property   ID
    bd10                00:00:5e:00:01:01   D             -           ae1.0                  0                    0       

Here’s the same table after creating the second VRRP group:

vEX-01

root@vEX-01> show ethernet-switching table    
Jun 07 07:45:35

MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static, C - Control MAC
           SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC
           GBP - group based policy, B - Blocked MAC)


Ethernet switching table : 3 entries, 3 learned
Routing instance : default-switch
    Vlan                MAC                 MAC         Age   GBP     Logical                NH        MAC        RTR
    name                address             flags             Tag     interface              Index     property   ID
    bd20                00:00:5e:00:01:02   D             -           ae1.0                  0                    0       
    bd10                00:00:5e:00:01:01   D             -           ae1.0                  0                    0       

Because vMX-01 is the primary VRRP member right now, we only see the ae1 logical interface in the switching table. If we fail over to vMX-02, we should see the interface change to ae2:

vMX-01

root@vmx-01> configure 
Jun 07 14:36:39
Entering configuration mode

[edit]
root@vmx-01# edit interfaces ae0 
Jun 07 14:36:42

[edit interfaces ae0]
root@vmx-01# set disable 
Jun 07 14:36:45

[edit interfaces ae0]
root@vmx-01# commit 
Jun 07 14:36:47
commit complete
root@vmx-01> show vrrp 
Jun 07 14:37:38
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   backup   Active      D  2.176 lcl    10.0.100.2     
                                                                vip    10.0.100.1     
                                                                mas    10.0.100.3     
irb.20        up              2   backup   Active      D  2.308 lcl    10.0.200.2     
                                                                vip    10.0.200.1     
                                                                mas    10.0.200.3
root@vmx-01> show log messages | match vrrpd 
Jun 07 14:38:14
Jun  7 14:36:53  vmx-01 vrrpd[6557]: VRRPD_NEW_BACKUP: Interface irb.20 (local address 10.0.200.2) became VRRP backup for group 2
Jun  7 14:36:53  vmx-01 vrrpd[6557]: VRRPD_NEW_BACKUP: Interface irb.10 (local address 10.0.100.2) became VRRP backup for group 1

vMX-02

root@vmx-02> show vrrp 
Jun 07 14:37:50
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.132 lcl    10.0.100.3     
                                                                vip    10.0.100.1     
irb.20        up              2   master   Active      A  0.346 lcl    10.0.200.3     
                                                                vip    10.0.200.1
root@vmx-02> show log messages | match vrrpd 
Jun 07 14:38:42
Jun  7 14:36:53  vmx-02 vrrpd[6570]: VRRPD_NEW_MASTER: Interface irb.20 (local address 10.0.200.3) became VRRP master for group 2 with master reason notMaster
Jun  7 14:36:53  vmx-02 vrrpd[6570]: VRRPD_NEW_MASTER: Interface irb.10 (local address 10.0.100.3) became VRRP master for group 1 with master reason notMaster

vEX-01

root@vEX-01> show ethernet-switching table    
Jun 07 18:39:14

MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static, C - Control MAC
           SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC
           GBP - group based policy, B - Blocked MAC)


Ethernet switching table : 3 entries, 3 learned
Routing instance : default-switch
    Vlan                MAC                 MAC         Age   GBP     Logical                NH        MAC        RTR
    name                address             flags             Tag     interface              Index     property   ID
    bd10                00:00:5e:00:01:01   D             -           ae2.0                  0                    0            
    bd20                00:00:5e:00:01:02   D             -           ae2.0                  0                    0
root@vEX-01> show ethernet-switching mac-learning-log 
Jun 07 18:39:33
Sat Jun 07 18:36:52 vlan_name bd20+20 mac 00:00:5e:00:01:02 was moved from ae1.0 to ae2.0 with flags: 0x2101f
Sat Jun 07 18:36:52 vlan_name bd10+10 mac 00:00:5e:00:01:01 was moved from ae1.0 to ae2.0 with flags: 0x2101f

We can see that the virtual MAC addresses for both groups moved to the new interface at the same time, 18:36:52, which is a result of the gratuitous ARP sent once the VRRP primary changes.
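
If you have many groups, checking the mac-learning-log by eye gets tedious. Here is a minimal Python sketch that pulls the MAC moves out of log lines like the ones above. The names MOVE_RE and mac_moves are my own, and the pattern only assumes the line format shown in this lab.

```python
import re

# Matches a "was moved from X to Y" entry as printed by the mac-learning-log above.
MOVE_RE = re.compile(
    r"vlan_name (?P<vlan>\S+) mac (?P<mac>[0-9a-f:]{17}) "
    r"was moved from (?P<old>\S+) to (?P<new>\S+)"
)

def mac_moves(log_text):
    """Return (mac, old_interface, new_interface) for every move in the log."""
    return [(m["mac"], m["old"], m["new"]) for m in MOVE_RE.finditer(log_text)]

sample_log = """\
Sat Jun 07 18:36:52 vlan_name bd20+20 mac 00:00:5e:00:01:02 was moved from ae1.0 to ae2.0 with flags: 0x2101f
Sat Jun 07 18:36:52 vlan_name bd10+10 mac 00:00:5e:00:01:01 was moved from ae1.0 to ae2.0 with flags: 0x2101f
"""
moves = mac_moves(sample_log)
for mac, old, new in moves:
    print(mac, old, "->", new)

# After a clean failover every virtual MAC should land on the same interface.
print(len({new for _, _, new in moves}) == 1)  # True
```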

Mastership Errors

A scenario can exist where both routers think they are the VRRP master, and it looks something like this:

vMX-01

root@vmx-01> show vrrp
Jun 08 13:00:04
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.794 lcl    10.0.100.2
                                                                vip    10.0.100.1
irb.20        up              2   master   Active      A  0.249 lcl    10.0.200.2
                                                                vip    10.0.200.1

vMX-02

root@vmx-02> show vrrp
Jun 08 13:00:43
Interface     State       Group   VR state VR Mode   Timer    Type   Address
irb.10        up              1   master   Active      A  0.695 lcl    10.0.100.3
                                                                vip    10.0.100.1
irb.20        up              2   master   Active      A  0.131 lcl    10.0.200.3
                                                                vip    10.0.200.1

Remember, only the primary should be sending advertisement messages. Let's confirm by taking a look at the statistics:

vMX-01

root@vmx-01> clear vrrp all
Jun 08 13:05:06
vrrpd: cleared stats: irb.10
vrrpd: cleared stats: irb.20
root@vmx-01> show vrrp interface irb.10
Jun 08 13:07:25
Interface: irb.10, Interface index :332, Groups: 1, Active :1
  Interface VRRP PDU statistics
    Advertisement sent                       :157
    Advertisement received                   :0
    Packets received                         :0
    No group match received                  :0
  Interface VRRP PDU error statistics
    Invalid IPAH next type received          :0
    Invalid VRRP TTL value received          :0
    Invalid VRRP version received            :0
    Invalid VRRP PDU type received           :0
    Invalid VRRP authentication type received:0
    Invalid VRRP IP count received           :0
    Invalid VRRP checksum received           :0

Physical interface: irb, Unit: 10, Address: 10.0.100.2/24
  Index: 332, SNMP ifIndex: 524, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1
  Advertisement Timer: 0.019s, Master router: 10.0.100.2
  Virtual router uptime: 00:15:04, Master router uptime: 00:14:59
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes
  Tracking: enabled
    Current priority: 254, Configured priority: 254
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Tracked interface: ae0.0
        Interface state: up Speed: 2g
        Incurred priority cost: 0
        Threshold   Priority cost   Active
        down                   10
    Route tracking: disabled
    BFD tracking: disabled
  Group VRRP PDU statistics
    Advertisement sent                       :157
    Advertisement received                   :0
  Group VRRP PDU error statistics
    Bad authentication Type received         :0
    Bad password received                    :0
    Bad MD5 digest received                  :0
    Bad advertisement timer received         :0
    Bad VIP count received                   :0
    Bad VIPADDR received                     :0
  Group state transition statistics
    Idle to master transitions               :0
    Idle to backup transitions               :0
    Backup to master transitions             :0
    Master to backup transitions             :0
root@vmx-01> show vrrp interface irb.20
Jun 08 13:07:36
Interface: irb.20, Interface index :333, Groups: 1, Active :1
  Interface VRRP PDU statistics
    Advertisement sent                       :173
    Advertisement received                   :0
    Packets received                         :0
    No group match received                  :0
  Interface VRRP PDU error statistics
    Invalid IPAH next type received          :0
    Invalid VRRP TTL value received          :0
    Invalid VRRP version received            :0
    Invalid VRRP PDU type received           :0
    Invalid VRRP authentication type received:0
    Invalid VRRP IP count received           :0
    Invalid VRRP checksum received           :0

Physical interface: irb, Unit: 20, Address: 10.0.200.2/24
  Index: 333, SNMP ifIndex: 556, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 2, State: master, VRRP Mode: Active
  Priority: 254, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.200.1
  Advertisement Timer: 0.735s, Master router: 10.0.200.2
  Virtual router uptime: 00:15:16, Master router uptime: 00:15:11
  Virtual Mac: 00:00:5e:00:01:02
  Preferred: yes
  Tracking: enabled
    Current priority: 254, Configured priority: 254
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Tracked interface: ae0.0
        Interface state: up Speed: 2g
        Incurred priority cost: 0
        Threshold   Priority cost   Active
        down                   10
    Route tracking: disabled
    BFD tracking: disabled
  Group VRRP PDU statistics
    Advertisement sent                       :173
    Advertisement received                   :0
  Group VRRP PDU error statistics
    Bad authentication Type received         :0
    Bad password received                    :0
    Bad MD5 digest received                  :0
    Bad advertisement timer received         :0
    Bad VIP count received                   :0
    Bad VIPADDR received                     :0
  Group state transition statistics
    Idle to master transitions               :0
    Idle to backup transitions               :0
    Backup to master transitions             :0
    Master to backup transitions             :0

vMX-02

root@vmx-02> clear vrrp all
Jun 08 13:05:02
vrrpd: cleared stats: irb.10
vrrpd: cleared stats: irb.20
root@vmx-02> show vrrp interface irb.10
Jun 08 13:07:57
Interface: irb.10, Interface index :332, Groups: 1, Active :1
  Interface VRRP PDU statistics
    Advertisement sent                       :198
    Advertisement received                   :0
    Packets received                         :0
    No group match received                  :0
  Interface VRRP PDU error statistics
    Invalid IPAH next type received          :0
    Invalid VRRP TTL value received          :0
    Invalid VRRP version received            :0
    Invalid VRRP PDU type received           :0
    Invalid VRRP authentication type received:0
    Invalid VRRP IP count received           :0
    Invalid VRRP checksum received           :0

Physical interface: irb, Unit: 10, Address: 10.0.100.3/24
  Index: 332, SNMP ifIndex: 548, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 253, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.100.1
  Advertisement Timer: 0.036s, Master router: 10.0.100.3
  Virtual router uptime: 00:15:37, Master router uptime: 00:15:32
  Virtual Mac: 00:00:5e:00:01:01
  Preferred: yes
  Tracking: enabled
    Current priority: 253, Configured priority: 253
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Tracked interface: ae0.0
        Interface state: up Speed: 2g
        Incurred priority cost: 0
        Threshold   Priority cost   Active
        down                   10
    Route tracking: disabled
    BFD tracking: disabled
  Group VRRP PDU statistics
    Advertisement sent                       :198
    Advertisement received                   :0
  Group VRRP PDU error statistics
    Bad authentication Type received         :0
    Bad password received                    :0
    Bad MD5 digest received                  :0
    Bad advertisement timer received         :0
    Bad VIP count received                   :0
    Bad VIPADDR received                     :0
  Group state transition statistics
    Idle to master transitions               :0
    Idle to backup transitions               :0
    Backup to master transitions             :0
    Master to backup transitions             :0
root@vmx-02> show vrrp interface irb.20
Jun 08 13:08:17
Interface: irb.20, Interface index :333, Groups: 1, Active :1
  Interface VRRP PDU statistics
    Advertisement sent                       :224
    Advertisement received                   :0
    Packets received                         :0
    No group match received                  :0
  Interface VRRP PDU error statistics
    Invalid IPAH next type received          :0
    Invalid VRRP TTL value received          :0
    Invalid VRRP version received            :0
    Invalid VRRP PDU type received           :0
    Invalid VRRP authentication type received:0
    Invalid VRRP IP count received           :0
    Invalid VRRP checksum received           :0

Physical interface: irb, Unit: 20, Address: 10.0.200.3/24
  Index: 333, SNMP ifIndex: 559, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 2, State: master, VRRP Mode: Active
  Priority: 253, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 1, VIP: 10.0.200.1
  Advertisement Timer: 0.894s, Master router: 10.0.200.3
  Virtual router uptime: 00:15:57, Master router uptime: 00:15:52
  Virtual Mac: 00:00:5e:00:01:02
  Preferred: yes
  Tracking: enabled
    Current priority: 253, Configured priority: 253
    Priority hold time: disabled
    Interface tracking: enabled, Interface count: 1
      Tracked interface: ae0.0
        Interface state: up Speed: 2g
        Incurred priority cost: 0
        Threshold   Priority cost   Active
        down                   10
    Route tracking: disabled
    BFD tracking: disabled
  Group VRRP PDU statistics
    Advertisement sent                       :224
    Advertisement received                   :0
  Group VRRP PDU error statistics
    Bad authentication Type received         :0
    Bad password received                    :0
    Bad MD5 digest received                  :0
    Bad advertisement timer received         :0
    Bad VIP count received                   :0
    Bad VIPADDR received                     :0
  Group state transition statistics
    Idle to master transitions               :0
    Idle to backup transitions               :0
    Backup to master transitions             :0
    Master to backup transitions             :0

On each interface we can see that advertisements are being sent from both sides, but none are being received. I’ve linked a good troubleshooting blog post below that goes into more detail than I do; it’s worth checking out if you run into errors.
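
If you have more than a couple of interfaces to check, reading these counters by eye gets tedious. Below is a minimal sketch, assuming you have copied the `show vrrp interface` output into a string, that pulls out the interface-level advertisement counters and lists any interface that is sending advertisements but has received none. The function names and the `SAMPLE` text are my own illustration, not part of any Juniper tooling, and the parsing only assumes the line layout shown in the output above.

```python
import re

# Hypothetical helpers for illustration only: they scan text copied from the
# CLI and do not talk to the router.
INTERFACE_RE = re.compile(r"^Interface: (\S+),")
COUNTER_RE = re.compile(r"^\s*Advertisement (sent|received)\s*:\s*(\d+)\s*$")


def advertisement_counters(cli_text):
    """Return {interface: {"sent": n, "received": n}} from pasted CLI output."""
    counters = {}
    current = None
    for line in cli_text.splitlines():
        header = INTERFACE_RE.match(line)
        if header:
            current = header.group(1)
            counters[current] = {}
            continue
        match = COUNTER_RE.match(line)
        # Keep only the first sent/received pair after each header (the
        # interface-level block); the group-level block repeats the names.
        if match and current is not None and match.group(1) not in counters[current]:
            counters[current][match.group(1)] = int(match.group(2))
    return counters


def one_way_interfaces(counters):
    """List interfaces that send advertisements but have received none."""
    return [
        name
        for name, c in counters.items()
        if c.get("sent", 0) > 0 and c.get("received", 0) == 0
    ]


# Trimmed copy of the irb.20 output shown above.
SAMPLE = """\
Interface: irb.20, Interface index :333, Groups: 1, Active :1
  Interface VRRP PDU statistics
    Advertisement sent                       :224
    Advertisement received                   :0
    Packets received                         :0
  Group VRRP PDU statistics
    Advertisement sent                       :224
    Advertisement received                   :0
"""

if __name__ == "__main__":
    parsed = advertisement_counters(SAMPLE)
    print(parsed)
    print(one_way_interfaces(parsed))
```

This only flags the symptom. Working out why the advertisements are not arriving is still a job for the troubleshooting post linked below.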

Conclusion

There are more concepts to explore within VRRP, such as route tracking, preemption, adding new members to an existing VRRP group, and additional design considerations, but I may save those for a part two. I hope someone finds this useful. If there’s anything you’d change or suggest, or you just have a question, please leave a comment and I’ll try to respond.

Documentation Links and KBs

If I make a reference to anything or find something useful as I’m writing, I’ll link it below here:

https://datatracker.ietf.org/doc/html/rfc5798

https://www.juniper.net/documentation/us/en/software/junos/high-availability/topics/topic-map/vrrp-understanding.html#id-vrrp-failover-delay

https://www.juniper.net/documentation/us/en/software/junos/cli-reference/topics/ref/statement/failover-delay-qfx-series.html

https://supportportal.juniper.net/s/article/EX-Error-message-priority-cost-for-worst-case-must-not-exceed-configured-priority-for-vrrp-group

https://www.networkcuriosity.com/junos-vrrp-troubleshooting
