Researching the Behavior of FRR BGPd Bestpath Selection Partially

If you find the English below hard to read,
I recommend reading the corresponding Japanese article instead.

BGP has a mechanism called "bestpath selection": the decision process by which a BGP implementation chooses which route to actually install as a forwarding rule.
In BGP we sometimes receive several routes for the same prefix from different neighbors.
A BGP implementation picks one of them and installs it as the only forwarding rule for that prefix.

You may also know about multipath routing, which keeps multiple routes to the same prefix and uses all of them as forwarding rules.
That mechanism is closely related to bestpath selection, but this post won't describe it in detail.

For example, FRRouting documents an algorithm called Route Selection .
Today I'll try to understand this algorithm using Linux networking.
The route selection algorithm consists of 13 rules, but for now I'll only check the 1st, 2nd, and 3rd rules.

I may update this post someday to cover all the rules.

All the tinet configurations used in this post are available here.

Here's my runtime environment.

archey

Weight Check

The 1st rule is called Weight Check .
Before describing this rule, I'll check what happens when routes for the same prefix are advertised with different weight values,
and compare that with advertising them with equal weights.

So I'll assume a network like the one below.
The corresponding tinet specification is here.
I'll configure BGPd with redistribute connected on R2 and R3.
As you can see from the prefix-lists in the specification, R2 and R3 only advertise routes for 10.0.0.0/16 and 10.0.1.0/24.

Someday I'd also like to try advertising default routes with neighbor PEER default-originate from UPPER.

+-----------------------------------+
|                                   |
|                                   |
|                R1                 |
|             AS65001               |
|                                   |
|                                   |
| .1                            .2  |
+----------+-------------------+----+
           |    10.0.0.0/16    |
           |                   |
+----------+----------+   +----+---------------------+
| .251                |   | .252                     |
|         R2          |   |            R3            |
|       AS65002       |   |          AS65003         |
|                     |   |                          |
|                     |   |                          |
| .1                  |   | .2                       |
+--------------+------+   +---------+----------------+
               |  10.0.1.0/24       |
   +-----------+--------------------+---+
   | .251                      .252     |
   |                                    |
   |                 C1                 |
   |              AS65004               |
   |                                    |
   |                                    |
   |                                    |
   +------------------------------------+

Let's look at R1's configuration.

$ docker exec R1 vtysh -c 'sh run'
Building configuration...

Current configuration:
!
frr version 8.0
frr defaults traditional
hostname R1
log syslog informational
no ipv6 forwarding
service integrated-vtysh-config
!
interface net1
 ip address 10.0.0.1/16
!
router bgp 65001
 bgp router-id 1.1.1.1
 bgp bestpath as-path multipath-relax
 neighbor 10.0.0.251 remote-as 65002
 neighbor 10.0.0.252 remote-as 65003
 !
 address-family ipv4 unicast
  neighbor 10.0.0.251 route-map RMAP_LOWER in
  neighbor 10.0.0.252 route-map RMAP_LOWER in
 exit-address-family
!
ip prefix-list PLIST_LOWER seq 5 permit 10.0.1.0/24
!
route-map RMAP_LOWER permit 10
 match ip address prefix-list PLIST_LOWER
!
line vty
!
end

You can see that the route-map RMAP_LOWER limits the accepted prefixes to 10.0.1.0/24,
so R1 only accepts this prefix from R2 and R3.
This config helps reduce the number of FIB entries, because R1 doesn't need the 10.0.0.0/16 routes.
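As a side note, the matching behavior of such a prefix-list can be modeled roughly like this. This is a hypothetical Python sketch (not FRR code), assuming that an entry without le/ge only matches the exact prefix and length:

```python
import ipaddress

def prefix_list_permits(entries, prefix):
    """Minimal model of an FRR ip prefix-list: without `le`/`ge`,
    an entry matches only the exact prefix (network AND length)."""
    net = ipaddress.ip_network(prefix)
    for action, entry in entries:  # evaluated in sequence order
        if net == ipaddress.ip_network(entry):
            return action == "permit"
    return False  # implicit deny at the end

# R2/R3's advertised prefixes against R1's PLIST_LOWER
plist_lower = [("permit", "10.0.1.0/24")]
print(prefix_list_permits(plist_lower, "10.0.1.0/24"))  # True: accepted
print(prefix_list_permits(plist_lower, "10.0.0.0/16"))  # False: filtered out
```

So only 10.0.1.0/24 survives the inbound route-map on R1.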

Next, let's check C1's RIB/FIB.

$ docker exec C1 vtysh -c 'sh bgp ipv4 unicast'
BGP table version is 2, local router ID is 4.4.4.4, vrf id 0
Default local pref 100, local AS 65004
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*= 10.0.0.0/16      10.0.1.2                 0             0 65003 ?
*>                  10.0.1.1                 0             0 65002 ?

Displayed 1 routes and 2 total paths
$ docker exec C1 vtysh -c 'sh ip route'
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

B>* 10.0.0.0/16 [20/0] via 10.0.1.1, net2, weight 1, 00:00:14
  *                    via 10.0.1.2, net2, weight 1, 00:00:14
C>* 10.0.1.0/24 is directly connected, net2, 00:00:16

Multipath routing seems to be working correctly!
Now let's move on to the main subject:
setting a different weight value for each neighbor on C1.

$ docker exec C1 vtysh -c 'sh run'
Building configuration...

Current configuration:
!
frr version 8.0
frr defaults traditional
hostname C1
log syslog informational
no ipv6 forwarding
service integrated-vtysh-config
!
interface net2
 ip address 10.0.1.254/24
!
router bgp 65004
 bgp router-id 4.4.4.4
 bgp bestpath as-path multipath-relax
 neighbor 10.0.1.1 remote-as 65002
 neighbor 10.0.1.2 remote-as 65003
 !
 address-family ipv4 unicast
  neighbor 10.0.1.1 route-map RMAP_UPPER1 in
  neighbor 10.0.1.2 route-map RMAP_UPPER2 in
 exit-address-family
!
ip prefix-list PLIST_UPPER seq 5 permit 10.0.0.0/16
!
route-map RMAP_UPPER1 permit 10
 match ip address prefix-list PLIST_UPPER
 set weight 10
!
route-map RMAP_UPPER2 permit 10
 match ip address prefix-list PLIST_UPPER
 set weight 20
!
line vty
!
end

How does this change C1's RIB/FIB?

$ docker exec C1 vtysh -c 'sh bgp ipv4 unicast'
BGP table version is 2, local router ID is 4.4.4.4, vrf id 0
Default local pref 100, local AS 65004
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.0.0/16      10.0.1.2                 0            20 65003 ?
*                   10.0.1.1                 0            10 65002 ?

Displayed 1 routes and 2 total paths
$ docker exec C1 vtysh -c 'sh ip route'
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

B>* 10.0.0.0/16 [20/0] via 10.0.1.2, net2, weight 1, 00:01:10
C>* 10.0.1.0/24 is directly connected, net2, 00:01:12

Now we can see the behavior of the neighbor weight!
If routes for the same prefix are advertised with different weights, a BGP speaker prefers the route from the neighbor with the highest weight.
Note that there is no difference between the two paths except the weight.
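The behavior above can be sketched in a few lines of Python. This is a hypothetical model of rule 1, not FRR's actual implementation, assuming all attributes other than weight are equal:

```python
# A minimal sketch of bestpath rule 1 (Weight Check): the path with
# the highest weight wins, before any other rule is considered.
def best_by_weight(paths):
    """Return the path with the highest weight (higher wins)."""
    return max(paths, key=lambda p: p["weight"])

# The two paths C1 received, with the weights set by the route-maps
paths = [
    {"neighbor": "10.0.1.1", "as_path": [65002], "weight": 10},
    {"neighbor": "10.0.1.2", "as_path": [65003], "weight": 20},
]
print(best_by_weight(paths)["neighbor"])  # 10.0.1.2, as in C1's BGP table
```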

The weighting only takes effect at C1, so R1 still forwards traffic for 10.0.1.0/24 via both R2 and R3.

Local Preference Check

Next, I'll check the behavior of the Local Preference Check.
It involves the LOCAL_PREF path attribute, which is used in iBGP.

I'll assume another network, slightly different from the previous one:
this network uses the lo interface for peering between iBGP speakers instead of veth.
To keep things simple, the routes for the lo addresses are configured statically.
The corresponding tinet specification is here.

Let's look at the route-maps on R2/R3.
The LOCAL_PREF attribute is exchanged within an entire AS, so the local-preference values flow into C1 from R2 and R3.

$ docker exec R2 vtysh -c 'sh route-map RMAP_UPPER'
ZEBRA:
route-map: RMAP_UPPER Invoked: 0 Optimization: enabled Processed Change: false
 permit, sequence 10 Invoked 0
  Match clauses:
    ip address prefix-list PLIST_UPPER
  Set clauses:
  Call clause:
  Action:
    Exit routemap
BGP:
route-map: RMAP_UPPER Invoked: 12 Optimization: enabled Processed Change: false
 permit, sequence 10 Invoked 6
  Match clauses:
    ip address prefix-list PLIST_UPPER
  Set clauses:
    local-preference 200
  Call clause:
  Action:
    Exit routemap
$ docker exec R3 vtysh -c 'sh route-map RMAP_UPPER'
ZEBRA:
route-map: RMAP_UPPER Invoked: 0 Optimization: enabled Processed Change: false
 permit, sequence 10 Invoked 0
  Match clauses:
    ip address prefix-list PLIST_UPPER
  Set clauses:
  Call clause:
  Action:
    Exit routemap
BGP:
route-map: RMAP_UPPER Invoked: 13 Optimization: enabled Processed Change: false
 permit, sequence 10 Invoked 7
  Match clauses:
    ip address prefix-list PLIST_UPPER
  Set clauses:
    local-preference 400
  Call clause:
  Action:
    Exit routemap

I expect the LOCAL_PREF attribute to be set to the configured value when the route for 10.0.0.0/16 is advertised.
So this config should affect the bestpath selection on C1.
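My expectation can be modeled roughly as follows: a hypothetical Python sketch of rule 2 (Local Preference Check), assuming the weights are equal so rule 1 doesn't decide; the local-preference values 200/400 are the ones set by the route-maps:

```python
# A minimal sketch of bestpath rule 2 (Local Preference Check):
# when weights are equal, the path with the highest LOCAL_PREF wins.
def best_by_local_pref(paths):
    """Higher LOCAL_PREF wins."""
    return max(paths, key=lambda p: p["local_pref"])

paths = [
    {"nexthop": "10.0.255.2", "local_pref": 200},  # set by R2's route-map
    {"nexthop": "10.0.255.3", "local_pref": 400},  # set by R3's route-map
]
print(best_by_local_pref(paths)["nexthop"])  # 10.0.255.3
```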

$ docker exec C1 vtysh -c 'sh bgp ipv4 unicast'
BGP table version is 1, local router ID is 4.4.4.4, vrf id 0
Default local pref 100, local AS 65002
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*>i10.0.0.0/16      10.0.255.3               0    400      0 ?
* i                 10.0.255.2               0    200      0 ?

Displayed 1 routes and 2 total paths
$ docker exec C1 vtysh -c 'sh ip route'
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

B>  10.0.0.0/16 [200/0] via 10.0.255.3 (recursive), weight 1, 00:00:19
  *                       via 10.0.1.2, net2, weight 1, 00:00:19
C>* 10.0.1.0/24 is directly connected, net2, 00:00:21
S>* 10.0.255.2/32 [1/0] via 10.0.1.1, net2, weight 1, 00:00:21
S>* 10.0.255.3/32 [1/0] via 10.0.1.2, net2, weight 1, 00:00:21
C>* 10.0.255.4/32 is directly connected, lo, 00:00:21

It works as expected!
One weak point remains: I don't fully understand how the NEXT_HOP attribute works with recursive lookup.
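As far as I understand it, the (recursive) flag in the output above means that zebra resolves the BGP next hop through another RIB entry before installing the route into the FIB. Here's a rough Python model of that idea; this is my own assumption about the mechanism, not zebra's actual code:

```python
import ipaddress

# A rough model of recursive next-hop resolution: the BGP next hop
# 10.0.255.3 is not on a connected subnet, so we look it up in the
# RIB and install the resolved next hop (10.0.1.2 via net2) instead.
rib = {
    "10.0.1.0/24":   {"type": "connected", "interface": "net2"},
    "10.0.255.3/32": {"type": "static", "via": "10.0.1.2"},
}

def resolve(nexthop):
    addr = ipaddress.ip_address(nexthop)
    # longest-prefix match over the RIB
    matches = [(p, r) for p, r in rib.items()
               if addr in ipaddress.ip_network(p)]
    prefix, route = max(
        matches, key=lambda m: ipaddress.ip_network(m[0]).prefixlen)
    if route["type"] == "connected":
        return nexthop, route["interface"]  # directly reachable
    return resolve(route["via"])            # recurse one level down

print(resolve("10.0.255.3"))  # ('10.0.1.2', 'net2'), matching the FIB entry
```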

Local Route Check

Finally, I've reached the third rule, the Local Route Check.
This rule prioritizes static/aggregate/redistributed routes.
It's quite simple!
In fact, I've already confirmed this rule.
Let's check R2's RIB using the Local Preference Check specification.

$ docker exec R2 vtysh -c 'sh bgp ipv4 unicast'
BGP table version is 3, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 65002
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
* i10.0.0.0/16      10.0.255.3               0    400      0 ?
*>                  0.0.0.0                  0         32768 ?
*> 10.0.1.0/24      0.0.0.0                  0         32768 ?
*> 10.0.255.2/32    0.0.0.0                  0         32768 ?

Displayed 3 routes and 4 total paths

Here R2 has two routes for 10.0.0.0/16: one learned via iBGP and one redistributed locally.

The route selected as bestpath is the latter.

If you want the received route to be selected as bestpath,
you should set its weight higher than 32768.
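This interaction can be sketched as follows. The weight value 40000 is just a hypothetical example; the model assumes FRR's documented behavior that the weight check (rule 1) runs before the local route check (rule 3), and that locally originated routes get weight 32768 by default:

```python
# Sketch: rule 1 (weight) is evaluated before rule 3 (local route),
# and FRR gives locally originated routes weight 32768 by default.
LOCAL_WEIGHT = 32768

def compare(a, b):
    """Return the preferred path, checking weight first, then local origin."""
    if a["weight"] != b["weight"]:
        return a if a["weight"] > b["weight"] else b
    if a["local"] != b["local"]:
        return a if a["local"] else b
    return a  # further rules (AS path length, etc.) would apply here

local = {"name": "redistributed", "weight": LOCAL_WEIGHT, "local": True}
learned = {"name": "from iBGP", "weight": 0, "local": False}
boosted = {"name": "from iBGP, set weight 40000", "weight": 40000, "local": False}

print(compare(local, learned)["name"])  # redistributed
print(compare(local, boosted)["name"])  # from iBGP, set weight 40000
```

So raising the received route's weight above 32768 lets rule 1 decide before the local route rule is ever reached.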

Conclusion

Today I briefly recapped the first few rules in the precedence of BGP bestpath selection.
But BGP bestpath selection uses more rules beyond these, such as AS-path length, origin, MED, eBGP-over-iBGP preference, and IGP metric checks.

Someday I may research those rules to understand more complicated routing.
I certainly don't know everything about BGP yet.