EXOS nodes do not see each other in lab, and sharing LACP LAG is flapping
Posted: Fri Oct 01, 2021 9:00 pm
Hello guys!
I also captured traffic on port 10 of vCore-3 and see that LACP PDUs are received and transmitted only intermittently, not at a steady interval.
The watchdog errors shown further down appear randomly. They do not depend on changes to the EXOS configuration or on running show commands: no configuration changes were being made, and the error can still appear on either node.
Does anybody have an idea what to do next to understand what the problem is and how to fix it?
I have Community EVE-NG 2.0.3-112 on a standalone ESXi 6.7 host (ProLiant BL460c Gen8) with kvm-ok. The eve-ng VM in ESXi has 24 vCPUs, 64 GB RAM and a 150 GB HDD.
I have two EXOS nodes (12 ports each) connected to each other through ports 10 and 11, like this:

The other nodes in the lab are stopped.
Ports 10-11 are configured as a sharing (LACP) group for the MLAG ISC, but the ISC peers cannot ping each other.
In addition, sharing LACP LAG 10 flaps between states persistently on both nodes.
Code:
* vCore-4.54 # show lacp lag 10
Lag        Actor    Actor   Partner            Partner  Partner  Agg    Actor
           Sys-Pri  Key     MAC                Sys-Pri  Key      Count  MAC
--------------------------------------------------------------------------------
10         0        0x03f2  00:00:00:00:00:00  0        0x0000   0      50:00:00:04:00:00
Port list:
Member     Port      Rx          Sel         Mux       Actor     Partner
Port       Priority  State       Logic       State     Flags     Port
--------------------------------------------------------------------------------
10         0         Initialize  Unselected  Detached  A-G-----  0
11         0         Initialize  Unselected  Detached  A-G-----  0
================================================================================
Actor Flags: A-Activity, T-Timeout, G-Aggregation, S-Synchronization
             C-Collecting, D-Distributing, F-Defaulted, E-Expired
* vCore-4.55 # show lacp lag 10
Lag        Actor    Actor   Partner            Partner  Partner  Agg    Actor
           Sys-Pri  Key     MAC                Sys-Pri  Key      Count  MAC
--------------------------------------------------------------------------------
10         0        0x03f2  50:00:00:03:00:00  0        0x03f2   0      50:00:00:04:00:00
Port list:
Member     Port      Rx          Sel         Mux       Actor     Partner
Port       Priority  State       Logic       State     Flags     Port
--------------------------------------------------------------------------------
10         0         Current     Selected    Waiting   A-G-----  1011
11         0         Current     Selected    Waiting   A-G-----  1010
================================================================================
Actor Flags: A-Activity, T-Timeout, G-Aggregation, S-Synchronization
             C-Collecting, D-Distributing, F-Defaulted, E-Expired
* vCore-4.56 # show lacp lag 10
Lag        Actor    Actor   Partner            Partner  Partner  Agg    Actor
           Sys-Pri  Key     MAC                Sys-Pri  Key      Count  MAC
--------------------------------------------------------------------------------
10         0        0x03f2  50:00:00:03:00:00  0        0x03f2   0      50:00:00:04:00:00
Port list:
Member     Port      Rx          Sel         Mux       Actor     Partner
Port       Priority  State       Logic       State     Flags     Port
--------------------------------------------------------------------------------
10         0         Current     Selected    Attached  A-GS----  1011
11         0         Idle        Unselected  Detached  --------  0
================================================================================
Actor Flags: A-Activity, T-Timeout, G-Aggregation, S-Synchronization
             C-Collecting, D-Distributing, F-Defaulted, E-Expired
* vCore-4.56 # show lacp lag 10
Lag        Actor    Actor   Partner            Partner  Partner  Agg    Actor
           Sys-Pri  Key     MAC                Sys-Pri  Key      Count  MAC
--------------------------------------------------------------------------------
10         0        0x03f2  50:00:00:03:00:00  0        0x03f2   0      50:00:00:04:00:00
Port list:
Member     Port      Rx          Sel         Mux       Actor     Partner
Port       Priority  State       Logic       State     Flags     Port
--------------------------------------------------------------------------------
10         0         Current     Selected    Attached  A-GS----  1011
11         0         Current     Selected    Attached  A-GS----  1010
================================================================================
Actor Flags: A-Activity, T-Timeout, G-Aggregation, S-Synchronization
             C-Collecting, D-Distributing, F-Defaulted, E-Expired
* vCore-4.56 #
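To make the flag columns in the outputs above easier to compare, here is a minimal Python sketch that decodes an Actor Flags string using the legend printed by the switch. It assumes the eight positions follow the legend order (A, T, G, S, C, D, F, E); the function names are my own and not part of EXOS.

```python
# Legend order taken from the "Actor Flags" lines in the switch output.
# Assumption: each of the 8 positions corresponds to one legend entry, in order.
FLAG_LEGEND = [
    ("A", "Activity"),
    ("T", "Timeout"),
    ("G", "Aggregation"),
    ("S", "Synchronization"),
    ("C", "Collecting"),
    ("D", "Distributing"),
    ("F", "Defaulted"),
    ("E", "Expired"),
]


def decode_actor_flags(flags):
    """Return the names of the flags set in a string such as 'A-GS----'."""
    if len(flags) != len(FLAG_LEGEND):
        raise ValueError(f"expected {len(FLAG_LEGEND)} characters, got {len(flags)}")
    names = []
    for char, (letter, name) in zip(flags, FLAG_LEGEND):
        if char == letter:
            names.append(name)
        elif char != "-":
            raise ValueError(f"unexpected character {char!r} in {flags!r}")
    return names


def is_forwarding(flags):
    """True only when both Collecting and Distributing are set."""
    names = decode_actor_flags(flags)
    return "Collecting" in names and "Distributing" in names


if __name__ == "__main__":
    # Flag strings copied from the four snapshots above.
    for sample in ("A-G-----", "A-GS----", "--------"):
        print(sample, decode_actor_flags(sample), is_forwarding(sample))
```

None of the flag strings in the snapshots has C or D set, so even in the last snapshot (both ports Attached) the ports have not reached the collecting/distributing stage.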
Here is my PCAP: https://www.cloudshark.org/captures/fdc3a6935595
Also, the errors below sometimes appear on both nodes, alternating between them:
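To put a number on how irregular the LACPDUs in the capture are, the frame timestamps could be exported and checked for gaps. The sketch below is only an illustration: the timestamps are made up, not taken from my PCAP, and the tolerance factor is an arbitrary choice. The 1 s and 30 s values are the standard LACP fast and slow periodic times. The Timeout flag is not set in the outputs above, which would suggest the slow rate, but that is my assumption.

```python
# Standard LACP periodic transmission times, in seconds.
FAST_PERIOD_S = 1.0
SLOW_PERIOD_S = 30.0


def find_gaps(timestamps, period_s, tolerance=1.5):
    """Return (start, end, gap) for consecutive frames spaced more than
    period_s * tolerance apart. `tolerance` is an arbitrary margin."""
    ordered = sorted(timestamps)
    limit = period_s * tolerance
    return [(a, b, b - a) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


if __name__ == "__main__":
    # Hypothetical arrival times (seconds) of LACPDUs from one port.
    sample = [0.0, 30.1, 60.0, 150.2, 180.3]
    for start, end, gap in find_gaps(sample, SLOW_PERIOD_S):
        print(f"gap of {gap:.1f} s between {start:.1f} s and {end:.1f} s")
```

A gap of roughly three periods or more is long enough for the partner information to time out, which would be consistent with ports dropping back out of the Current state.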
Code:
[ 1218.768289] CPU 0: soft watchdog expiration warning at 0010:ffffffffc00e7b9b (getTxPifFromLif+0x2b/0x770 [exvlan]) for 785 seconds, process vsm (767)
[ 1223.968240] CPU 0: soft watchdog expiration warning at 0010:ffffffffc00e7b8f (getTxPifFromLif+0x1f/0x770 [exvlan]) for 791 seconds, process vsm (767)
[ 1228.968244] CPU 0: soft watchdog expiration warning at 0010:ffffffffc00f7d08 (getNextPifOnLif+0x18/0x50 [exvlan]) for 796 seconds, process vsm (767)
[ 1233.968246] CPU 0: soft watchdog expiration warning at 0010:ffffffffc00e7b94 (getTxPifFromLif+0x24/0x770 [exvlan]) for 801 seconds, process vsm (767)
[ 2425.912379] CPU 0: soft watchdog expiration warning at 0010:ffffffffc01dfd31 (getNextPifOnLif+0x41/0x50 [exvlan]) for 37 seconds, process swapper/0 (0)
[ 2431.112383] CPU 0: soft watchdog expiration warning at 0010:ffffffffc01dfcda (getFirstPifOnLif+0x1a/0x30 [exvlan]) for 42 seconds, process swapper/0 (0)
[ 2436.112638] CPU 0: soft watchdog expiration warning at 0010:ffffffffc01dfcf0 (getNextPifOnLif+0x0/0x50 [exvlan]) for 47 seconds, process swapper/0 (0)
[ 2441.312367] CPU 0: soft watchdog expiration warning at 0010:ffffffffc01cfb8f (getTxPifFromLif+0x1f/0x770 [exvlan]) for 53 seconds, process swapper/0 (0)
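To see at a glance where the guest kernel is stuck, the watchdog lines above can be tallied by function and process. This is a rough Python sketch: the regular expression is written only against the lines pasted here and may need adjusting for other message formats, and the helper names are my own.

```python
import re
from collections import Counter

# Pattern written against the watchdog lines pasted above (an assumption
# about the format, not an official EXOS log grammar).
WATCHDOG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+CPU (?P<cpu>\d+): soft watchdog expiration warning at "
    r"\S+ \((?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(?P<module>\w+)\]\) "
    r"for (?P<secs>\d+) seconds, process (?P<proc>\S+) \((?P<pid>\d+)\)"
)


def parse_watchdog(lines):
    """Extract one dict per matching watchdog warning line."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            event = match.groupdict()
            event["ts"] = float(event["ts"])
            event["secs"] = int(event["secs"])
            events.append(event)
    return events


def summarize(events):
    """Count hits per (module, function) and the longest stall per process."""
    by_func = Counter((e["module"], e["func"]) for e in events)
    longest = {}
    for e in events:
        longest[e["proc"]] = max(longest.get(e["proc"], 0), e["secs"])
    return by_func, longest


if __name__ == "__main__":
    sample = [
        "[ 1218.768289] CPU 0: soft watchdog expiration warning at 0010:ffffffffc00e7b9b (getTxPifFromLif+0x2b/0x770 [exvlan]) for 785 seconds, process vsm (767)",
        "[ 1228.968244] CPU 0: soft watchdog expiration warning at 0010:ffffffffc00f7d08 (getNextPifOnLif+0x18/0x50 [exvlan]) for 796 seconds, process vsm (767)",
        "[ 2425.912379] CPU 0: soft watchdog expiration warning at 0010:ffffffffc01dfd31 (getNextPifOnLif+0x41/0x50 [exvlan]) for 37 seconds, process swapper/0 (0)",
    ]
    by_func, longest = summarize(parse_watchdog(sample))
    print(by_func)
    print(longest)
```

In the lines above every hit lands in the exvlan module (getTxPifFromLif, getNextPifOnLif, getFirstPifOnLif), which looks like CPU 0 of the guest looping over port/interface lists rather than a random crash.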
The overall load on the EVE host is insignificant.

But one vCPU is loaded to 100% when the error appears on a node:
Code:
root@eve-ng:/opt/unetlab/data/Logs# top
top - 23:00:07 up 7 days, 8:28, 2 users, load average: 1.12, 1.13, 1.10
Tasks: 401 total, 2 running, 233 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 0.0 us, 1.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 0.0 us, 0.7 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17 : 0.7 us, 1.0 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu18 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu20 : 0.3 us, 0.7 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 3.4 us, 8.2 sy, 0.0 ni, 88.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu22 : 0.0 us, 0.7 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu23 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 65965820 total, 47423740 free, 1457700 used, 17084380 buff/cache
KiB Swap: 1097724 total, 1097724 free, 0 used. 63651316 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3621 root 20 0 2960272 470992 26156 S 99.7 0.7 39:11.66 qemu-system-x86
22524 root 20 0 2960272 477268 26048 S 13.5 0.7 12:26.20 qemu-system-x86
350 root 25 5 0 0 0 S 0.7 0.0 59:25.44 uksmd
32267 root 20 0 42356 3924 3076 R 0.7 0.0 0:00.65 top
11 root 20 0 0 0 0 I 0.3 0.0 6:58.29 rcu_sched
2781 mysql 20 0 2296812 86424 20376 S 0.3 0.1 19:09.24 mysqld
3641 root 20 0 0 0 0 I 0.3 0.0 2:50.23 kworker/11:1-ev
18189 root 20 0 0 0 0 I 0.3 0.0 1:37.29 kworker/20:0-ev
18823 root 20 0 0 0 0 I 0.3 0.0 0:10.73 kworker/15:1-ev
18893 root 20 0 0 0 0 I 0.3 0.0 1:31.29 kworker/21:0-ev
18994 root 20 0 0 0 0 I 0.3 0.0 1:05.22 kworker/16:2-ev
19381 root 20 0 0 0 0 I 0.3 0.0 1:42.03 kworker/22:2-ev
19673 root 20 0 0 0 0 I 0.3 0.0 2:48.97 kworker/8:2-eve
1 root 20 0 37820 5708 3964 S 0.0 0.0 0:06.00 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.24 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H-kb
7 root 20 0 0 0 0 I 0.0 0.0 0:00.00 kworker/u48:0-s
9 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
10 root 20 0 0 0 0 S 0.0 0.0 0:00.56 ksoftirqd/0
12 root rt 0 0 0 0 S 0.0 0.0 0:02.44 migration/0
13 root -51 0 0 0 0 S 0.0 0.0 0:00.00 idle_inject/0
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1
17 root -51 0 0 0 0 S 0.0 0.0 0:00.00 idle_inject/1
root@eve-ng:/opt/unetlab/data/Logs#
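To tie the busy qemu-system-x86 process (PID 3621 in the top output above) back to a lab node, one option is to read its command line from /proc. The sketch below assumes the QEMU process was started with a `-name` argument that identifies the node. I have not confirmed how EVE-NG sets it, so treat that as an assumption; the helper names are my own.

```python
from pathlib import Path


def read_cmdline(pid):
    """Return the argument list of a running process from /proc (Linux only)."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def qemu_name_from_args(args):
    """Return the value following '-name' in a QEMU argument list, or None."""
    for i, arg in enumerate(args[:-1]):
        if arg == "-name":
            return args[i + 1]
    return None


if __name__ == "__main__":
    # Hypothetical argument list; on the EVE host use read_cmdline(3621) instead.
    example = ["qemu-system-x86_64", "-nographic", "-name", "vCore-4", "-smp", "2"]
    print(qemu_name_from_args(example))
```

If the 100% process maps to the node that is printing the watchdog warnings at that moment, that would link the host-side CPU spike to the guest-side stall.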
Regards, aleksandr_vd