Strange IOL behavior
Posted: Thu Jan 03, 2019 12:00 am
Hi,
Could someone help me understand the EVE/IOL behavior below, please?
I'm using the latest EVE community edition, bare-metal install / 4 cores / 8 threads @ 3.9 GHz / 16 GB.
All IOL nodes have 1024 NVRAM / 1024 RAM.
All nodes are already present in the map.
All run L3 images.
I start 20 nodes of IOL ***version A***, but I can't start a 21st node with IOS ***version B*** (the node stops after 3-5 seconds).
The wrapper log contains these messages:
2/0 23:6:14.466 INF Tennant_id = 0
2/0 23:6:14.466 INF Device_id = 4
2/0 23:6:14.466 INF NETMAP file created.
2/0 23:6:14.466 INF TS configured.
2/0 23:6:14.466 INF TAP interface configured (s=8, n=vunl0_4_0).
2/0 23:6:14.466 INF TAP interface configured (s=10, n=vunl0_4_16).
2/0 23:6:14.466 INF TAP interface configured (s=12, n=vunl0_4_32).
2/0 23:6:14.466 INF TAP interface configured (s=14, n=vunl0_4_48).
2/0 23:6:14.466 INF Adding subprocess stdout descriptor (5).
2/0 23:6:14.466 INF Adding telnet socket descriptor (7).
2/0 23:6:14.466 INF Adding TAP interface descriptor (8).
2/0 23:6:14.466 INF Adding TAP interface descriptor (10).
2/0 23:6:14.466 INF Adding TAP interface descriptor (12).
2/0 23:6:14.466 INF Adding TAP interface descriptor (14).
2/0 23:6:17.466 INF Local (6) and remote (3) AF_UNIX are configured.
2/0 23:6:17.466 INF Adding wrapper socket descriptor (6).
2/0 23:6:17.466 INF Child is no more running.
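For anyone who wants to compare several of these wrapper logs, here is a rough Python sketch that measures how long the child process lived before the "Child is no more running" line. It assumes every line follows the layout shown above (an opaque prefix such as `2/0`, a `H:M:S.mmm` time, a level, then the message). The function names are my own, not part of EVE, and it does not handle a log that crosses midnight.

```python
import re

# Assumed layout, taken from the excerpt above:
#   "<prefix> <H:M:S.mmm> <LEVEL> <message>"
# The meaning of the prefix ("2/0") is unknown, so it is kept opaque.
LINE_RE = re.compile(r"^(\S+)\s+(\d+):(\d+):(\d+)\.(\d+)\s+(\w+)\s+(.*)$")


def parse_line(line):
    """Return (seconds_since_midnight, level, message), or None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    _prefix, h, mi, s, frac, level, msg = m.groups()
    seconds = int(h) * 3600 + int(mi) * 60 + int(s) + float("0." + frac)
    return seconds, level, msg


def seconds_until_child_exit(lines):
    """Time from the first parsed line to the 'Child is no more running' line."""
    parsed = [p for p in (parse_line(l) for l in lines) if p is not None]
    if not parsed:
        return None
    start = parsed[0][0]
    for seconds, _level, msg in parsed:
        if "Child is no more running" in msg:
            return seconds - start
    return None


# Three lines copied from the log above.
sample = [
    "2/0 23:6:14.466 INF Tennant_id = 0",
    "2/0 23:6:17.466 INF Local (6) and remote (3) AF_UNIX are configured.",
    "2/0 23:6:17.466 INF Child is no more running.",
]

if __name__ == "__main__":
    print(seconds_until_child_exit(sample))
```

On the lines above it reports a lifetime of about 3 seconds, which matches what I see in the UI.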
I tried wiping and restarting the node, without success.
However, I can start 5 additional nodes of IOS ***version C***, so in total I have 25 nodes started.
Now, if I stop all 25 running nodes, I'm able to start 25 nodes of version B. But the 21st node (in version B) that didn't start initially still refuses to start.
Now, if I stop all the nodes, this 21st node starts. And I can also start the 20 nodes in version A.
Any idea why?
Even more strange (to me):
If I have my 25 nodes started (20 in version A + 5 in version C), I can ***create*** 25 additional nodes with version B, and they all start!
And the cherry on the cake:
IOL nodes: 50
CPU used : 9
Memory : 34
Swap: 1
Disk : 71
Is there a way to troubleshoot in more depth what is happening (files to check, etc.)?