Calico baremetal kubespray tunl0 and node-proxy not connecting

Hello,

We are using Kubespray and Ansible to connect dedicated hardware to OpenStack virtual machines in OVH through a vRack (an OVH tagged VLAN). It is based on this repo, and I am more than happy to push up any changes and collaborative fixes:

For some reason, all of the virtual machine nodes connect and BGP-peer properly. However, on the dedicated machines, BIRD fails to start, no tunl0 tunnel interface is created, and calico-node cannot reach the service IP at 10.233.0.1 (set up by kube-proxy).

This results in timeout errors connecting to the KDD (Kubernetes API) datastore in the logs of the calico-node pods, but only on the dedicated machines.
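For anyone following along, the logs and events below were pulled with something like the following (the pod name is a placeholder for whichever calico-node pod landed on the affected dedicated machine):

kubectl -n kube-system get pods -o wide | grep calico-node
kubectl -n kube-system logs calico-node-xxxxx -c calico-node
kubectl -n kube-system describe pod calico-node-xxxxx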

I have excluded potential high-yield issues, including:

  • Interface selection - all network interfaces are named uniformly and IP_AUTODETECTION_METHOD is set to interface=INTERFACE_NAME (see the example commands after this list)
  • Permissions issues - the Ansible run user is the same passwordless-sudo non-root user (ubuntu) on every node
  • Node connectivity / VLAN - all nodes can reach each other over the private interface via ping and nc to port 179 (BGP)
  • Calico startup
    • node status shows Calico started, with no peers, on the dedicated machines
    • node status shows Calico started, with peers, on the VM machines
  • Overlapping CIDRs
    • private network is 192.168.0.0/16
    • kube_service_addresses: 10.233.0.0/18
    • kube_pods_subnet: 10.233.64.0/18
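For reference, those checks were done with commands roughly like these (ens3 and the peer IP come from the dumps below; calicoctl is assumed to be available on the nodes, and the kubespray variable name is from memory):

# Pin Calico IP autodetection to the private interface (calico-node env var;
# in kubespray this should correspond to calico_ip_auto_method)
IP_AUTODETECTION_METHOD=interface=ens3

# Private-network reachability and BGP port check from one node to a peer
ping -c 3 192.168.3.202
nc -zv 192.168.3.202 179

# BGP peering status on each node
sudo calicoctl node status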

Baremetal machine IPs and routes:

IP address for eno1: 51.81.152.64
IP address for ens3: 192.168.3.202
IP address for docker0: 172.17.0.1
IP address for kube-ipvs0: 10.233.0.1
IP address for nodelocaldns: 169.254.25.10

default via 51.81.152.254 dev eno1 proto static
51.81.152.0/24 dev eno1 proto kernel scope link src 51.81.152.64
169.254.169.254 via 192.168.0.2 dev ens3
169.254.169.254 via 192.168.0.2 dev ens3 proto dhcp src 192.168.3.202 metric 1024
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.0.0/16 dev ens3 proto kernel scope link src 192.168.3.202

Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 51.81.152.254 0.0.0.0 UG 0 0 0 eno1
51.81.152.0 0.0.0.0 255.255.255.0 U 0 0 0 eno1
169.254.169.254 192.168.0.2 255.255.255.255 UGH 0 0 0 ens3
169.254.169.254 192.168.0.2 255.255.255.255 UGH 1024 0 0 ens3
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ens3

VM IPs and routes:

IP address for ens3: 192.168.1.167
IP address for ens4: 51.81.153.45
IP address for docker0: 172.17.0.1
IP address for kube-ipvs0: 10.233.0.1
IP address for tunl0: 10.233.82.0
IP address for nodelocaldns: 169.254.25.10

default via 51.81.153.1 dev ens4 proto dhcp src 51.81.153.45 metric 1024
10.233.74.0/24 via 51.81.153.131 dev tunl0 proto bird onlink
10.233.81.0/24 via 51.81.153.154 dev tunl0 proto bird onlink
blackhole 10.233.82.0/24 proto bird
10.233.82.1 dev cali189369951bf scope link
10.233.84.0/24 via 51.81.153.25 dev tunl0 proto bird onlink
51.81.153.1 dev ens4 proto dhcp scope link src 51.81.153.45 metric 1024
169.254.169.254 via 192.168.0.2 dev ens3 proto dhcp src 192.168.1.167 metric 2048
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.0.0/16 dev ens3 proto kernel scope link src 192.168.1.167

Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 51.81.153.1 0.0.0.0 UG 1024 0 0 ens4
10.233.74.0 51.81.153.131 255.255.255.0 UG 0 0 0 tunl0
10.233.81.0 51.81.153.154 255.255.255.0 UG 0 0 0 tunl0
10.233.82.0 0.0.0.0 255.255.255.0 U 0 0 0 *
10.233.82.1 0.0.0.0 255.255.255.255 UH 0 0 0 cali189369951bf
10.233.84.0 51.81.153.25 255.255.255.0 UG 0 0 0 tunl0
51.81.153.1 0.0.0.0 255.255.255.255 UH 1024 0 0 ens4
169.254.169.254 192.168.0.2 255.255.255.255 UGH 2048 0 0 ens3
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ens3

kubectl describe output (events) for the calico-node pod on the baremetal machine:

Normal Started 18m kubelet Started container calico-node
Warning Unhealthy 18m kubelet Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp 127.0.0.1:9099: connect: connection refused
Warning Unhealthy 17m (x3 over 17m) kubelet Liveness probe failed: calico/node is not ready: bird/confd is not live: exit status 1
Warning Unhealthy 17m (x5 over 18m) kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Failed to stat() nodename file: stat /var/lib/calico/nodename: no such file or directory
Warning DNSConfigForming 3m7s (x63 over 18m) kubelet Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 213.186.33.99 8.8.4.4 8.8.8.8

Events for the failing calico-kube-controllers pod:

Warning Unhealthy 26m (x3 over 27m) kubelet Readiness probe failed: Failed to read status file status.json: open status.json: no such file or directory

Logs for the failing calico-node pod:

(It won't let me post more than two links, so in the log below DATASTORE_URL = https://10.233.0.1:443/api/v1/nodes/foo.)

2021-01-02 22:37:18.577 [INFO][8] startup/startup.go 376: Early log level set to info
2021-01-02 22:37:18.578 [INFO][8] startup/startup.go 392: Using NODENAME environment for node name
2021-01-02 22:37:18.578 [INFO][8] startup/startup.go 404: Determined node name: k8s-test-baremetal-worker-1
2021-01-02 22:37:18.582 [INFO][8] startup/startup.go 436: Checking datastore connection
2021-01-02 22:37:28.584 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get DATASTORE_URL: TLS handshake timeout
2021-01-02 22:37:39.586 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get DATASTORE_URL net/http: TLS handshake timeout
2021-01-02 22:37:50.588 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get DATASTORE_URL: net/http: TLS handshake timeout
2021-01-02 22:38:01.591 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get DATASTORE_URL: net/http: TLS handshake timeout
2021-01-02 22:38:12.593 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get DATASTORE_URL: net/http: TLS handshake timeout

Based on inspecting both the baremetal and the VM nodes, I believe what is going wrong is that the iptables rules that route the service IPs are not being generated on the baremetal machine. I believe these get inserted by the BIRD or kube-proxy process after it starts, but I am not sure how, why, or when that is failing.
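One way to compare what has actually been programmed on a good VM node versus the baremetal node (kubespray defaults to IPVS mode here, hence the kube-ipvs0 interface; ipvsadm may need to be installed separately):

# IPVS: the apiserver ClusterIP should list the master endpoint(s)
sudo ipvsadm -Ln | grep -A 3 10.233.0.1

# NAT rules written by kube-proxy and Calico
sudo iptables-save -t nat | grep -E 'KUBE|cali' | head -n 50

# Does the service IP answer at all from this node?
curl -k --max-time 5 https://10.233.0.1/version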

Can anyone point me to any high-yield possibilities?


This is interesting. Calico node is trying to connect to the kube-apiserver service IP at 10.233.0.1:443 and failing the TLS handshake. That service IP is set up by kube-proxy and is probably a NAT rule that changes 10.233.0.1 to the IP of your master. Is your TLS certificate valid for 10.233.0.1? Can you manually curl to the apiserver from the failing node?

Hi lwr20, thanks for the reply.

Good question. The TLS certificate is valid for both the public and private interfaces, checked from both the good node and the failed node. However, on the failed node I can't curl anything on 10.233.0.1, while on the good node it works normally. On the failed node the handshake times out after the ClientHello is sent, before any ServerHello comes back.
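For completeness, the handshake behaviour is easy to reproduce with openssl (the second command should be run against the master's API port directly; the address is a placeholder):

# On the bad node: the ClientHello goes out but the connection hangs before any ServerHello
openssl s_client -connect 10.233.0.1:443 </dev/null

# Check which names/IPs the apiserver certificate actually covers
openssl s_client -connect MASTER_PRIVATE_IP:6443 </dev/null 2>/dev/null \
  | openssl x509 -noout -text | grep -A 1 'Subject Alternative Name'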

I believe you are correct. I dumped the iptables rules from the good and bad nodes and can provide them; it's not straightforward to read, but those rules are not being generated on the bad node for whatever reason.

You raised an interesting question.

I changed the configuration for all the nodes to use the private interface IP and managed to get the tunnel up and the nodes peering, but after initialization they now won't connect to the API server, indicating a bad certificate. It might be a one-off bug. If it works I will definitely post the solution to GitHub.
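In case it helps anyone hitting the same bad-certificate error: if the apiserver cert simply doesn't list the private IPs as SANs, kubespray has a variable for adding extra SANs (variable name and file location from memory, the IP below is a placeholder, and the certs need to be regenerated, e.g. by re-running cluster.yml, for it to take effect):

# in the kubespray inventory group_vars (k8s-cluster.yml)
supplementary_addresses_in_ssl_keys:
  - 192.168.0.10   # placeholder: master's private IP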

Hi, I have exactly the same issue as you, trying to achieve what are probably the same goals with the same tools.

I was investigating Calico options to get maximum performance and found that the MTU differed between the VMs' internal interfaces and the baremetal machines: 9000 for the VMs and 1500 for the baremetal nodes. When I set the MTU to the same value everywhere, my worker node pods started successfully. I am not sure whether dropping from 9000 to 1500 is the right choice, but it works at least. There is even a guide for configuring baremetal servers with jumbo frames, so it is probably fine to set 9000 everywhere: Configuring Jumbo Frames in vRack | OVH Guides
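For anyone wanting to check this on their own nodes, a rough sketch (interface name from this thread; making the change persistent depends on the distro's network configuration, e.g. netplan on Ubuntu, and Calico's overlay MTU can also be set through kubespray, via calico_mtu if I remember the variable correctly):

# Check the current MTU on the private interface of each node
ip link show ens3 | grep mtu

# Temporarily align it (lost on reboot unless persisted in netplan/ifupdown)
sudo ip link set dev ens3 mtu 9000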

GL.
