My Calico nodes are failing their ready check with the error:
calico/node is not ready: BIRD is not ready: BGP not established with [list of all my ips]
The Calico help page indicates that the easy thing to do here is:
Check that BGP connectivity between the two peers is allowed in the environment.
I’m not clear how, exactly, I would confirm this. My environment is on an AWS Snowball, so in principle it is similar to communicating between AWS instances, but everything is on a single piece of hardware.
All my nodes have the iptables rule -A INPUT -s [subnet] -i ens3 -p tcp -m tcp --dport 179 -j ACCEPT, which should open port 179, the port the Calico documentation says needs to be open. Testing with a Python echo server seems to indicate that communication between two of the nodes is working.
One wrinkle is that each node has two IPs, on two different subnets. I’m trying to establish the connection over the first subnet, but BIRD is defaulting to a list of IPs from the second subnet. I don’t know if there’s a way to force BIRD to use my first subnet. Should setting IP_AUTODETECTION_METHOD force this?
tl;dr - how can I confirm that my nodes have BGP connectivity?
You can also check BIRD’s running state to see how far it has got in establishing those peerings:
kubectl exec <calico-node-pod-name> -n <calico-node-pod-namespace> -- birdcl -s /var/run/calico/bird.ctl show protocols all
(This is a more under-the-hood version of calicoctl node status, and may provide a few more useful details.)
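If that output is long, a small filter can pull out just the peers that are not up. This is a minimal sketch, not a Calico tool: unestablished_bgp_peers is a hypothetical helper name, and it assumes the summary form of the command (show protocols, without all) prints the usual BIRD columns name, proto, table, state, since, info, with the BGP session state as the first word of info.

```shell
# Hypothetical helper (not part of Calico): reads `birdcl show protocols`
# output on stdin and prints the name and BGP state of every BGP protocol
# whose session is not Established.
# Assumes the column order: name proto table state since info.
unestablished_bgp_peers() {
  awk '$2 == "BGP" && $6 != "Established" { print $1, $6 }'
}

# Example usage (same placeholders as the kubectl exec command above):
#   kubectl exec <calico-node-pod-name> -n <calico-node-pod-namespace> -- \
#     birdcl -s /var/run/calico/bird.ctl show protocols | unestablished_bgp_peers
```

An empty result would suggest every BGP session on that node is Established; any line printed names a peer worth looking at in the full show protocols all output.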
If it all looks correct, except for sessions not being Established, something must be blocking the BGP traffic. You mentioned the iptables INPUT chain, but depending on your tech and setup it could also be:
iptables OUTPUT chain on the sending node
firewalld
nftables
AWS security groups, or something else that takes effect between the relevant nodes.
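One way to narrow this down is to test, from one node's host, whether a plain TCP connection to port 179 on the peer succeeds. A minimal sketch, assuming bash and the coreutils timeout command are available on the node; tcp_port_open is a hypothetical helper name, and <peer-ip> is a placeholder for the address BIRD is trying to peer with.

```shell
# Hypothetical helper: returns 0 if a TCP connection to host:port can be
# opened within the timeout, non-zero otherwise (refused, filtered, timed out).
# Arguments: host, port (default 179, the BGP port), timeout in seconds (default 3).
tcp_port_open() {
  # bash treats /dev/tcp/HOST/PORT in a redirection as "open a TCP socket".
  # Inside the child shell, $0 is the host and $1 is the port.
  timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "${2:-179}" 2>/dev/null
}

# Example usage, run on one node against another node's address:
#   tcp_port_open <peer-ip> 179 && echo "port 179 reachable" || echo "port 179 blocked or closed"
```

Since BIRD on the peer normally listens on port 179, a success suggests the path is open for that address. A timeout points more towards a filter silently dropping packets, while an immediate failure usually means the connection was refused or rejected. Testing both of each node's addresses should show whether only one subnet is affected.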
Your calico-nodes are in calico-system, which tells me that you’re using an operator install. You cannot update the calico-node daemonset directly in this setup; you’ll need to configure the operator to do it via the Installation CRD.
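You want to set the node address autodetection field, nodeAddressAutodetectionV4 under spec.calicoNetwork, which is the operator's equivalent of IP_AUTODETECTION_METHOD.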
Once you update the Installation resource, the operator will roll out the change to your calico-nodes (so there is no need to restart anything manually).
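As an illustration, the change can be made with a merge patch. This is a sketch under assumptions: that the Installation resource has the usual name default, and that ens3 (the interface in the iptables rule from the question) is the interface on the subnet you want. autodetect_patch is a hypothetical helper that only builds the JSON.

```shell
# Hypothetical helper: prints a JSON merge patch that sets
# spec.calicoNetwork.nodeAddressAutodetectionV4.interface on the Installation.
# It also sets firstFound to null, which removes that key if it is present,
# since only one autodetection method may be set at a time.
# Argument: an interface name (or regex), for example ens3.
autodetect_patch() {
  printf '{"spec":{"calicoNetwork":{"nodeAddressAutodetectionV4":{"firstFound":null,"interface":"%s"}}}}\n' "$1"
}

# Example usage, assuming the Installation resource is named "default":
#   kubectl patch installation default --type=merge -p "$(autodetect_patch ens3)"
```

If both addresses sit on the same interface, the same field also accepts a cidrs list, which selects the address by subnet rather than by interface name.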