I am running a 4-node K8s cluster locally with Vagrant on CentOS 7.9, installed with kubeadm. I'm using Kubernetes 1.19.3, Calico CNI 3.17.1, and containerd 1.4.3. I do not have Docker Engine installed.
I am encountering an issue when using a Kubernetes ingress network policy to allow requests between pods with a matching label. The firewall is configured using the information from Installing kubeadm | Kubernetes and Calico's Kubernetes requirements. I am using IP-in-IP encapsulation; I have also tried VXLAN encapsulation and hit the same issue.
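For reference, the relevant IP pool is configured roughly like this (I'm reciting the spec from memory, so treat the CIDR and natOutgoing values as assumptions; for the VXLAN test I set ipipMode: Never and vxlanMode: Always):
calicoctl get ippool default-ipv4-ippool -o yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  blockSize: 26
  cidr: 172.16.0.0/16   # assumed; the pod IPs below are in this range
  ipipMode: Always
  natOutgoing: true
  vxlanMode: Never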
If the policy is not applied, inter-pod communication works fine with the firewall running. Once the policy is applied, inter-pod communication fails. If I stop the firewall on the host of the requesting pod, communication works; if I stop the firewall on the host of the target pod, communication still fails. I believe the policy itself is correct: I stopped the firewall and did both negative and positive testing. I also read the topic [Solved] issue with kubernetes network policy and calico CNI - ingress namespaceSelector not working - #3 by Timo, as it seemed similar, and tried removing the masquerade setting, but that made things worse: all inter-pod communication began failing.
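For that masquerade test I toggled the setting with firewall-cmd, along these lines:
firewall-cmd --permanent --remove-masquerade
firewall-cmd --reload
# test, then restore:
firewall-cmd --permanent --add-masquerade
firewall-cmd --reload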
The firewall rules on my worker nodes look like this:
[root@k8s-worker-2 ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0 eth1
  sources:
  services: dhcpv6-client ssh
  ports: 10250/tcp 30000-32767/tcp 30000-32767/udp 53/udp 53/tcp 179/tcp 179/udp 4789/udp 5473/tcp
  protocols:
  masquerade: yes
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
        rule protocol value="4" accept
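For completeness, this ruleset was built with commands roughly like the following (reconstructed from the listing above, so the exact invocations may have differed):
firewall-cmd --permanent --add-port=10250/tcp
firewall-cmd --permanent --add-port=30000-32767/tcp
firewall-cmd --permanent --add-port=30000-32767/udp
firewall-cmd --permanent --add-port=53/tcp
firewall-cmd --permanent --add-port=53/udp
firewall-cmd --permanent --add-port=179/tcp
firewall-cmd --permanent --add-port=179/udp
firewall-cmd --permanent --add-port=4789/udp
firewall-cmd --permanent --add-port=5473/tcp
firewall-cmd --permanent --add-masquerade
# protocol 4 = IP-in-IP, needed for Calico's IPIP encapsulation
firewall-cmd --permanent --add-rich-rule='rule protocol value="4" accept'
firewall-cmd --reload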
This is how I reproduce the issue. I start by creating a namespace: kubectl create ns np
Then I apply the following DaemonSet, which runs an app on each node that listens on port 8080 and returns the pod's name and version.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  labels:
    app: deployment
  name: foo
  namespace: np
spec:
  selector:
    matchLabels:
      app: deployment
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: deployment
    spec:
      containers:
      - image: dgkanatsios/simpleapp
        name: simpleapp
        ports:
        - containerPort: 8080
        resources: {}
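To confirm the pods come up before continuing, I check with something like:
kubectl -n np rollout status daemonset/foo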
Then I apply another DaemonSet that puts a netshoot pod on each node. These will be used for sending requests to the foo pods.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: netshoot-daemonset
  namespace: np
  labels:
    app: netshoot
spec:
  selector:
    matchLabels:
      name: netshoot-daemonset
  template:
    metadata:
      labels:
        name: netshoot-daemonset
    spec:
      containers:
      - command: ["tail"]
        args: ["-f", "/dev/null"]
        image: nicolaka/netshoot
        name: netshoot-pod
Then run kubectl -n np get po -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
foo-8s8mc 1/1 Running 0 9m1s 172.16.140.1 k8s-worker-2 <none> <none>
foo-m24vf 1/1 Running 0 9m1s 172.16.182.65 k8s-master-2 <none> <none>
foo-v62mm 1/1 Running 0 9m1s 172.16.230.2 k8s-worker-1 <none> <none>
netshoot-daemonset-c86d5 1/1 Running 0 8m55s 172.16.182.66 k8s-master-2 <none> <none>
netshoot-daemonset-ftnjx 1/1 Running 0 8m55s 172.16.140.2 k8s-worker-2 <none> <none>
netshoot-daemonset-qzcsn 1/1 Running 0 8m55s 172.16.230.3 k8s-worker-1 <none> <none>
To send a request from the netshoot pod on k8s-worker-1 to the foo pod running on k8s-worker-2, I use: k exec -it netshoot-daemonset-qzcsn -- wget -qO- 172.16.140.1:8080 --timeout=2
With the firewall started but no network policy I get this result:
Hello world from foo-ksjzp and version 2.0
I tested communication from the netshoot pod on each node to each foo pod in the same fashion; there were no issues on any path. So I apply the ingress policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ingress
  namespace: np
spec:
  podSelector:
    matchLabels:
      app: "deployment"
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          name: netshoot-daemonset
    ports:
    - protocol: TCP
      port: 8080
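In case it helps, I believe the policy as Calico renders it can be dumped with calicoctl (Kubernetes policies should show up with a knp.default. name prefix); I can post that output as well:
calicoctl get networkpolicy -n np -o yaml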
When I try the netshoot command again, it times out. I tried different timeouts (up to 10 seconds).
[root@k8s-master-1 network-policy]# k exec -it netshoot-daemonset-qzcsn -- wget -qO- 172.16.140.1:8080 --timeout=2
wget: download timed out
command terminated with exit code 1
I then stopped the firewalld service on k8s-worker-1, where the requesting netshoot pod is running. When I execute the command again, it succeeds. I then restarted firewalld on k8s-worker-1 and stopped it on k8s-worker-2, where the target foo pod is running. When I execute the command again, it still fails. When I delete the network policy, the requests work again with both firewalls started.
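For clarity, the exact firewall test sequence was along these lines:
# on k8s-worker-1 (node hosting the requesting netshoot pod)
systemctl stop firewalld     # request now succeeds
systemctl start firewalld
# on k8s-worker-2 (node hosting the target foo pod)
systemctl stop firewalld     # request still fails
systemctl start firewalld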
With the network policy removed and the firewalls running on both nodes, I started investigating the traffic between the pods. On k8s-worker-2 I used crictl pods and crictl inspectp to find the Calico interface the foo pod uses, then ran tcpdump on that interface and on the tunl0 interface. With the firewalls started and the policy removed, I see traffic on the tunl0 interface and on the pod's interface. After applying the policy, I see traffic on tunl0 but not on the pod's interface. The traffic on tunl0 looks like this:
[root@k8s-worker-2 ~]# tcpdump -i tunl0 -vv -nn
tcpdump: listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
13:56:59.538718 IP (tos 0x0, ttl 63, id 5051, offset 0, flags [DF], proto TCP (6), length 60)
172.16.140.0.59224 > 172.16.230.9.8080: Flags [S], cksum 0xdef3 (correct), seq 11114561, win 28800, options [mss 1440,sackOK,TS val 44534778 ecr 0,nop,wscale 7], length 0
13:57:00.540011 IP (tos 0x0, ttl 63, id 5052, offset 0, flags [DF], proto TCP (6), length 60)
172.16.140.0.59224 > 172.16.230.9.8080: Flags [S], cksum 0xdb09 (correct), seq 11114561, win 28800, options [mss 1440,sackOK,TS val 44535780 ecr 0,nop,wscale 7], length 0
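For reference, this is roughly how I located the pod's host-side Calico interface before capturing (the sandbox ID and the cali* interface name below are placeholders):
crictl pods --name foo-8s8mc -q       # sandbox ID of the foo pod
crictl inspectp <SANDBOX_ID>          # shows the pod's network details
tcpdump -i <CALI_IFACE> -vv -nn       # capture on the matching cali* veth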
This is as far as I've gotten. I'm not yet proficient at debugging iptables, and I'm hoping I simply have something misconfigured. I do have another, older 4-node CentOS cluster running Kubernetes 1.18.1, Calico CNI 3.15.1, and Docker Engine 19.03. Communication works on that cluster with the firewalls running and the policy applied. I've compared settings between the two clusters and have not found a difference in configuration or firewall setup.
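If it would help, I can also collect iptables state from the target node. I assume something like the following would show whether the SYNs are being dropped by Calico's chains or by firewalld's rules:
iptables -nvL FORWARD --line-numbers           # per-rule packet counters
iptables-save -c | grep -E 'cali-tw|cali-fw'   # Calico to/from-workload chains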
I have run diagnostics using calicoctl, but I don't know what to look for in the diagnostic files. I can make them available. Any advice, ideas, or support would be greatly appreciated. Thank you.