DNS not working inside pod when applying iptables to Kubernetes with Weave #3584
Description
What you expected to happen?
After applying my iptables policy on the host OS of every node (both master and worker) in a Kubernetes cluster with Weave, I expected DNS to keep working inside pods. Instead, everything runs fine for about half a day, and then pods get "Name or service not known", with no errors visible in kube-dns, kube-proxy, or weave.
What happened?
About half a day after applying my iptables policy, I can NOT even ping www.google.com from inside a pod. Pinging directly by IP, however, always works.
How to reproduce it?
Since I'd like to isolate my production Kubernetes cluster (no cloud provider) from the outside, I applied the iptables policy below (note in particular the deny-all rules at the end):
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
# allow connection within the k8s cluster
-A INPUT -p all -s <all of master/worker nodes in cluster> -j ACCEPT
-A FORWARD -p all -s <all of master/worker nodes in cluster> -j ACCEPT
# allow all request from/to api server
-A INPUT -p all -s 10.96.0.1 -j ACCEPT
-A FORWARD -p all -s 10.96.0.1 -j ACCEPT
-A INPUT -p all -d 10.96.0.1 -j ACCEPT
-A FORWARD -p all -d 10.96.0.1 -j ACCEPT
# allow all request from/to inner pod
-A INPUT -p all -s 10.244.0.0/16 -j ACCEPT
-A FORWARD -p all -s 10.244.0.0/16 -j ACCEPT
-A INPUT -p all -d 10.244.0.0/16 -j ACCEPT
-A FORWARD -p all -d 10.244.0.0/16 -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
# !! deny everything else here, as I don't want anyone to access my env
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
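I load the policy with iptables-restore; a minimal sketch (the file path is my own convention, and the listing commands are only there to verify the result):

```sh
# Load the saved policy atomically (path is where I keep my rules file)
iptables-restore < /etc/sysconfig/iptables

# Verify the rules and watch the packet counters on the deny-all rules
iptables -L INPUT -v -n --line-numbers
iptables -L FORWARD -v -n --line-numbers
```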
The cluster works quite well for about half a day, but then, suddenly, I find:
- In my pod, I can NO LONGER ping www.google.com (even pinging the host VM's hostname fails); it returns "Name or service not known".
- In my pod, pinging Google by its IP always works.
- In my pod, `telnet 10.96.0.10 53` (the kube-dns service IP) still works, which suggests kube-dns itself is still reachable.
- On the host VM, pinging any site always works.
- I can NOT find any error in the kube-dns, kube-proxy, or weave logs, and none of them ever restart (see the diagnostic commands right after this list).
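For reference, here is roughly how I test when the problem shows up (a sketch; `<pod-name>` is a placeholder for one of my affected pods):

```sh
# Query kube-dns directly at its ClusterIP, bypassing /etc/resolv.conf defaults
kubectl exec -it <pod-name> -- nslookup www.google.com 10.96.0.10

# Same query through the pod's configured resolver
kubectl exec -it <pod-name> -- nslookup www.google.com

# On the node: see whether packets are piling up on the deny-all rules
iptables -L FORWARD -v -n | grep -i reject
```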
Anything else we need to know?
- I found a workaround: after deleting (and letting Kubernetes recreate) the kube-dns and weave pods, connectivity inside pods works again, but the issue reproduces several hours later (commands sketched below).
- I know Kubernetes has NetworkPolicy to control access, but I can't find any sample that denies everything (including core ports like 22 or 443); can you point out my misunderstanding, if I have one? (See the NetworkPolicy sketch below.)
- Oddly, sometimes ping www.google.com resolves to the correct IP but still gets no response.
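The workaround, spelled out as commands (a sketch; the label selectors match the default kube-dns Deployment and weave-net DaemonSet, but may differ in other setups):

```sh
# Delete kube-dns pods; the Deployment recreates them
kubectl -n kube-system delete pod -l k8s-app=kube-dns

# Delete weave pods; the DaemonSet recreates them
kubectl -n kube-system delete pod -l name=weave-net

# After the pods come back, DNS inside pods works again (for a few hours)
```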
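Regarding NetworkPolicy: the closest I found is the default deny-all pattern below (a sketch for a single namespace; as far as I understand, NetworkPolicy only governs pod traffic, so it cannot block host-level ports like 22 on the nodes, which is why I fell back to iptables):

```sh
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: default
spec:
  podSelector: {}        # empty selector = all pods in the namespace
  policyTypes:
  - Ingress
  - Egress
EOF
```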
Versions:
$ weave version
2.4.1
$ docker version
1.12.6
$ uname -a
centos 7.5
$ kubectl version
1.9.5