Kubernetes using KVM instances on OpenStack via KubeAdm

I have successfully deployed a "working" Kubernetes cluster, using the Horizon interface to create the Linux instances:

[screenshot of the Horizon instance list omitted]

Having configured the hosts according to https://kubernetes.io/docs/setup/independent/high-availability/, I can now say I have a Kubernetes cluster:



$ kubectl get nodes
NAME               STATUS   ROLES    AGE   VERSION
kube-apiserver-1   Ready    master   1d    v1.12.2
kube-apiserver-2   Ready    master   1d    v1.12.2
kube-apiserver-3   Ready    master   1d    v1.12.2
kube-node-1        Ready    <none>   21h   v1.12.2
kube-node-2        Ready    <none>   21h   v1.12.2
kube-node-3        Ready    <none>   21h   v1.12.2
kube-node-4        Ready    <none>   21h   v1.12.2


However, getting beyond this point has proven to be quite a struggle. I cannot create usable services, and CoreDNS, which is an essential component, is unusable:



$ kubectl -n kube-system get pods
NAME                                       READY   STATUS             RESTARTS   AGE
coredns-576cbf47c7-4gdnc                   0/1     CrashLoopBackOff   288        23h
coredns-576cbf47c7-x4h4v                   0/1     CrashLoopBackOff   288        23h
kube-apiserver-kube-apiserver-1            1/1     Running            0          1d
kube-apiserver-kube-apiserver-2            1/1     Running            0          1d
kube-apiserver-kube-apiserver-3            1/1     Running            0          1d
kube-controller-manager-kube-apiserver-1   1/1     Running            3          1d
kube-controller-manager-kube-apiserver-2   1/1     Running            1          1d
kube-controller-manager-kube-apiserver-3   1/1     Running            0          1d
kube-flannel-ds-amd64-2zdtd                1/1     Running            0          20h
kube-flannel-ds-amd64-7l5mr                1/1     Running            0          20h
kube-flannel-ds-amd64-bmvs9                1/1     Running            0          1d
kube-flannel-ds-amd64-cmhkg                1/1     Running            0          1d
...


Errors in the CoreDNS pod logs indicate that it cannot reach the kubernetes service:



$ kubectl -n kube-system logs coredns-576cbf47c7-4gdnc
E1121 18:04:48.928055 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928688 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928917 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.929869 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.930819 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.931517 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932159 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932722 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.933179 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/21 18:06:07 [INFO] SIGTERM: Shutting down servers then terminating
E1121 18:06:21.933058 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.934010 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.935107 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout


$ kubectl -n kube-system describe pod/coredns-576cbf47c7-dk7sh
...
Events:
  Type     Reason     Age                 From                   Message
  ----     ------     ----                ----                   -------
  Normal   Scheduled  25m                 default-scheduler      Successfully assigned kube-system/coredns-576cbf47c7-dk7sh to kube-node-3
  Normal   Pulling    25m                 kubelet, kube-node-3   pulling image "k8s.gcr.io/coredns:1.2.2"
  Normal   Pulled     25m                 kubelet, kube-node-3   Successfully pulled image "k8s.gcr.io/coredns:1.2.2"
  Normal   Created    20m (x3 over 25m)   kubelet, kube-node-3   Created container
  Normal   Killing    20m (x2 over 22m)   kubelet, kube-node-3   Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Pulled     20m (x2 over 22m)   kubelet, kube-node-3   Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
  Normal   Started    20m (x3 over 25m)   kubelet, kube-node-3   Started container
  Warning  Unhealthy  4m (x36 over 24m)   kubelet, kube-node-3   Liveness probe failed: HTTP probe failed with statuscode: 503
  Warning  BackOff    17s (x22 over 8m)   kubelet, kube-node-3   Back-off restarting failed container
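For reference, this is the kind of check that applies to that symptom. Assuming kube-proxy is running in its default iptables mode (which is what kubeadm sets up), it programs DNAT rules for every ClusterIP on each node, so the service VIP can normally be probed from a node directly:

$ curl -k https://10.96.0.1:443
# any HTTPS response at all (even 401/403) means the VIP is being translated to an apiserver

$ sudo iptables -t nat -S KUBE-SERVICES | grep 10.96.0.1
# shows the NAT rule kube-proxy should have installed for the kubernetes ClusterIP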


The kubernetes service is there and seems to be properly autoconfigured:



$ kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   23h


$ kubectl describe svc/kubernetes
Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP:                10.96.0.1
Port:              https  443/TCP
TargetPort:        6443/TCP
Endpoints:         192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443
Session Affinity:  None
Events:            <none>


$ kubectl get endpoints
NAME         ENDPOINTS                                               AGE
kubernetes   192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443   23h


I have a nagging suspicion that I am missing something in the network layer and that this issue has something to do with Neutron. There are plenty of HOWTOs on installing Kubernetes with other tools, and on installing it on top of OpenStack, but I have yet to find one guide that covers creating the KVM instances through the Horizon interface and dealing with the security-group and network issues. For the record, ALL IPv4/TCP ports are open between the masters and the nodes.
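For context, "all IPv4/TCP ports open" means a self-referencing rule roughly like the first command below, where the group name k8s is only an example. The second command shows the extra rule flannel's VXLAN backend would additionally need, since it encapsulates pod-to-pod traffic in UDP port 8472, which TCP-only rules do not cover:

$ openstack security group rule create --ingress --protocol tcp --dst-port 1:65535 --remote-group k8s k8s
$ openstack security group rule create --ingress --protocol udp --dst-port 8472 --remote-group k8s k8s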



Is there anyone out there with a guide that explains this scenario?

kubernetes kubeadm openstack-neutron openstack-horizon

asked Nov 21 '18 at 19:37 – Daniel Maldonado (edited Nov 21 '18 at 20:14 by Rico)
  • Can you curl -k https://10.96.0.1:443 successfully from any of the nodes?
    – Rico, Nov 21 '18 at 20:19

  • No, but remember, the node/host network is 192.168.5.x and does not know how to get to 10.96.0.x. The node network and the Kubernetes pod network are different beasts.
    – Daniel Maldonado, Nov 21 '18 at 20:43

  • What's your CNI PodCidr?
    – Rico, Nov 21 '18 at 20:58

  • Flannel from: raw.githubusercontent.com/coreos/flannel/… The pod CIDR is "10.253.0.0/16".
    – Daniel Maldonado, Nov 21 '18 at 21:09

  • OK, that looks good. Can you ping from one pod to another? Create two dummy pods and ping their 10.253.0.0/16 addresses from each other (you should be able to).
    – Rico, Nov 21 '18 at 21:10
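A minimal version of the pod-to-pod test Rico describes, with throwaway pod names and a busybox image chosen purely as an example:

$ kubectl run ping-a --image=busybox:1.28 --restart=Never -- sleep 3600
$ kubectl run ping-b --image=busybox:1.28 --restart=Never -- sleep 3600
$ kubectl get pods -o wide                      # note each pod's 10.253.x.x address
$ kubectl exec ping-a -- ping -c 3 <address of ping-b>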
1 Answer
The issue here was a polluted etcd cluster. As soon as I rebuilt the EXTERNAL etcd cluster and started from scratch using these instructions: https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd, everything worked as expected. There does not seem to be a tool available to reset the etcd entries for a flannel pod network.
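For anyone following the same path, the linked instructions come down to pointing kubeadm on each control-plane node at the rebuilt external etcd cluster. A minimal sketch of that configuration, assuming the v1alpha3 config API that kubeadm 1.12 expects; every address, hostname and certificate path here is a placeholder for your own values:

# kubeadm-config.yaml (placeholder values)
apiVersion: kubeadm.k8s.io/v1alpha3
kind: ClusterConfiguration
kubernetesVersion: v1.12.2
controlPlaneEndpoint: "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT"
etcd:
  external:
    endpoints:
    - https://ETCD_0_IP:2379
    - https://ETCD_1_IP:2379
    - https://ETCD_2_IP:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
networking:
  podSubnet: "10.253.0.0/16"

$ kubeadm init --config kubeadm-config.yaml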






answered Nov 22 '18 at 22:56 – Daniel Maldonado