Kubernetes using KVM instances on OpenStack via KubeAdm
I have successfully deployed a "working" Kubernetes cluster using the Horizon interface to create the Linux instances:
Having configured the hosts according to: https://kubernetes.io/docs/setup/independent/high-availability/
I can now say I have a Kubernetes cluster:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-apiserver-1 Ready master 1d v1.12.2
kube-apiserver-2 Ready master 1d v1.12.2
kube-apiserver-3 Ready master 1d v1.12.2
kube-node-1 Ready <none> 21h v1.12.2
kube-node-2 Ready <none> 21h v1.12.2
kube-node-3 Ready <none> 21h v1.12.2
kube-node-4 Ready <none> 21h v1.12.2
However, getting beyond this point has proven to be quite a struggle. I can not create usable services and coredns which is an essential component seems unusable:
$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-4gdnc 0/1 CrashLoopBackOff 288 23h
coredns-576cbf47c7-x4h4v 0/1 CrashLoopBackOff 288 23h
kube-apiserver-kube-apiserver-1 1/1 Running 0 1d
kube-apiserver-kube-apiserver-2 1/1 Running 0 1d
kube-apiserver-kube-apiserver-3 1/1 Running 0 1d
kube-controller-manager-kube-apiserver-1 1/1 Running 3 1d
kube-controller-manager-kube-apiserver-2 1/1 Running 1 1d
kube-controller-manager-kube-apiserver-3 1/1 Running 0 1d
kube-flannel-ds-amd64-2zdtd 1/1 Running 0 20h
kube-flannel-ds-amd64-7l5mr 1/1 Running 0 20h
kube-flannel-ds-amd64-bmvs9 1/1 Running 0 1d
kube-flannel-ds-amd64-cmhkg 1/1 Running 0 1d
...
Errors in the pod indicate that it cannot reach the kubernetes service:
$ kubectl -n kube-system logs coredns-576cbf47c7-4gdnc
E1121 18:04:48.928055 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928688 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928917 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.929869 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.930819 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.931517 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932159 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932722 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.933179 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/21 18:06:07 [INFO] SIGTERM: Shutting down servers then terminating
E1121 18:06:21.933058 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.934010 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.935107 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
$ kubectl -n kube-system describe pod/coredns-576cbf47c7-dk7sh
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-dk7sh to kube-node-3
Normal Pulling 25m kubelet, kube-node-3 pulling image "k8s.gcr.io/coredns:1.2.2"
Normal Pulled 25m kubelet, kube-node-3 Successfully pulled image "k8s.gcr.io/coredns:1.2.2"
Normal Created 20m (x3 over 25m) kubelet, kube-node-3 Created container
Normal Killing 20m (x2 over 22m) kubelet, kube-node-3 Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Normal Pulled 20m (x2 over 22m) kubelet, kube-node-3 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Started 20m (x3 over 25m) kubelet, kube-node-3 Started container
Warning Unhealthy 4m (x36 over 24m) kubelet, kube-node-3 Liveness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 17s (x22 over 8m) kubelet, kube-node-3 Back-off restarting failed container
The kubernetes service is there and seems to be properly autoconfigured:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23h
$ kubectl describe svc/kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.96.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443
Session Affinity: None
Events: <none>
$ kubectl get endpoints
NAME ENDPOINTS AGE
kubernetes 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443 23h
I have a nagging suspicion that I am missing something in the network layer and that this issue has something to do with Neutron. There are plenty of HOWTOs on how to install Kubernetes using other tools and how to install it in OpenStack but I have yet to find one guide that explains how to install it by creating KVMs using the Horizon interface and dealing with security groups and network issues. By the way, ALL IPv4/TCP ports are open between the Masters and Nodes.
Is there anyone out there with a guide that explains this scenario?
kubernetes kubeadm openstack-neutron openstack-horizon
|
show 6 more comments
I have successfully deployed a "working" Kubernetes cluster using the Horizon interface to create the Linux instances:
Having configured the hosts according to: https://kubernetes.io/docs/setup/independent/high-availability/
I can now say I have a Kubernetes cluster:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-apiserver-1 Ready master 1d v1.12.2
kube-apiserver-2 Ready master 1d v1.12.2
kube-apiserver-3 Ready master 1d v1.12.2
kube-node-1 Ready <none> 21h v1.12.2
kube-node-2 Ready <none> 21h v1.12.2
kube-node-3 Ready <none> 21h v1.12.2
kube-node-4 Ready <none> 21h v1.12.2
However, getting beyond this point has proven to be quite a struggle. I can not create usable services and coredns which is an essential component seems unusable:
$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-4gdnc 0/1 CrashLoopBackOff 288 23h
coredns-576cbf47c7-x4h4v 0/1 CrashLoopBackOff 288 23h
kube-apiserver-kube-apiserver-1 1/1 Running 0 1d
kube-apiserver-kube-apiserver-2 1/1 Running 0 1d
kube-apiserver-kube-apiserver-3 1/1 Running 0 1d
kube-controller-manager-kube-apiserver-1 1/1 Running 3 1d
kube-controller-manager-kube-apiserver-2 1/1 Running 1 1d
kube-controller-manager-kube-apiserver-3 1/1 Running 0 1d
kube-flannel-ds-amd64-2zdtd 1/1 Running 0 20h
kube-flannel-ds-amd64-7l5mr 1/1 Running 0 20h
kube-flannel-ds-amd64-bmvs9 1/1 Running 0 1d
kube-flannel-ds-amd64-cmhkg 1/1 Running 0 1d
...
Errors in the pod indicate that it cannot reach the kubernetes service:
$ kubectl -n kube-system logs coredns-576cbf47c7-4gdnc
E1121 18:04:48.928055 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928688 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928917 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.929869 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.930819 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.931517 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932159 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932722 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.933179 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/21 18:06:07 [INFO] SIGTERM: Shutting down servers then terminating
E1121 18:06:21.933058 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.934010 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.935107 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
$ kubectl -n kube-system describe pod/coredns-576cbf47c7-dk7sh
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-dk7sh to kube-node-3
Normal Pulling 25m kubelet, kube-node-3 pulling image "k8s.gcr.io/coredns:1.2.2"
Normal Pulled 25m kubelet, kube-node-3 Successfully pulled image "k8s.gcr.io/coredns:1.2.2"
Normal Created 20m (x3 over 25m) kubelet, kube-node-3 Created container
Normal Killing 20m (x2 over 22m) kubelet, kube-node-3 Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Normal Pulled 20m (x2 over 22m) kubelet, kube-node-3 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Started 20m (x3 over 25m) kubelet, kube-node-3 Started container
Warning Unhealthy 4m (x36 over 24m) kubelet, kube-node-3 Liveness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 17s (x22 over 8m) kubelet, kube-node-3 Back-off restarting failed container
The kubernetes service is there and seems to be properly autoconfigured:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23h
$ kubectl describe svc/kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.96.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443
Session Affinity: None
Events: <none>
$ kubectl get endpoints
NAME ENDPOINTS AGE
kubernetes 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443 23h
I have a nagging suspicion that I am missing something in the network layer and that this issue has something to do with Neutron. There are plenty of HOWTOs on how to install Kubernetes using other tools and how to install it in OpenStack but I have yet to find one guide that explains how to install it by creating KVMs using the Horizon interface and dealing with security groups and network issues. By the way, ALL IPv4/TCP ports are open between the Masters and Nodes.
Is there anyone out there with a guide that explains this scenario?
kubernetes kubeadm openstack-neutron openstack-horizon
Can youcurl -k https://10.96.0.1:443
successfully from any of the nodes?
– Rico
Nov 21 '18 at 20:19
No, but remember, the node/host network is 192.168.5.x and does not know how to get to 10.96.0.x. The node network and the Kubernetes Pod network are different beasts.
– Daniel Maldonado
Nov 21 '18 at 20:43
What's your CNI PodCidr?
– Rico
Nov 21 '18 at 20:58
Flannel from: raw.githubusercontent.com/coreos/flannel/… The pod cidr is: "10.253.0.0/16"
– Daniel Maldonado
Nov 21 '18 at 21:09
ok, looks good. can you talk ping from one pod to another? create 2 dummy pods and ping their10.253.0.0
address from each other. (you should be able)
– Rico
Nov 21 '18 at 21:10
|
show 6 more comments
I have successfully deployed a "working" Kubernetes cluster using the Horizon interface to create the Linux instances:
Having configured the hosts according to: https://kubernetes.io/docs/setup/independent/high-availability/
I can now say I have a Kubernetes cluster:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-apiserver-1 Ready master 1d v1.12.2
kube-apiserver-2 Ready master 1d v1.12.2
kube-apiserver-3 Ready master 1d v1.12.2
kube-node-1 Ready <none> 21h v1.12.2
kube-node-2 Ready <none> 21h v1.12.2
kube-node-3 Ready <none> 21h v1.12.2
kube-node-4 Ready <none> 21h v1.12.2
However, getting beyond this point has proven to be quite a struggle. I can not create usable services and coredns which is an essential component seems unusable:
$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-4gdnc 0/1 CrashLoopBackOff 288 23h
coredns-576cbf47c7-x4h4v 0/1 CrashLoopBackOff 288 23h
kube-apiserver-kube-apiserver-1 1/1 Running 0 1d
kube-apiserver-kube-apiserver-2 1/1 Running 0 1d
kube-apiserver-kube-apiserver-3 1/1 Running 0 1d
kube-controller-manager-kube-apiserver-1 1/1 Running 3 1d
kube-controller-manager-kube-apiserver-2 1/1 Running 1 1d
kube-controller-manager-kube-apiserver-3 1/1 Running 0 1d
kube-flannel-ds-amd64-2zdtd 1/1 Running 0 20h
kube-flannel-ds-amd64-7l5mr 1/1 Running 0 20h
kube-flannel-ds-amd64-bmvs9 1/1 Running 0 1d
kube-flannel-ds-amd64-cmhkg 1/1 Running 0 1d
...
Errors in the pod indicate that it cannot reach the kubernetes service:
$ kubectl -n kube-system logs coredns-576cbf47c7-4gdnc
E1121 18:04:48.928055 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928688 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928917 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.929869 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.930819 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.931517 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932159 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932722 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.933179 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/21 18:06:07 [INFO] SIGTERM: Shutting down servers then terminating
E1121 18:06:21.933058 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.934010 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.935107 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
$ kubectl -n kube-system describe pod/coredns-576cbf47c7-dk7sh
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-dk7sh to kube-node-3
Normal Pulling 25m kubelet, kube-node-3 pulling image "k8s.gcr.io/coredns:1.2.2"
Normal Pulled 25m kubelet, kube-node-3 Successfully pulled image "k8s.gcr.io/coredns:1.2.2"
Normal Created 20m (x3 over 25m) kubelet, kube-node-3 Created container
Normal Killing 20m (x2 over 22m) kubelet, kube-node-3 Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Normal Pulled 20m (x2 over 22m) kubelet, kube-node-3 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Started 20m (x3 over 25m) kubelet, kube-node-3 Started container
Warning Unhealthy 4m (x36 over 24m) kubelet, kube-node-3 Liveness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 17s (x22 over 8m) kubelet, kube-node-3 Back-off restarting failed container
The kubernetes service is there and seems to be properly autoconfigured:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23h
$ kubectl describe svc/kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.96.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443
Session Affinity: None
Events: <none>
$ kubectl get endpoints
NAME ENDPOINTS AGE
kubernetes 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443 23h
I have a nagging suspicion that I am missing something in the network layer and that this issue has something to do with Neutron. There are plenty of HOWTOs on how to install Kubernetes using other tools and how to install it in OpenStack but I have yet to find one guide that explains how to install it by creating KVMs using the Horizon interface and dealing with security groups and network issues. By the way, ALL IPv4/TCP ports are open between the Masters and Nodes.
Is there anyone out there with a guide that explains this scenario?
kubernetes kubeadm openstack-neutron openstack-horizon
I have successfully deployed a "working" Kubernetes cluster using the Horizon interface to create the Linux instances:
Having configured the hosts according to: https://kubernetes.io/docs/setup/independent/high-availability/
I can now say I have a Kubernetes cluster:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-apiserver-1 Ready master 1d v1.12.2
kube-apiserver-2 Ready master 1d v1.12.2
kube-apiserver-3 Ready master 1d v1.12.2
kube-node-1 Ready <none> 21h v1.12.2
kube-node-2 Ready <none> 21h v1.12.2
kube-node-3 Ready <none> 21h v1.12.2
kube-node-4 Ready <none> 21h v1.12.2
However, getting beyond this point has proven to be quite a struggle. I can not create usable services and coredns which is an essential component seems unusable:
$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-4gdnc 0/1 CrashLoopBackOff 288 23h
coredns-576cbf47c7-x4h4v 0/1 CrashLoopBackOff 288 23h
kube-apiserver-kube-apiserver-1 1/1 Running 0 1d
kube-apiserver-kube-apiserver-2 1/1 Running 0 1d
kube-apiserver-kube-apiserver-3 1/1 Running 0 1d
kube-controller-manager-kube-apiserver-1 1/1 Running 3 1d
kube-controller-manager-kube-apiserver-2 1/1 Running 1 1d
kube-controller-manager-kube-apiserver-3 1/1 Running 0 1d
kube-flannel-ds-amd64-2zdtd 1/1 Running 0 20h
kube-flannel-ds-amd64-7l5mr 1/1 Running 0 20h
kube-flannel-ds-amd64-bmvs9 1/1 Running 0 1d
kube-flannel-ds-amd64-cmhkg 1/1 Running 0 1d
...
Errors in the pod indicate that it cannot reach the kubernetes service:
$ kubectl -n kube-system logs coredns-576cbf47c7-4gdnc
E1121 18:04:48.928055 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928688 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928917 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.929869 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.930819 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.931517 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932159 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932722 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.933179 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/21 18:06:07 [INFO] SIGTERM: Shutting down servers then terminating
E1121 18:06:21.933058 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.934010 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.935107 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
$ kubectl -n kube-system describe pod/coredns-576cbf47c7-dk7sh
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-dk7sh to kube-node-3
Normal Pulling 25m kubelet, kube-node-3 pulling image "k8s.gcr.io/coredns:1.2.2"
Normal Pulled 25m kubelet, kube-node-3 Successfully pulled image "k8s.gcr.io/coredns:1.2.2"
Normal Created 20m (x3 over 25m) kubelet, kube-node-3 Created container
Normal Killing 20m (x2 over 22m) kubelet, kube-node-3 Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Normal Pulled 20m (x2 over 22m) kubelet, kube-node-3 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Started 20m (x3 over 25m) kubelet, kube-node-3 Started container
Warning Unhealthy 4m (x36 over 24m) kubelet, kube-node-3 Liveness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 17s (x22 over 8m) kubelet, kube-node-3 Back-off restarting failed container
The kubernetes service is there and seems to be properly autoconfigured:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23h
$ kubectl describe svc/kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.96.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443
Session Affinity: None
Events: <none>
$ kubectl get endpoints
NAME ENDPOINTS AGE
kubernetes 192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443 23h
I have a nagging suspicion that I am missing something in the network layer and that this issue has something to do with Neutron. There are plenty of HOWTOs on how to install Kubernetes using other tools and how to install it in OpenStack but I have yet to find one guide that explains how to install it by creating KVMs using the Horizon interface and dealing with security groups and network issues. By the way, ALL IPv4/TCP ports are open between the Masters and Nodes.
Is there anyone out there with a guide that explains this scenario?
kubernetes kubeadm openstack-neutron openstack-horizon
kubernetes kubeadm openstack-neutron openstack-horizon
edited Nov 21 '18 at 20:14
Rico
26.8k94865
26.8k94865
asked Nov 21 '18 at 19:37
Daniel MaldonadoDaniel Maldonado
12319
12319
Can youcurl -k https://10.96.0.1:443
successfully from any of the nodes?
– Rico
Nov 21 '18 at 20:19
No, but remember, the node/host network is 192.168.5.x and does not know how to get to 10.96.0.x. The node network and the Kubernetes Pod network are different beasts.
– Daniel Maldonado
Nov 21 '18 at 20:43
What's your CNI PodCidr?
– Rico
Nov 21 '18 at 20:58
Flannel from: raw.githubusercontent.com/coreos/flannel/… The pod cidr is: "10.253.0.0/16"
– Daniel Maldonado
Nov 21 '18 at 21:09
ok, looks good. can you talk ping from one pod to another? create 2 dummy pods and ping their10.253.0.0
address from each other. (you should be able)
– Rico
Nov 21 '18 at 21:10
|
show 6 more comments
Can youcurl -k https://10.96.0.1:443
successfully from any of the nodes?
– Rico
Nov 21 '18 at 20:19
No, but remember, the node/host network is 192.168.5.x and does not know how to get to 10.96.0.x. The node network and the Kubernetes Pod network are different beasts.
– Daniel Maldonado
Nov 21 '18 at 20:43
What's your CNI PodCidr?
– Rico
Nov 21 '18 at 20:58
Flannel from: raw.githubusercontent.com/coreos/flannel/… The pod cidr is: "10.253.0.0/16"
– Daniel Maldonado
Nov 21 '18 at 21:09
ok, looks good. can you talk ping from one pod to another? create 2 dummy pods and ping their10.253.0.0
address from each other. (you should be able)
– Rico
Nov 21 '18 at 21:10
Can you
curl -k https://10.96.0.1:443
successfully from any of the nodes?– Rico
Nov 21 '18 at 20:19
Can you
curl -k https://10.96.0.1:443
successfully from any of the nodes?– Rico
Nov 21 '18 at 20:19
No, but remember, the node/host network is 192.168.5.x and does not know how to get to 10.96.0.x. The node network and the Kubernetes Pod network are different beasts.
– Daniel Maldonado
Nov 21 '18 at 20:43
No, but remember, the node/host network is 192.168.5.x and does not know how to get to 10.96.0.x. The node network and the Kubernetes Pod network are different beasts.
– Daniel Maldonado
Nov 21 '18 at 20:43
What's your CNI PodCidr?
– Rico
Nov 21 '18 at 20:58
What's your CNI PodCidr?
– Rico
Nov 21 '18 at 20:58
Flannel from: raw.githubusercontent.com/coreos/flannel/… The pod cidr is: "10.253.0.0/16"
– Daniel Maldonado
Nov 21 '18 at 21:09
Flannel from: raw.githubusercontent.com/coreos/flannel/… The pod cidr is: "10.253.0.0/16"
– Daniel Maldonado
Nov 21 '18 at 21:09
ok, looks good. can you talk ping from one pod to another? create 2 dummy pods and ping their
10.253.0.0
address from each other. (you should be able)– Rico
Nov 21 '18 at 21:10
ok, looks good. can you talk ping from one pod to another? create 2 dummy pods and ping their
10.253.0.0
address from each other. (you should be able)– Rico
Nov 21 '18 at 21:10
|
show 6 more comments
1 Answer
1
active
oldest
votes
The issue here was a polluted etcd cluster. As soon as I rebuilt the EXTERNAL etcd cluster and started from scratch using these instructions: https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd all items were working as expected. There does not seem to be a tool available to reset the etcd entries for a flannel pod network.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53419398%2fkubernetes-using-kvm-instances-on-openstack-via-kubeadm%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
The issue here was a polluted etcd cluster. As soon as I rebuilt the EXTERNAL etcd cluster and started from scratch using these instructions: https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd all items were working as expected. There does not seem to be a tool available to reset the etcd entries for a flannel pod network.
add a comment |
The issue here was a polluted etcd cluster. As soon as I rebuilt the EXTERNAL etcd cluster and started from scratch using these instructions: https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd all items were working as expected. There does not seem to be a tool available to reset the etcd entries for a flannel pod network.
add a comment |
The issue here was a polluted etcd cluster. As soon as I rebuilt the EXTERNAL etcd cluster and started from scratch using these instructions: https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd all items were working as expected. There does not seem to be a tool available to reset the etcd entries for a flannel pod network.
The issue here was a polluted etcd cluster. As soon as I rebuilt the EXTERNAL etcd cluster and started from scratch using these instructions: https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd all items were working as expected. There does not seem to be a tool available to reset the etcd entries for a flannel pod network.
answered Nov 22 '18 at 22:56
Daniel MaldonadoDaniel Maldonado
12319
12319
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53419398%2fkubernetes-using-kvm-instances-on-openstack-via-kubeadm%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Can you
curl -k https://10.96.0.1:443
successfully from any of the nodes?– Rico
Nov 21 '18 at 20:19
No, but remember, the node/host network is 192.168.5.x and does not know how to get to 10.96.0.x. The node network and the Kubernetes Pod network are different beasts.
– Daniel Maldonado
Nov 21 '18 at 20:43
What's your CNI PodCidr?
– Rico
Nov 21 '18 at 20:58
Flannel from: raw.githubusercontent.com/coreos/flannel/… The pod cidr is: "10.253.0.0/16"
– Daniel Maldonado
Nov 21 '18 at 21:09
ok, looks good. can you talk ping from one pod to another? create 2 dummy pods and ping their
10.253.0.0
address from each other. (you should be able)– Rico
Nov 21 '18 at 21:10