From 6b432e7dde828076ba1a6044d426ba93c8be229f Mon Sep 17 00:00:00 2001
From: Iryna Shustava
Date: Fri, 22 Jan 2021 16:31:37 -0800
Subject: [PATCH] docs: Add k8s troubleshooting docs (hostPort vs hostNetwork) (#9464)

---
 .../docs/troubleshoot/common-errors.mdx | 51 +++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/website/content/docs/troubleshoot/common-errors.mdx b/website/content/docs/troubleshoot/common-errors.mdx
index 48a8114ea..eefe36a9c 100644
--- a/website/content/docs/troubleshoot/common-errors.mdx
+++ b/website/content/docs/troubleshoot/common-errors.mdx
@@ -10,6 +10,8 @@ When installing and running Consul, there are some common messages you might see
 
 If you are getting an error message you don't see listed on this page, please consider following our general [Troubleshooting Guide][troubleshooting].
 
+For common error messages related to Kubernetes, see [Common errors on Kubernetes](#common-errors-on-kubernetes).
+
 ## Configuration file errors
 
 ### Multiple network interfaces
@@ -147,6 +149,55 @@ You have installed an Enterprise version of Consul. If you are an Enterprise cus
 
 -> **Note:** Enterprise binaries can be identified on our [download site][releases] by the `+ent` suffix.
 
+## Common errors on Kubernetes
+
+### Unable to connect to the Consul client on the same host
+
+If your pods are unable to connect to a Consul client running on the same host,
+first check whether the Consul clients are up and running with `kubectl get pods`.
+
+```shell-session
+$ kubectl get pods -l "component=client"
+NAME           READY   STATUS    RESTARTS   AGE
+consul-kzws6   1/1     Running   0          58s
+```
+
+If the clients are running but you are still unable to connect
+and see `i/o timeout` or `connection refused` errors when connecting to the Consul client on the Kubernetes worker,
+the cause may be that your CNI (Container Network Interface) plugin
+does not [support](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#support-hostport)
+the use of `hostPort`.
+
+```text
+Put http://10.0.0.10:8500/v1/catalog/register: dial tcp 10.0.0.10:8500: connect: connection refused
+```
+
+```text
+Put http://10.0.0.10:8500/v1/agent/service/register: dial tcp 10.0.0.10:8500: connect: connection refused
+```
+
+```text
+Get http://10.0.0.10:8500/v1/status/leader: dial tcp 10.0.0.10:8500: i/o timeout
+```
+
+The IP `10.0.0.10` in the errors above is the IP of the host where the Consul client pod is running.
+
+To work around this issue,
+enable [`hostNetwork`](/docs/k8s/helm#v-client-hostnetwork) in your Helm values.
+With the host network enabled, the client pod uses the host's network namespace directly,
+so the CNI plugin no longer needs to support port mappings between containers and the host.
+
+```yaml
+client:
+  hostNetwork: true
+  dnsPolicy: ClusterFirstWithHostNet
+```
+
+-> **Note:** Using the host network has security implications,
+because it gives the Consul client unnecessary access to all network traffic on the host.
+We recommend raising an issue with the maintainers of your CNI plugin to add support for `hostPort`
+and switching back to `hostPort` once it is supported.
+
 [troubleshooting]: https://learn.hashicorp.com/consul/day-2-operations/advanced-operations/troubleshooting
 [node_name]: /docs/agent/options#node_name
 [retry_join]: /docs/agent/options#retry-join