open-nomad/client
Tim Gross bb4880ec13
client: use RPC address and not serf after initial Consul discovery (#16217)
Nomad servers can advertise independent IP addresses for `serf` and
`rpc`. Somewhat unexpectedly, the `serf` address is used for both Serf and
server-to-server RPC communication (including Raft RPC). The address advertised
for `rpc` is only used for client-to-server RPC. This split was introduced
intentionally in Nomad 0.8.

When clients are using Consul discovery for connecting to servers, they get an
initial discovery set from Consul and use the correct `rpc` tag in Consul to get
a list of addresses for servers. The client then makes a `Status.Peers` RPC to
get the list of those servers that are Raft peers. But this endpoint is shared
between servers and clients, and provides the address used for Raft.

Most of the time this is harmless because servers will bind on 0.0.0.0 anyway.
But in topologies where servers are on a private network and clients are on
separate subnets (or even public subnets), clients will make initial contact
with the server to get the list of peers but then populate their local server
set with unreachable addresses.

Cluster administrators can work around this problem by using `server_join` with
specific IP addresses (or DNS names), because the `Node.UpdateStatus` endpoint
returns the correct set of RPC addresses when updating the node. So once a
client has registered, it will get the correct set of RPC addresses.

This changeset updates the client logic to query `Status.Members` instead of
`Status.Peers`, and then extract the correctly advertised address and port from
the response body.
2023-03-02 13:36:45 -05:00
..
allocdir
allochealth
allocrunner populate Nomad token for task runner update hooks (#16266) 2023-02-27 10:48:13 -05:00
allocwatcher
config artifact: protect against unbounded artifact decompression (1.5.0) (#16151) 2023-02-14 09:28:39 -06:00
consul
devicemanager
dynamicplugins
fingerprint cni: handle multi-path cni_path when fingerprinting plugins (#16163) 2023-02-13 14:55:56 -06:00
interfaces
lib cgutil: handle panic from runc helper method (#16180) 2023-02-14 15:09:43 -06:00
logmon
pluginmanager
servers
serviceregistration services: Set Nomad's User-Agent by default on HTTP checks for nomad services (#16248) 2023-02-23 08:10:42 -06:00
state Dynamic Node Metadata (#15844) 2023-02-07 14:42:25 -08:00
stats
structs
taskenv
testutil
vaultclient
acl.go Accept Workload Identities for Client RPCs (#16254) 2023-02-27 10:17:47 -08:00
acl_test.go Accept Workload Identities for Client RPCs (#16254) 2023-02-27 10:17:47 -08:00
agent_endpoint.go
agent_endpoint_test.go
alloc_endpoint.go Accept Workload Identities for Client RPCs (#16254) 2023-02-27 10:17:47 -08:00
alloc_endpoint_test.go
alloc_watcher_e2e_test.go
client.go client: use RPC address and not serf after initial Consul discovery (#16217) 2023-03-02 13:36:45 -05:00
client_stats_endpoint.go
client_stats_endpoint_test.go
client_test.go
csi_endpoint.go
csi_endpoint_test.go
driver_manager_test.go
enterprise_client_oss.go
fingerprint_manager.go
fingerprint_manager_test.go
fs_endpoint.go
fs_endpoint_test.go
gc.go
gc_test.go
heartbeatstop.go
heartbeatstop_test.go
meta_endpoint.go Dynamic Node Metadata (#15844) 2023-02-07 14:42:25 -08:00
meta_endpoint_test.go Dynamic Node Metadata (#15844) 2023-02-07 14:42:25 -08:00
node_updater.go
rpc.go Dynamic Node Metadata (#15844) 2023-02-07 14:42:25 -08:00
rpc_test.go
testing.go
util.go