open-nomad

Commit Graph

Author	SHA1	Message	Date
Kris Hicks	0a3a748053	Add gosimple linter (#9590)	2020-12-09 11:05:18 -08:00
Benjamin Buzbee	e0acbbfcc6	Fix RPC retry logic in nomad client's rpc.go for blocking queries (#9266)	2020-11-30 15:11:10 -05:00
Pierre Cauchois	13218dc345	Enforce bounds on MaxQueryTime (#9064) The MaxQueryTime value used in QueryOptions.HasTimedOut() can be set to an invalid value that would throw off how RPC requests are retried. This fix uses the same logic that enforces the MaxQueryTime bounds in the blockingRPC() call.	2020-10-15 08:43:06 -04:00
Mahmood Ali	e37a3312d5	If leadership fails, consider it handled The callers for `forward` and the old implementation expect failures to be accompanied with a true value! This fixes the issue and gets the tests passing!	2020-05-31 22:06:17 -04:00
Mahmood Ali	30ab9c84e5	more review feedback	2020-05-31 21:39:09 -04:00
Mahmood Ali	2108681c1d	Endpoint for snapshotting server state	2020-05-21 20:04:38 -04:00
Yoan Blanc	225c9c1215	fixup! vendor: explicit use of hashicorp/go-msgpack Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-31 09:48:07 -04:00
Yoan Blanc	761d014071	vendor: explicit use of hashicorp/go-msgpack Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-31 09:45:21 -04:00
Lang Martin	334979a754	nomad/rpc: indicate missing region in error message	2020-03-23 13:58:29 -04:00
Mahmood Ali	e106d373b2	rpc: Use MultiplexV2 for connections MultiplexV2 is a new connection multiplex header that supports multiplexing both RPC and streaming requests over the same Yamux connection. MultiplexV2 was added in 0.8.0 as part of https://github.com/hashicorp/nomad/pull/3892. So Nomad 0.11 can expect it to be supported. Though, some more rigorous testing is required before merging this. I want to call out some implementation details: First, the current connection pool reuses the Yamux stream for multiple RPC calls, and doesn't close them until an error is encountered. This commit doesn't change it, and sets the `RpcNomad` byte only at stream creation. Second, the StreamingRPC session gets closed by callers and cannot be reused. Every StreamingRPC opens a new Yamux session.	2020-02-03 19:31:39 -05:00
Michael Schurter	c82b14b0c4	core: add limits to unauthorized connections Introduce limits to prevent unauthorized users from exhausting all ephemeral ports on agents: `{https,rpc}_handshake_timeout` and `{http,rpc}_max_conns_per_client`. The handshake timeout closes connections that have not completed the TLS handshake by the deadline (5s by default). For RPC connections this timeout also separately applies to the first byte being read, so RPC connections with TLS enabled have `rpc_handshake_timeout * 2` as their deadline. The connection limit per client prevents a single remote TCP peer from exhausting all ephemeral ports. The default is 100, but can be lowered to a minimum of 26. Since streaming RPC connections create a new TCP connection (until MultiplexV2 is used), 20 connections are reserved for Raft and non-streaming RPCs to prevent connection exhaustion due to streaming RPCs. All limits are configurable and may be disabled by setting them to `0`. This also includes a fix that closes connections that attempt to create TLS RPC connections recursively. While only users with valid mTLS certificates could perform such an operation, it was added as a safeguard to prevent programming errors before they could cause resource exhaustion.	2020-01-30 10:38:25 -08:00
Drew Bailey	a61bf32314	Allow nomad monitor command to look up server UUID Allows addressing servers with nomad monitor using the server's name or ID. Also unifies logic for addressing servers for client_agent_endpoint commands and makes addressing logic region aware. rpc getServer test	2020-01-29 13:55:29 -05:00
Drew Bailey	4bc68855d0	use intercepting loggers for rpchandlers	2019-11-05 09:51:50 -05:00
Mahmood Ali	d699a70875	Merge pull request #5911 from hashicorp/b-rpc-consistent-reads Block rpc handling until state store is caught up	2019-08-20 09:29:37 -04:00
Mahmood Ali	ad39bcef60	rpc: use tls wrapped connection for streaming rpc This ensures that server-to-server streaming RPC calls use the tls wrapped connections. Prior to this, the `streamingRpcImpl` function used tls for setting the header and invoking the rpc method, but returned an unwrapped tls connection. Thus, streaming writes failed with tls errors. This tls streaming bug existed since 0.8.0[1], but PR #5654[2] exacerbated it in 0.9.2. Prior to PR #5654, nomad client used to shuffle servers at every heartbeat -- `servers.Manager.setServers`[3] always shuffled servers and was called by heartbeat code[4]. Shuffling servers meant that a nomad client would heartbeat and establish a connection against all nomad servers eventually. When handling streaming RPC calls, nomad servers used these local connections to communicate directly to the client. The server-to-server forwarding logic was left mostly unexercised. PR #5654 means that a nomad client may connect to a single server only and caused the server-to-server forward streaming RPC code to get exercised more and unearthed the problem. [1] https://github.com/hashicorp/nomad/blob/v0.8.0/nomad/rpc.go#L501-L515 [2] https://github.com/hashicorp/nomad/pull/5654 [3] https://github.com/hashicorp/nomad/blob/v0.9.1/client/servers/manager.go#L198-L216 [4] https://github.com/hashicorp/nomad/blob/v0.9.1/client/client.go#L1603	2019-07-12 14:41:44 +08:00
Mahmood Ali	ea3a98357f	Block rpc handling until state store is caught up Here, we ensure that the leader only responds to RPC calls when the state store is up to date. At leadership transition or launch with restored state, the server local store might not be caught up with the latest raft logs and may return a stale read. The solution here is to have an RPC consistency read gate, enabled when `establishLeadership` completes before we respond to RPC calls. `establishLeadership` is gated by a `raft.Barrier` which ensures that all prior raft logs have been applied. Conversely, the gate is disabled when leadership is lost. This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files	2019-07-02 16:07:37 +08:00
Chris Baker	121a9eb8cb	some changes for more idiomatic code	2018-12-12 23:11:17 +00:00
Chris Baker	34600f8b75	fixed bug in loop delay	2018-12-12 19:16:41 +00:00
Chris Baker	89c64932c1	gofmt	2018-12-12 19:09:06 +00:00
Chris Baker	22c11d8799	improved code for readability	2018-12-12 18:52:06 +00:00
Chris Baker	59beae35df	nomad/rpc listener: modified to throttle logging on "permanent" Accept() errors as well (with a higher delay cap)	2018-12-07 22:14:15 +00:00
Chris Baker	707bac0a7b	rpc accept loop: added backoff on logging for failed connections, in case there is a fast fail loop (NMD-1173)	2018-12-07 20:12:55 +00:00
Alex Dadgar	9971b3393f	yamux	2018-09-17 14:22:40 -07:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Xopherus	8d747578e8	Close multiplexer when context is cancelled Multiplexer continues to create rpc connections even when the context which is passed to the underlying rpc connections is cancelled by the server. This was causing #4413 - when a SIGHUP causes everything to reload, it uses context to cancel the underlying http/rpc connections so that they may come up with the new configuration. The multiplexer was not being cancelled properly so it would continue to create rpc connections and constantly fail, causing communication issues with other nomad agents. Fixes #4413	2018-08-13 19:32:49 -04:00
Alex Dadgar	7f28cfcdfe	small cleanup	2018-03-30 15:49:56 -07:00
Alex Dadgar	5dacb057b7	Only track nodes if the conn is from the node Fixes a bug in which a connection to a Nomad server was treated as a connection to a node because the server forwarded a node specific RPC.	2018-03-27 09:59:31 -07:00
Alex Dadgar	a1faab0e58	Server TLS	2018-02-15 15:03:12 -08:00
Alex Dadgar	5b9806590b	add logging	2018-02-15 13:59:03 -08:00
Alex Dadgar	64ad3119d0	Implement MultiplexV2 RPC handling Implements and tests the V2 multiplexer. This will not be used until several versions of Nomad have been released to mitigate upgrade concerns.	2018-02-15 13:59:02 -08:00
Alex Dadgar	cea77df6a7	Add Streaming RPC ack This PR introduces an ack allowing the receiving end of the streaming RPC to return any error that may have occurred during the establishment of the streaming RPC.	2018-02-15 13:59:02 -08:00
Alex Dadgar	6c1fa878ea	Forwarding	2018-02-15 13:59:02 -08:00
Alex Dadgar	2c0ad26374	New RPC Modes and basic setup for streaming RPC handlers	2018-02-15 13:59:01 -08:00
Alex Dadgar	b5037f20db	Remove circular dependency	2018-02-15 13:59:01 -08:00
Alex Dadgar	3f786b904b	use server manager	2018-02-15 13:59:01 -08:00
Alex Dadgar	46770d57e5	Forwarding	2018-02-15 13:59:01 -08:00
Alex Dadgar	6dd1c9f49d	Refactor	2018-02-15 13:59:00 -08:00
Alex Dadgar	8058ab039f	Store the whole verified certificate chain	2018-02-15 13:59:00 -08:00
Alex Dadgar	13bbf3fbbb	Track client connections	2018-02-15 13:59:00 -08:00
Alex Dadgar	4243438661	Improve TLS cluster testing	2018-02-15 13:59:00 -08:00
Alex Dadgar	ba5ecb8c1a	Dynamic RPC servers with context	2018-02-15 13:59:00 -08:00
Chelsea Holland Komlo	3f34b59ee6	remove unnecessary nil checks; default case add tests for TLSConfig object	2018-01-08 09:24:28 -05:00
Chelsea Holland Komlo	d9ec538d6a	don't ignore error in http reloading code review feedback	2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo	6a2432659a	code review fixups	2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo	c0ad9a4627	add ability to upgrade/downgrade nomad agents tls configurations via sighup	2018-01-08 09:21:06 -05:00
Chelsea Komlo	2dfda33703	Nomad agent reload TLS configuration on SIGHUP (#3479) * Allow server TLS configuration to be reloaded via SIGHUP * dynamic tls reloading for nomad agents * code cleanup and refactoring * ensure keyloader is initialized, add comments * allow downgrading from TLS * initialize keyloader if necessary * integration test for tls reload * fix up test to assert success on reloaded TLS configuration * failure in loading a new TLS config should remain at current Reload only the config if agent is already using TLS * reload agent configuration before specific server/client lock keyloader before loading/caching a new certificate * introduce a get-or-set method for keyloader * fixups from code review * fix up linting errors * fixups from code review * add lock for config updates; improve copy of tls config * GetCertificate only reloads certificates dynamically for the server * config updates/copies should be on agent * improve http integration test * simplify agent reloading storing a local copy of config * reuse the same keyloader when reloading * Test that server and client get reloaded but keep keyloader * Keyloader exposes GetClientCertificate as well for outgoing connections * Fix spelling * correct changelog style	2017-11-14 17:53:23 -08:00
Chelsea Holland Komlo	e348deecf5	fixups from code review	2017-11-01 15:21:05 -05:00
Chelsea Holland Komlo	afe9f9a714	add rpc_upgrade_mode as config option for tls upgrades	2017-11-01 15:19:52 -05:00
Alex Dadgar	abfc56a871	WatchCtx propagates context error	2017-09-06 17:37:40 -07:00
Alex Dadgar	26e66ed1c5	fix checking of context error	2017-09-01 09:53:09 -07:00

88 Commits