open-nomad

Author	SHA1	Message	Date
Drew Bailey	aec81a0b99	api agent endpoints helper func to return serverPart based off of serverID	2020-01-09 15:15:05 -05:00
Drew Bailey	3672414888	test pprof headers and profile methods tidy up, add comments clean up seconds param assignment	2020-01-09 15:15:04 -05:00
Drew Bailey	fc37448683	warn when enabled debug is on when registering m -> a receiver name return codederrors, fix query	2020-01-09 15:15:04 -05:00
Drew Bailey	50288461c9	Server request forwarding for Agent.Profile Return rpc errors for profile requests, set up remote forwarding to target leader or server id for profile requests. server forwarding, endpoint tests	2020-01-09 15:15:03 -05:00
Mahmood Ali	d740d347ce	Migrate old alloc structs on read This commit ensures that Alloc.AllocatedResources is properly populated when read from persistence stores (namely Raft and client state store). The alloc struct may have been written previously by an arbitrary old version that may only populate Alloc.TaskResources.	2020-01-09 08:46:50 -05:00
James Rasell	df2dc48790	Fix error parsing config when setting consul.timeout. (#6907) When parsing a config file which had the consul.timeout param set, Nomad was reporting an error causing startup to fail. This seems to be caused by the HCL decoder interpreting the timeout type as an int rather than a string. This is caused by the struct TimeoutHCL param having a hcl key of timeout alongside a Timeout struct param of type time.Duration (int). Ensuring the decoder ignores the Timeout struct param ensures the decoder runs correctly.	2020-01-07 13:40:55 -05:00
Nick Ethier	677e9cdc16	connect: configure envoy such that multiple sidecars can run in the same alloc	2020-01-06 11:26:27 -05:00
Michael Schurter	92cdc9de01	nomad/state: remove dead upgrade path code It is uncalled so there should be no runtime changes.	2019-12-20 11:10:22 -08:00
Drew Bailey	d9e41d2880	docs for shutdown delay update docs, address pr comments ensure pointer is not nil use pointer for diff tests, set vs unset	2019-12-16 11:38:35 -05:00
Drew Bailey	ae145c9a37	allow only positive shutdown delay more explicit test case, remove select statement	2019-12-16 11:38:30 -05:00
Drew Bailey	24929776a2	shutdown delay for task groups copy struct values ensure groupserviceHook implements RunnerPreKillhook run deregister first test that shutdown times are delayed move magic number into variable	2019-12-16 11:38:16 -05:00
Seth Hoenig	270233e23d	tests: remove trace statements from nodeDrainWatcher.watch Avoid logging in the `watch` function as much as possible, since it is not waited on during a server shutdown. When the logger logs after a test passes, it may or may not cause the testing framework to panic. More info in: https://github.com/golang/go/issues/29388#issuecomment-453648436	2019-12-16 07:08:11 -06:00
Michael Schurter	95fd2643d7	connect: canonicalize before adding sidecar Fixes #6853 Canonicalize jobs first before adding any sidecars. This fixes a bug where sidecar tasks were added without interpolated names and broke validation. Sidecar tasks must be canonicalized independently. Also adds a group network to the mock connect job because it wasn't a valid connect job before!	2019-12-12 20:55:56 -08:00
Seth Hoenig	d45dec1ca8	tests: parallelize state store tests It has been decided we're going to live in a many core world. Let's take advantage of that and parallelize these state store tests which all run in memory and are largely CPU bound. An unscientific benchmark demonstrating the improvement: [mp state (master)] $ go test PASS ok github.com/hashicorp/nomad/nomad/state 5.162s [mp state (f-parallelize-state-store-tests)] $ go test PASS ok github.com/hashicorp/nomad/nomad/state 1.527s	2019-12-11 09:36:37 -06:00
Seth Hoenig	f0c3dca49c	tests: swap lib/freeport for tweaked helper/freeport Copy the updated version of freeport (sdk/freeport), and tweak it for use in Nomad tests. This means staying below port 10000 to avoid conflicts with the lib/freeport that is still transitively used by the old version of consul that we vendor. Also provide implementations to find ephemeral ports of macOS and Windows environments. Ports acquired through freeport are supposed to be returned to freeport, which this change now also introduces. Many tests are modified to include calls to a cleanup function for Server objects. This should help quite a bit with some flakey tests, but not all of them. Our port problems will not go away completely until we upgrade our vendor version of consul. With Go modules, we'll probably do a 'replace' to swap out other copies of freeport with the one now in 'nomad/helper/freeport'.	2019-12-09 08:37:32 -06:00
Danielle Lancashire	d2075ebae9	spellcheck: Fix spelling of retrieve	2019-12-05 18:59:47 -06:00
Mahmood Ali	b3e557cae3	address feedback review apply `s/requestAuthToken/requestACLToken/g`	2019-11-26 08:39:04 -05:00
Mahmood Ali	02e20c720b	acl_endpoint: permission denied for unauthenticated requests If ACL Request is unauthenticated, we should honor the anonymous token. This PR makes a few changes: * `GetPolicy` endpoints may return policy if anonymous policy allows it, or return permission denied otherwise. * `ListPolicies` returns an empty policy list, or one with anonymous policy if one exists. Without this PR, we return an incomprehensible error. Before: ``` $ curl http://localhost:4646/v1/acl/policy/doesntexist; echo acl token lookup failed: index error: UUID must be 36 characters $ curl http://localhost:4646/v1/acl/policies; echo acl token lookup failed: index error: UUID must be 36 characters ``` After: ``` $ curl http://localhost:4646/v1/acl/policy/doesntexist; echo Permission denied $ curl http://localhost:4646/v1/acl/policies; echo [] ```	2019-11-22 08:43:09 -05:00
Michael Schurter	4b6762511d	Merge pull request #6021 from hashicorp/f-anonymous-policy-access api: Update policy endpoint to permit anonymous access	2019-11-20 15:33:45 -08:00
Buck Doyle	5fcc00d0f9	Add gofmt changes	2019-11-20 12:47:01 -06:00
Buck Doyle	dc9c0d5ead	Add explanatory comment	2019-11-20 11:45:44 -06:00
Buck Doyle	bea9837510	Remove extraneous else block	2019-11-20 11:37:45 -06:00
Buck Doyle	d6a3e571bd	Remove extraneous whitespace	2019-11-20 11:37:01 -06:00
Buck Doyle	db77a24ed3	Merge branch 'master' into f-policy-json	2019-11-20 11:20:07 -06:00
Mahmood Ali	7fb4c35831	comments and casing	2019-11-19 16:03:55 -05:00
Mahmood Ali	97d0fd009d	404 if token isn't found	2019-11-19 15:52:53 -05:00
Mahmood Ali	6f8bb5e90b	api: acl bootstrap errors aren't 500 Noticed that ACL endpoints return 500 status code for user errors. This is confusing and can lead to false monitoring alerts. Here, I introduce a concept of RPCCoded errors to be returned by RPC that signal a code in addition to error message. Codes for now match HTTP codes to ease reasoning. ``` $ nomad acl bootstrap Error bootstrapping: Unexpected response code: 500 (ACL bootstrap already done (reset index: 9)) $ nomad acl bootstrap Error bootstrapping: Unexpected response code: 400 (ACL bootstrap already done (reset index: 9)) ```	2019-11-19 15:51:57 -05:00
Michael Schurter	796758b8a5	core: add semver constraint The existing version constraint uses logic optimized for package managers, not schedulers, when checking prereleases: - 1.3.0-beta1 will not satisfy ">= 0.6.1" - 1.7.0-rc1 will not satisfy ">= 1.6.0-beta1" This is due to package managers wishing to favor final releases over prereleases. In a scheduler versions more often represent the earliest release all required features/APIs are available in a system. Whether the constraint or the version being evaluated are prereleases has no impact on ordering. This commit adds a new constraint - `semver` - which will use Semver v2.0 ordering when evaluating constraints. Given the above examples: - 1.3.0-beta1 satisfies ">= 0.6.1" using `semver` - 1.7.0-rc1 satisfies ">= 1.6.0-beta1" using `semver` Since existing jobspecs may rely on the old behavior, a new constraint was added and the implicit Consul Connect and Vault constraints were updated to use it.	2019-11-19 08:40:19 -08:00
Nick Ethier	bd454a4c6f	client: improve group service stanza interpolation and check_re… (#6586) * client: improve group service stanza interpolation and check_restart support Interpolation can now be done on group service stanzas. Note that some task runtime specific information that was previously available when the service was registered poststart of a task is no longer available. The check_restart stanza for checks defined on group services will now properly restart the allocation upon check failures if configured.	2019-11-18 13:04:01 -05:00
Luiz Aoqui	e499c5bddc	Merge pull request #6698 from hashicorp/f-add-drain-start-time api: add `StartedAt` in `Node.DrainStrategy`	2019-11-15 15:38:38 -05:00
Luiz Aoqui	e862b61daa	api: use the same initial time for all drain properties	2019-11-14 16:06:09 -05:00
Drew Bailey	9b63828658	serverID to target remote leader or server handle the case where we request a server-id which is this current server update docs, error on node and server id params more accurate names for tests use shared no leader err, formatting rm bad comment remove redundant variable	2019-11-14 10:07:35 -05:00
Drew Bailey	b644e1f47d	add server-id to monitor specific server	2019-11-14 09:53:41 -05:00
Drew Bailey	2185c1a89e	Allows monitor to target leader server Allows user to pass in node-id=leader to forward monitor request to a remote leader.	2019-11-14 09:53:40 -05:00
Luiz Aoqui	5bd7cdd5c3	api: add `StartedAt` in `Node.DrainStrategy`	2019-11-13 17:54:40 -05:00
Lars Lehtonen	22a3c21dd0	nomad: fix dropped test error	2019-11-13 12:49:41 -08:00
Michael Schurter	08afb7d605	vault: allow overriding implicit vault constraint There's a bug in version parsing that breaks this constraint when using a prerelease enterprise version of Vault (eg 1.3.0-beta1+ent). While this does not fix the underlying bug it does provide a workaround for future issues related to the implicit constraint. Like the implicit Connect constraint: all implicit constraints should be overridable to allow users to work around bugs or other factors should the need arise.	2019-11-12 12:26:36 -08:00
Mahmood Ali	c4c37cb42e	vault: check token_explicit_max_ttl as well Vault 1.2.0 deprecated `explicit_max_ttl` in favor of `token_explicit_max_ttl`.	2019-11-12 08:47:23 -05:00
Lars Lehtonen	adbab29228	nomad: TestEvalBroker_Dequeue_Empty_Timeout() proper goroutine error handling (#6657)	2019-11-08 14:35:06 -05:00
Drew Bailey	7420446458	Merge pull request #6639 from hashicorp/return-after-forward return after request has been forwarded	2019-11-08 09:48:35 -05:00
Lars Lehtonen	39b68e0b88	TestEvalBroker_Dequeue_Blocked() proper goroutine error handling (#6651) TestEvalBroker_Dequeue_Blocked() improve test readability	2019-11-08 08:52:23 -05:00
Nick Ethier	e947aaed4f	nomad: fix bug that didn't allow for multiple connect services in same tg	2019-11-08 04:33:39 -05:00
Lars Lehtonen	6deae70e35	TestEvalBroker_PauseResumeNackTimeout() proper goroutine error handling (#6649) TestEvalBroker_PauseResumeNackTimeout() improve test readability	2019-11-07 16:04:59 -05:00
Lars Lehtonen	2638cbb31d	nomad: TestEvalBroker_EnqueueAll_Dequeue_Fair() proper goroutine error handling (#6636) nomad: TestEvalBroker_EnqueueAll_Dequeue_Fair() improve test readability	2019-11-07 10:39:29 -05:00
Drew Bailey	a5e2e1805f	return after request has been forwarded	2019-11-07 08:33:53 -05:00
Lars Lehtonen	e64f98837c	nomad: fix dropped error in TestJobEndpoint_Deregister_ACL (#6602)	2019-11-06 16:40:45 -05:00
Drew Bailey	f4a7e3dc75	coordinate closing of doneCh, use interface to simplify callers comments	2019-11-05 11:44:26 -05:00
Drew Bailey	fe542680dc	log-json -> json fix typo command/agent/monitor/monitor.go Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com> Update command/agent/monitor/monitor.go Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com> address feedback, lock to prevent send on closed channel fix lock/unlock for dropped messages	2019-11-05 09:51:59 -05:00
Drew Bailey	298b8358a9	move forwarded monitor request into helper	2019-11-05 09:51:56 -05:00
Drew Bailey	8726b685de	address feedback	2019-11-05 09:51:56 -05:00
Drew Bailey	0e759c401c	moving endpoints over to frames	2019-11-05 09:51:54 -05:00
Drew Bailey	17d876d5ef	rename function, initialize log level better underscores instead of dashes for query params	2019-11-05 09:51:53 -05:00
Drew Bailey	8178beecf0	address feedback, use agent_endpoint instead of monitor	2019-11-05 09:51:53 -05:00
Drew Bailey	db65b1f4a5	agent:read acl policy for monitor	2019-11-05 09:51:52 -05:00
Drew Bailey	2533617888	rpc acl tests for both monitor endpoints	2019-11-05 09:51:51 -05:00
Drew Bailey	3c33747e1f	client monitor endpoint tests	2019-11-05 09:51:50 -05:00
Drew Bailey	4bc68855d0	use intercepting loggers for rpchandlers	2019-11-05 09:51:50 -05:00
Drew Bailey	3b9c33a5f0	new hclog with standardlogger intercept	2019-11-05 09:51:49 -05:00
Drew Bailey	a45ae1cd58	enable json formatting, use queryoptions	2019-11-05 09:51:49 -05:00
Drew Bailey	786989dbe3	New monitor pkg for shared monitor functionality Adds new package that can be used by client and server RPC endpoints to facilitate monitoring based off of a logger clean up old code small comment about write rm old comment about minsize rename to Monitor Removes connection logic from monitor command Keep connection logic in endpoints, use a channel to send results from monitoring use new multisink logger and interfaces small test for dropped messages update go-hclogger and update sink/intercept logger interfaces	2019-11-05 09:51:49 -05:00
Lars Lehtonen	0a4542fadc	nomad: fix test goroutine (#6593)	2019-10-31 08:23:32 -04:00
Seth Hoenig	98592113a3	Merge pull request #6582 from hashicorp/b-vault-createToken-log-msg nomad: fix vault.CreateToken log message printing wrong error	2019-10-29 17:35:05 -05:00
Mahmood Ali	7f2e4dc5d8	Merge pull request #6574 from hashicorp/b-gh-6570-vault-role-validation vault: honor new `token_period` in vault token role	2019-10-29 10:18:59 -04:00
Seth Hoenig	838c6e3329	nomad: fix vault.CreateToken log message printing wrong error Fixes typo in word "failed". Fixes bug where incorrect error is printed. The old code would only ever print a nil error, instead of the validationErr which is being created.	2019-10-28 23:05:32 -05:00
Mahmood Ali	c5d8d66787	Fix admissionValidators `admissionValidators` doesn't aggregate errors correctly, as it aggregates errors in `errs` reference yet it always returns the nil `err`. Here, we avoid shadowing `err`, and move variable declarations to where they are used.	2019-10-28 10:52:53 -04:00
Mahmood Ali	abb930249a	consul connect: do basic validation before mutating job `groupConnectHook` assumes that Networks is a non-empty slice, but TG hasn't been validated yet and validation may depend on mutation results. As such, we do basic check here before dereferencing network slice elements.	2019-10-28 10:49:02 -04:00
Mahmood Ali	bb45a7a776	add tests for consul connect validation	2019-10-28 10:41:51 -04:00
Mahmood Ali	4c64658397	vault: Support new role field `token_period` Vault 1.2.0 deprecated `period` field in favor of `token_period` in auth role: > * Token store roles use new, common token fields for the values > that overlap with other auth backends. `period`, `explicit_max_ttl`, and > `bound_cidrs` will continue to work, with priority being given to the > `token_` prefixed versions of those parameters. They will also be returned > when doing a read on the role if they were used to provide values initially; > however, in Vault 1.4 if `period` or `explicit_max_ttl` is zero they will no > longer be returned. (`explicit_max_ttl` was already not returned if empty.) https://github.com/hashicorp/vault/blob/master/CHANGELOG.md#120-july-30th-2019	2019-10-28 09:33:26 -04:00
Seth Hoenig	8b03477f46	Merge pull request #6448 from hashicorp/f-set-connect-sidecar-tags connect: enable setting tags on consul connect sidecar service in job…	2019-10-17 15:14:09 -05:00
Seth Hoenig	039fbd3f3b	connect: enable setting tags on consul connect sidecar service in jobspec (#6415)	2019-10-17 19:25:20 +00:00
Mahmood Ali	4e4a9b252c	Merge pull request #6290 from hashicorp/r-generated-code-refactor dev: avoid codecgen code in downstream projects	2019-10-15 08:22:31 -04:00
Danielle	fee482ae6c	Merge pull request #6331 from hashicorp/dani/f-volume-mount-propagation volumes: Add support for mount propagation	2019-10-14 14:29:40 +02:00
Danielle Lancashire	4fbcc668d0	volumes: Add support for mount propagation This commit introduces support for configuring mount propagation when mounting volumes with the `volume_mount` stanza on Linux targets. Similar to Kubernetes, we expose 3 options for configuring mount propagation: - private, which is equivalent to `rprivate` on Linux, which does not allow the container to see any new nested mounts after the chroot was created. - host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts that have been created _outside of the container_ to be visible inside the container after the chroot is created. - bidirectional, which is equivalent to `rshared` on Linux, which allows both the container to see new mounts created on the host, but importantly _allows the container to create mounts that are visible in other containers and on the host_ private and host-to-task are safe, but bidirectional mounts can be dangerous, as if the code inside a container creates a mount, and does not clean it up before tearing down the container, it can cause bad things to happen inside the kernel. To add a layer of safety here, we require that the user has ReadWrite permissions on the volume before allowing bidirectional mounts, as a defense in depth / validation case, although creating mounts should also require a privileged execution environment inside the container.	2019-10-14 14:09:58 +02:00
Mahmood Ali	4b2ba62e35	acl: check ACL against object namespace Fix a bug where a malicious user can access or manipulate an alloc in a namespace they don't have access to. The allocation endpoints perform ACL checks against the request namespace, not the allocation namespace, and perform the allocation lookup independently from namespaces. Here, we check that the requester can access the alloc namespace regardless of the declared request namespace. Ideally, we'd enforce that the declared request namespace matches the actual allocation namespace. Unfortunately, we haven't documented alloc endpoints as namespaced functions; we suspect starting to enforce this will be very disruptive and inappropriate for a nomad point release. As such, we maintain current behavior that doesn't require passing the proper namespace in request. A future major release may start enforcing checking declared namespace.	2019-10-08 12:59:22 -04:00
Mahmood Ali	674a457865	use RequestNamespace(), the canonical way to get namespace	2019-09-27 07:40:58 -04:00
Mahmood Ali	e29ee4c400	nomad: defensive check for namespaces in job registration call In a job registration request, ensure that the request namespace "header" and job namespace field match. This should be the case already in prod, as HTTP handlers ensure that the values match [1]. This mitigates exploits of bugs where we may check one value but act on another, resulting in a bypass of the ACL system. [1] https://github.com/hashicorp/nomad/blob/v0.9.5/command/agent/job_endpoint.go#L415-L418	2019-09-26 17:02:47 -04:00
Lang Martin	fb41dd86ba	default raft protocol v2	2019-09-24 14:37:55 -04:00
Lang Martin	31d7f116dd	nomad/server comments	2019-09-24 14:36:18 -04:00
Tim Gross	cd9c23617f	client/connect: ConsulProxy LocalServicePort/Address (#6358) Without a `LocalServicePort`, Connect services will try to use the mapped port even when delivering traffic locally. A user can override this behavior by pinning the port value in the `service` stanza but this prevents us from using the Consul service name to reach the service. This commit configures the Consul proxy with its `LocalServicePort` and `LocalServiceAddress` fields.	2019-09-23 14:30:48 -04:00
Danielle Lancashire	78b61de45f	config: Hoist volume.config.source into volume Currently, using a Volume in a job uses the following configuration: ``` volume "alias-name" { type = "volume-type" read_only = true config { source = "host_volume_name" } } ``` This commit migrates to the following: ``` volume "alias-name" { type = "volume-type" source = "host_volume_name" read_only = true } ``` The original design was chosen because we were uncertain about the future of storage plugins, and to allow maximum flexibility. However, this causes a few issues, namely: - We frequently need to parse this configuration during submission, scheduling, and mounting - It complicates the configuration from an end user's perspective - It complicates the ability to do validation As we understand the problem space of CSI a little more, it has become clear that we won't need the `source` to be in config, as it will be used in the majority of cases: - Host Volumes: Always need a source - Preallocated CSI Volumes: Always needs a source from a volume or claim name - Dynamic Persistent CSI Volumes: Always needs a source to attach the volumes to for managing upgrades and to avoid dangling. - Dynamic Ephemeral CSI Volumes: Less thought out, but `source` will probably point to the plugin name, and a `config` block will allow you to pass meta to the plugin. Or will point to a pre-configured ephemeral config. *If implemented The new design simplifies this by merging the source into the volume stanza to solve the above issues with usability, performance, and error handling.	2019-09-13 04:37:59 +02:00
Mahmood Ali	4b8280e51d	remove generated code	2019-09-06 19:24:15 +00:00
Nomad Release bot	dc7d728a82	Generate files for 0.10.0-beta1 release	2019-09-06 18:47:09 +00:00
Mahmood Ali	01f42053e4	dev: avoid codecgen code in downstream projects This is an attempt to ease dependency management for external driver plugins, by avoiding requiring them to compile ugorji/go generated files. Plugin developers reported some pain with the brittleness of ugorji/go dependency in particular, especially when using go mod, the default go mod manager in golang 1.13. Context -------- Nomad uses msgpack to persist and serialize internal structs, using ugorji/go library. As an optimization, we use ugorji/go code generation to speed up the process and avoid the reflection-based slow path. We commit these generated files in repository when we cut and tag the release to ease reproducibility and debugging old releases. Thus, downstream projects that depend on release tag, indirectly depend on ugorji/go generated code. Sadly, the generated code is brittle and specific to the version of ugorji/go being used. When go mod picks another version of ugorji/go than nomad (go mod by default uses release according to semver), downstream projects face compilation errors. Interestingly, downstream projects don't commonly serialize nomad internal structs. Drivers and device plugins use grpc instead of msgpack for the most part. In the few cases where they use msgpack (e.g. decoding task config), they do without codegen path as they run on driver specific structs not the nomad internal structs. Also, the ugorji/go serialization through reflection is generally backward compatible (mod some ugorji/go regression bugs that get introduced every now and then :( ). Proposal --------- The proposal here is to keep committing ugorji/go codec generated files for releases but to use a go tag for them. All nomad development through the makefile, including releasing, CI and dev flow, has the tag enabled. Downstream plugin projects, by default, will skip these files and life proceeds as normal for them. The downside is that nomad developers who use generated code but avoid using make must start passing an additional go tag argument. Though this is not a blessed configuration.	2019-09-06 09:22:00 -04:00
Mahmood Ali	6d73ca0cfb	Merge pull request #6250 from hashicorp/f-raft-protocol-v3 Update default raft protocol to version 3	2019-09-04 09:34:41 -04:00
Mahmood Ali	c94a5ef1f8	tests: give up on TestAutopilot_CleanupStaleRaftServer for now	2019-09-04 09:10:53 -04:00
Nick Ethier	6a90a9f505	structs: canonicalize tg Services and Networks (#6257)	2019-09-04 08:55:47 -04:00
Mahmood Ali	6cefd8f97e	tests: attempt to fix TestAutopilot_CleanupStaleRaftServer Also add a utility function for waiting for stable leadership	2019-09-04 08:49:33 -04:00
Mahmood Ali	035a7a94d9	tests: update time sensitive tests Fix tests whose messages seem timing dependent.	2019-09-04 08:45:25 -04:00
Mahmood Ali	0beb757b6f	tests: disable server auto join by default Tests typically call join cluster directly rather than rely on consul discovery. Worse, consul discovery seems to cause additional leadership transitions when a server is shutdown in tests than tests expect.	2019-09-04 07:54:54 -04:00
Mahmood Ali	3e2ab6e2a3	address review feedback	2019-09-03 21:44:39 -04:00
Mahmood Ali	0a6d73020c	use current nomad version in testing	2019-09-03 21:42:41 -04:00
Mahmood Ali	9bd56587cd	Fix raft tests Wait until leadership stabilizes and all non-voters get promoted before killing leader	2019-09-03 14:53:29 -04:00
Michael Schurter	5957030d18	connect: add unix socket to proxy grpc for envoy (#6232) * connect: add unix socket to proxy grpc for envoy Fixes #6124 Implement a L4 proxy from a unix socket inside a network namespace to Consul's gRPC endpoint on the host. This allows Envoy to connect to Consul's xDS configuration API. * connect: pointer receiver on structs with mutexes * connect: warn on all proxy errors	2019-09-03 08:43:38 -07:00
Buck Doyle	21ec6a237c	Merge branch 'master' into f-policy-json # Conflicts: # CHANGELOG.md	2019-09-03 09:56:25 -05:00
Jasmine Dahilig	4edebe389a	add default update stanza and max_parallel=0 disables deployments (#6191)	2019-09-02 10:30:09 -07:00
Buck Doyle	ab96785fc9	Change test to use valid HCL for rules	2019-08-29 16:09:02 -05:00
Buck Doyle	4a159f5dcf	Change parsing error to set rules to nil	2019-08-29 15:50:34 -05:00
Buck Doyle	5495a7e689	Add standard error-handling for parse failure	2019-08-29 11:12:02 -05:00
Buck Doyle	8b06712d21	Merge branch 'master' into f-policy-json	2019-08-29 11:11:21 -05:00
Mahmood Ali	3da10b5cb3	scheduler: tests for multiple drivers in TG	2019-08-29 09:03:31 -04:00
Mahmood Ali	a67f5f0565	update tests to run with v2	2019-08-28 16:42:08 -04:00
Mahmood Ali	6eabf53b91	Default raft protocol to version 3	2019-08-28 15:56:59 -04:00
Michael Schurter	f5792635ca	Merge pull request #6218 from hashicorp/f-consul-defaults consul: use Consul's defaults and env vars	2019-08-28 11:54:44 -07:00
Nick Ethier	9e96971a75	cli: display group ports and address in alloc status command output (#6189) * cli: display group ports and address in alloc status command output * add assertions for port.To = -1 case and convert assertions to testify	2019-08-27 23:59:36 -04:00
Nick Ethier	cbb27e74bc	Add environment variables for connect upstreams (#6171) * taskenv: add connect upstream env vars + test * set taskenv upstreams instead of appending * Update client/taskenv/env.go Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>	2019-08-27 23:41:38 -04:00
Michael Schurter	3b0e1d8ef7	consul: use Consul's defaults and env vars Use Consul's API package defaults and env vars as Nomad's defaults.	2019-08-27 14:56:52 -07:00
Mahmood Ali	3791a70aa9	Merge pull request #5676 from hashicorp/f-b-upgrade-ugorji-dep-20190508 Update ugorji/go to latest	2019-08-23 18:29:49 -04:00
Jerome Gravel-Niquet	cbdc1978bf	Consul service meta (#6193) * adds meta object to service in job spec, sends it to consul * adds tests for service meta * fix tests * adds docs * better hashing for service meta, use helper for copying meta when registering service * tried to be DRY, but looks like it would be more work to use the helper function	2019-08-23 12:49:02 -04:00
Michael Schurter	95b8048553	Merge pull request #6121 from hashicorp/f-connect-bootstrap connect: task hook for bootstrapping envoy sidecar	2019-08-22 10:58:31 -07:00
Michael Schurter	59e0b67c7f	connect: task hook for bootstrapping envoy sidecar Fixes #6041 Unlike all other Consul operations, bootstrapping requires Consul be available. This PR tries Consul 3 times with a backoff to account for the group services being asynchronously registered with Consul.	2019-08-22 08:15:32 -07:00
Danielle Lancashire	2e5f28029f	remove hidden field from host volumes We're not shipping support for "hidden" volumes in 0.10 any more, I'll convert this to an issue+mini RFC for future enhancement.	2019-08-22 08:48:05 +02:00
Danielle	0428284aee	Merge pull request #6180 from hashicorp/dani/readonly-acl Fine grained ACLs for Host Volumes	2019-08-21 22:22:14 +02:00
Danielle Lancashire	91bb67f713	acls: Break mount acl into mount-rw and mount-ro	2019-08-21 21:17:30 +02:00
Nick Ethier	c8556daf37	structs: validate no tcp checks for connect services (#6169)	2019-08-21 12:42:53 -04:00
Michael Schurter	050cc32fde	Merge pull request #6157 from hashicorp/f-connect-register Register connect enabled group services with Consul	2019-08-20 14:45:38 -07:00
Tim Gross	7dc6ee2d27	structs: add taskgroup networks and services to plan diffs Adds a check for differences in `job.Diff` so that task group networks and services, including new Consul connect stanzas, show up in the job plan outputs.	2019-08-20 16:18:30 -04:00
Michael Schurter	b008fd1724	connect: register group services with Consul Fixes #6042 Add new task group service hook for registering group services like Connect-enabled services. Does not yet support checks.	2019-08-20 12:25:10 -07:00
Tim Gross	a0e923f46c	add optional task field to group service checks	2019-08-20 09:35:31 -04:00
Mahmood Ali	d699a70875	Merge pull request #5911 from hashicorp/b-rpc-consistent-reads Block rpc handling until state store is caught up	2019-08-20 09:29:37 -04:00
Nick Ethier	24f5a4c276	sidecar_task override in connect admission controller (#6140) * structs: use separate SidecarTask struct for sidecar_task stanza and add merge * nomad: merge SidecarTask into proxy task during connect Mutate hook	2019-08-20 01:22:46 -04:00
Nick Ethier	965f00b2fc	Builtin Admission Controller Framework (#6116) * nomad: add admission controller framework * nomad: add admission controller framework and Consul Connect hooks * run admission controllers before checking permissions * client: add default node meta for connect configurables * nomad: remove validateJob func since it has been moved to admission controller * nomad: use new TaskKind type * client: use consts for connect sidecar image and log level * Apply suggestions from code review Co-Authored-By: Michael Schurter <mschurter@hashicorp.com> * nomad: add job register test with connect sidecar * Update nomad/job_endpoint_hooks.go Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>	2019-08-15 11:22:37 -04:00
Preetha Appan	72e45dd01e	More code review feedback	2019-08-12 17:41:40 -05:00
Preetha	76c8a11b31	Apply suggestions from code review Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>	2019-08-12 17:03:30 -05:00
Preetha Appan	219dc05541	Fix type for kind	2019-08-12 14:39:50 -05:00
Preetha Appan	35506c516d	Improve validation logic and add table driven tests	2019-08-12 14:39:50 -05:00
Preetha Appan	d324a9864e	Add validation for kind field if it is a consul connect proxy	2019-08-12 14:39:50 -05:00
Danielle Lancashire	b38c1d810e	job_endpoint: Validate volume permissions	2019-08-12 15:39:09 +02:00
Danielle Lancashire	33db40d4e6	structs: Document VolumeMount	2019-08-12 15:39:08 +02:00
Danielle Lancashire	861caa9564	HostVolumeConfig: Source -> Path	2019-08-12 15:39:08 +02:00
Danielle Lancashire	e132a30899	structs: Unify Volume and VolumeRequest	2019-08-12 15:39:08 +02:00
Danielle Lancashire	6d7b417e54	structs: Add declarations of basic structs for volume support	2019-08-12 15:39:08 +02:00
Nick Ethier	1871c1edbc	Add sidecar_task stanza parsing (#6104) * jobspec: breakup parse.go into smaller files * add sidecar_task parsing to jobspec and api * jobspec: combine service parsing logic for task and group service stanzas * api: use slice of ConsulUpstream values instead of pointers	2019-08-09 15:18:53 -04:00
Preetha Appan	a393ea79e8	Add field "kind" to task for use in connect tasks	2019-08-07 18:43:36 -05:00
Jasmine Dahilig	8d980edd2e	add create and modify timestamps to evaluations (#5881)	2019-08-07 09:50:35 -07:00
Michael Schurter	3e4796799a	Merge pull request #6003 from pete-woods/add-job-status-metrics nomad: add job status metrics	2019-08-07 08:02:16 -07:00
Michael Schurter	d2862b33e6	Merge pull request #6045 from hashicorp/f-connect-groupservice consul: add Connect structs	2019-08-06 15:43:38 -07:00
Michael Schurter	ef9d100d2f	Merge pull request #6082 from hashicorp/b-vault-deadlock vault: fix deadlock in SetConfig	2019-08-06 15:30:17 -07:00
Michael Schurter	ecb1a65bb9	Merge pull request #6077 from hashicorp/b-vault-revlock vault: fix race in accessor revocations	2019-08-06 14:28:47 -07:00
Michael Schurter	b8e127b3c0	vault: ensure SetConfig calls are serialized This is a defensive measure as SetConfig should only be called serially.	2019-08-06 11:17:10 -07:00
Michael Schurter	5022341b27	vault: fix deadlock in SetConfig This seems to be the minimum viable patch for fixing a deadlock between establishConnection and SetConfig. SetConfig calls tomb.Kill+tomb.Wait while holding v.lock. establishConnection needs to acquire v.lock to exit but SetConfig is holding v.lock until tomb.Wait exits. tomb.Wait can't exit until establishConnect does! ``` SetConfig -> tomb.Wait ^ \| \| v v.lock <- establishConnection ```	2019-08-06 10:40:14 -07:00
Michael Schurter	17fd82d6ad	consul: add Connect structs Refactor all Consul structs into {api,structs}/services.go because api/tasks.go didn't make sense anymore and structs/structs.go is gigantic.	2019-08-06 08:15:07 -07:00
Michael Schurter	d0a83eb818	vault: fix race in accessor revocations	2019-08-05 15:08:04 -07:00
Preetha Appan	8b298621ef	Add more comments to clarify job.Stable field	2019-08-05 15:00:53 -05:00
Preetha Appan	e6a496bac0	Code review feedback	2019-07-31 01:04:08 -04:00
Preetha Appan	99eca85206	Scheduler changes to support network at task group level Also includes unit tests for binpacker and preemption. The tests verify that network resources specified at the task group level are properly accounted for	2019-07-31 01:04:08 -04:00
Michael Schurter	4501fe3c4d	structs: deepcopy shared alloc resources Also DRY up Networks code by using Networks.Copy	2019-07-31 01:04:06 -04:00
Michael Schurter	fb487358fb	connect: add group.service stanza support	2019-07-31 01:04:05 -04:00
Nick Ethier	a03f6a95a2	structs: refactor network validation to separate fn	2019-07-31 01:03:16 -04:00
Danielle	1e7571eb85	fix structs comment Co-Authored-By: nickethier <ncethier@gmail.com>	2019-07-31 01:03:16 -04:00
Nick Ethier	aa7c08679e	structs: Add validations for task group networks	2019-07-31 01:03:16 -04:00
Nick Ethier	6c160df689	fix tests from introducing new struct fields	2019-07-31 01:03:16 -04:00
Nick Ethier	8650429e38	Add network stanza to group Adds a network stanza and additional options to the task group level in prep for allowing shared networking between tasks of an alloc.	2019-07-31 01:03:12 -04:00
Preetha Appan	d048029b5a	remove generated code and change version to 0.10.0	2019-07-30 15:56:05 -05:00
Nomad Release bot	e39fb11531	Generate files for 0.9.4 release	2019-07-30 19:05:18 +00:00
Buck Doyle	0a1a0419cb	Combine conditionals	2019-07-29 10:38:07 -05:00
Buck Doyle	0a082c1e5e	Update assertion to use better failure-reporting	2019-07-29 10:35:07 -05:00
Buck Doyle	c3deb7703d	Update policy endpoint to permit anonymous access	2019-07-26 13:07:42 -05:00
Pete Woods	9096aa3d23	Add job status metrics This avoids having to write services to repeatedly hit the jobs API	2019-07-26 10:12:49 +01:00
Buck Doyle	77f5a38c8f	Add parsed rules to policy response	2019-07-25 10:43:57 -05:00
Preetha Appan	6b4c40f5a8	remove generated code	2019-07-23 12:07:49 -05:00
Nomad Release bot	04187c8b86	Generate files for 0.9.4-rc1 release	2019-07-22 21:42:36 +00:00
Jasmine Dahilig	2157f6ddf1	add formatting for hcl parsing error messages (#5972)	2019-07-19 10:04:39 -07:00
Lang Martin	f282da4ced	blocked_evals_test disable calls Flush	2019-07-18 10:32:13 -04:00
Lang Martin	8f7a20839e	worker comment system -> core	2019-07-18 10:32:13 -04:00
Lang Martin	83d20169f6	blocked_evals reset system evals on Flush	2019-07-18 10:32:13 -04:00
Lang Martin	6e3425babf	blocked_evals_test Test_UnblockNode	2019-07-18 10:32:12 -04:00
Lang Martin	ea275d5ce7	fsm attach UnblockNode on node updates	2019-07-18 10:32:12 -04:00
Lang Martin	3bf618f217	blocked_evals system evals indexed by job and node	2019-07-18 10:32:12 -04:00
Michael Schurter	81b4b6f19b	Merge pull request #5791 from hashicorp/b-plan-snapshotindex nomad: include snapshot index when submitting plans	2019-07-17 09:25:00 -07:00
Mahmood Ali	ad39bcef60	rpc: use tls wrapped connection for streaming rpc This ensures that server-to-server streaming RPC calls use the tls wrapped connections. Prior to this, `streamingRpcImpl` function uses tls for setting header and invoking the rpc method, but returns unwrapped tls connection. Thus, streaming writes fail with tls errors. This tls streaming bug existed since 0.8.0[1], but PR #5654[2] exacerbated it in 0.9.2. Prior to PR #5654, nomad client used to shuffle servers at every heartbeat -- `servers.Manager.setServers`[3] always shuffled servers and was called by heartbeat code[4]. Shuffling servers meant that a nomad client would heartbeat and establish a connection against all nomad servers eventually. When handling streaming RPC calls, nomad servers used these local connection to communicate directly to the client. The server-to-server forwarding logic was left mostly unexercised. PR #5654 means that a nomad client may connect to a single server only and caused the server-to-server forward streaming RPC code to get exercised more and unearthed the problem. [1] https://github.com/hashicorp/nomad/blob/v0.8.0/nomad/rpc.go#L501-L515 [2] https://github.com/hashicorp/nomad/pull/5654 [3] https://github.com/hashicorp/nomad/blob/v0.9.1/client/servers/manager.go#L198-L216 [4] https://github.com/hashicorp/nomad/blob/v0.9.1/client/client.go#L1603	2019-07-12 14:41:44 +08:00
Mahmood Ali	9c9bec62fd	rpc: add positive tests for server streaming RPC	2019-07-12 14:32:52 +08:00
Lang Martin	0b97175a16	node_endpoint preserve both messages as rpcs and in raft	2019-07-10 13:56:20 -04:00
Lang Martin	ee4848167c	core_sched add compat comment for later removal	2019-07-10 13:56:20 -04:00
Lang Martin	c13c97c6c2	structs drop deprecation warning, revert unnecessary comment change	2019-07-10 13:56:20 -04:00
Lang Martin	a95225d754	NodeDeregisterBatch -> NodeBatchDeregister match JobBatch pattern	2019-07-10 13:56:20 -04:00
Lang Martin	a8e72a5b68	state_store error if called without node_ids	2019-07-10 13:56:20 -04:00
Lang Martin	44cbca9b98	fsm new NodeDeregisterBatchRequestType sorted at the end of the case	2019-07-10 13:56:20 -04:00
Lang Martin	91e139dcb5	structs NodeDeregisterBatchRequestType must go at the end	2019-07-10 13:56:20 -04:00
Lang Martin	1cc6b4062c	fsm label batch_deregister_node metrics explicitly Co-Authored-By: Mahmood Ali <mahmood@notnoop.com>	2019-07-10 13:56:20 -04:00
Lang Martin	ad3549f906	core_sched use the new rpc names	2019-07-10 13:56:20 -04:00
Lang Martin	ce0f03651a	fsm support new NodeDeregisterBatchRequest	2019-07-10 13:56:20 -04:00
Lang Martin	fa5649998e	node endpoint support new NodeDeregisterBatchRequest	2019-07-10 13:56:19 -04:00
Lang Martin	683ab8d1d2	structs add NodeDeregisterBatchRequest	2019-07-10 13:56:19 -04:00
Lang Martin	82349aba5d	node_endpoint argument setup	2019-07-10 13:56:19 -04:00
Lang Martin	6dbf5d7d13	fsm return an error on both NodeDeregisterRequest fields set	2019-07-10 13:56:19 -04:00
Lang Martin	fbc78ba96c	fsm variable names for consistency	2019-07-10 13:56:19 -04:00
Lang Martin	09fd05bd8f	node_endpoint raft store then shutdown, test deprecation	2019-07-10 13:56:19 -04:00
Lang Martin	4610c70777	util simplify partitionAll	2019-07-10 13:56:19 -04:00
Lang Martin	d22d9fb5b2	core_sched check ServersMeetMinimumVersion	2019-07-10 13:56:19 -04:00
Lang Martin	3bf41211fb	fsm honor new and old style NodeDeregisterRequests	2019-07-10 13:56:19 -04:00
Lang Martin	3fb82e83a5	structs add back NodeDeregisterRequest.NodeID, compatibility	2019-07-10 13:56:19 -04:00
Lang Martin	a4472e3d34	core_sched check ServersMeetMinimumVersion, send old node deregister	2019-07-10 13:56:19 -04:00
Lang Martin	8e53c105fc	state_store just one index update, test deletion	2019-07-10 13:56:19 -04:00
Lang Martin	3e2d1f0338	node_endpoint improve error messages	2019-07-10 13:56:19 -04:00
Lang Martin	5a6a947e98	state_store improve error messages	2019-07-10 13:56:19 -04:00
Lang Martin	fd14cedf95	drainer watch_nodes_test batch of 1	2019-07-10 13:56:19 -04:00
Lang Martin	b176066d42	node_endpoint deregister the batch of nodes	2019-07-10 13:56:19 -04:00
Lang Martin	a97407e030	fsm NodeDeregisterRequest is now a batch	2019-07-10 13:56:19 -04:00
Lang Martin	d5ff2834ca	core_sched batch node deregistration requests	2019-07-10 13:56:19 -04:00
Lang Martin	10848841be	util partitionAll for paging	2019-07-10 13:56:19 -04:00

3149 commits