* Throw away the result of multierror.Append
When given a *multierror.Error, it is mutated in place, so the return
value is not needed (see the sketch after this list).
* Simplify MergeMultierrorWarnings, use strings.Builder
* Hash.Write() never returns an error
* Remove error that was always nil
* Remove error from Resources.Add signature
When this was originally written it could return an error, but that was
refactored away, and no callers today handle the error.
* Throw away results of io.Copy during Bridge
* Handle errors when computing node class in test
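For the multierror item above, a minimal sketch of the pattern, assuming the hashicorp/go-multierror API:

```go
package main

import (
	"errors"
	"fmt"

	multierror "github.com/hashicorp/go-multierror"
)

func main() {
	var mErr multierror.Error
	// Append mutates mErr in place when its first argument is a
	// *multierror.Error, so the returned value can be thrown away.
	_ = multierror.Append(&mErr, errors.New("first failure"))
	_ = multierror.Append(&mErr, errors.New("second failure"))
	fmt.Println(mErr.ErrorOrNil()) // 2 errors occurred: ...
}
```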
This removes modification of ops in methods that UpdateWorkload calls, keeping
them local to UpdateWorkload. It also includes some rewrites of checkRegs for
clarity.
Submitting a job with an ingress gateway in host networking mode
with an absent gateway.proxy block would cause the Nomad client
to panic with a nil pointer dereference.
The Consul registration bits assumed the proxy stanza was
not nil, but it can be nil if the user does not supply any manually
configured Envoy proxy settings.
Check that the proxy field is not nil before using it, as sketched below.
Fixes #9669
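A minimal sketch of the guard, using stand-in types whose shapes are assumed from nomad/structs:

```go
package main

import "fmt"

// stand-ins for the nomad/structs shapes involved (assumed)
type ConsulProxy struct{ Config map[string]interface{} }
type ConsulGateway struct{ Proxy *ConsulProxy }

// gatewayProxyConfig guards against the optional proxy block being absent.
func gatewayProxyConfig(g *ConsulGateway) map[string]interface{} {
	if g == nil || g.Proxy == nil {
		return nil // no manually configured Envoy proxy settings
	}
	return g.Proxy.Config
}

func main() {
	fmt.Println(gatewayProxyConfig(&ConsulGateway{})) // map[], no panic
}
```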
https://github.com/hashicorp/nomad/pull/9608 introduced the use of the
built-in HTTP 429 response handler provided by go-connlimit. There is
concern, though, about plausible DoS attacks that needs to be addressed,
so this PR reverts that functionality.
It keeps a fix in the tests around the use of an HTTPS-enabled client
for when the server is listening on HTTPS. Previously, the tests would
fail deterministically with io.EOF because that's how the TLS server
terminates invalid connections.
Now, the result is much less deterministic. The state of the client
connection and the server socket depends on when the connection is
closed and how far along the handshake was.
In a few places Nomad was using flag implementations directly
from Consul, adding to Nomad's need to import Consul. Replace
those uses with helpers already in Nomad, and copy over the bare
minimum needed to make the autopilot flags behave as they have.
This is essentially a port of Consul's similar fix.
Changes are:
* `go get -u github.com/hashicorp/go-connlimit`
* `go mod vendor`
* use the new HTTP 429 handler
20d1ea7d2d
* debug: refactor nodeclass test
* debug: add case to track down SIGSEGV on client to server Agent.Host RPC
* verify server to avoid panic on AgentHostRequest RPC call, fixes GH-9546
* simplify Agent.Host RPC lookup logic
* debug: add pprof duration CLI argument
* debug: add CSI plugin details
* update help text with ACL requirements
* debug: provide ACL hints upon permission failures
* debug: only write file when pprof retrieve is successful
* debug: add helper function to clean bad characters from dynamic filenames
* debug: ensure files are unable to escape the capture directory (see the sketch after this list)
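A minimal sketch of those two guards, with hypothetical helper names:

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

var invalidChars = regexp.MustCompile(`[^a-zA-Z0-9_.-]`)

// cleanFilename replaces characters that are unsafe in a dynamic filename.
func cleanFilename(name string) string {
	return invalidChars.ReplaceAllString(name, "_")
}

// escapesCaptureDir reports whether joining path under dir escapes dir.
func escapesCaptureDir(dir, path string) bool {
	rel, err := filepath.Rel(dir, filepath.Join(dir, path))
	return err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	fmt.Println(cleanFilename("node/../etc:passwd"))                 // node_.._etc_passwd
	fmt.Println(escapesCaptureDir("/tmp/debug", "../../etc/shadow")) // true
}
```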
* upsertaclpolicies
* delete acl policies msgtype
* upsert acl policies msgtype
* delete acl tokens msgtype
* acl bootstrap msgtype
wip unsubscribe on token delete
test that subscriptions are closed after an ACL token has been deleted
Start writing policyupdated test
* update test to use before/after policy
* add SubscribeWithACLCheck to run acl checks on subscribe
* update rpc endpoint to use broker acl check
* Add and use subscriptions.closeSubscriptionFunc
This fixes the issue of not being able to defer unlocking the mutex on
the event broker in the for loop (see the sketch after this list).
handle acl policy updates
* rpc endpoint test for terminating acl change
* add comments
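A minimal sketch of the defer-in-a-loop problem this solves, with hypothetical names:

```go
package main

import "sync"

type subscription struct{ closed bool }

func (s *subscription) forceClose() { s.closed = true }

type eventBroker struct {
	mu   sync.Mutex
	subs []*subscription
}

// closeSubscription scopes the deferred Unlock to a single call; deferring
// inside a for loop would delay every Unlock until the enclosing function
// returned, deadlocking the second iteration.
func (b *eventBroker) closeSubscription(sub *subscription) {
	b.mu.Lock()
	defer b.mu.Unlock()
	sub.forceClose()
}

func main() {
	b := &eventBroker{subs: []*subscription{{}, {}}}
	for _, sub := range b.subs {
		b.closeSubscription(sub) // each iteration locks and unlocks cleanly
	}
}
```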
Co-authored-by: Kris Hicks <khicks@hashicorp.com>
* Remove Managed Sinks from Nomad
Managed Sinks were a beta feature in Nomad 1.0-beta2. During the beta
period it was determined that this was not a scalable approach to
support community and third party sinks.
* update comment
* changelog
Before, upstreams could only be defined using the default datacenter.
Now, the `datacenter` field can be set in a connect upstream definition,
informing Consul of the desire for an instance of the upstream service
in the specified datacenter. The field is optional and continues to
default to the local datacenter.
Closes #8964
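A minimal sketch through the Go API package, assuming the api.ConsulUpstream field names introduced by this change:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/nomad/api"
)

func main() {
	// request an instance of "db" from dc2 rather than the local datacenter
	up := api.ConsulUpstream{
		DestinationName: "db",
		LocalBindPort:   8080,
		Datacenter:      "dc2", // optional; defaults to the local datacenter
	}
	fmt.Printf("%+v\n", up)
}
```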
The API is missing values for `ReadAllocs` and `WriteAllocs` fields, resulting
in allocation claims not being populated in the web UI. These fields mirror
the fields in `nomad/structs.CSIVolume`. Returning a separate list of stubs
for read and write would be ideal, but this can't be done without either
bloating the API response with repeated full `Allocation` data, or causing a
panic in previous versions of the CLI.
The `nomad/structs` fields are persisted with nil values and are populated
during RPC, so we'll do the same in the HTTP API and populate the `ReadAllocs`
and `WriteAllocs` fields with a map of allocation IDs, but with null
values. The web UI will then create its `ReadAllocations` and
`WriteAllocations` fields by mapping from those IDs to the values in
`Allocations`, instead of flattening the map into a list.
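A minimal sketch of the approach, using stand-in types whose shapes are assumed from nomad/api:

```go
package main

import "fmt"

// Allocation stands in for api.Allocation.
type Allocation struct{ ID string }

// CSIVolume mirrors the two fields discussed above (shape assumed).
type CSIVolume struct {
	ReadAllocs  map[string]*Allocation
	WriteAllocs map[string]*Allocation
}

// populateClaims maps claimed alloc IDs to null values, so the web UI can
// join IDs against the full Allocations list without a bloated payload.
func populateClaims(vol *CSIVolume, readIDs, writeIDs []string) {
	vol.ReadAllocs = make(map[string]*Allocation, len(readIDs))
	for _, id := range readIDs {
		vol.ReadAllocs[id] = nil
	}
	vol.WriteAllocs = make(map[string]*Allocation, len(writeIDs))
	for _, id := range writeIDs {
		vol.WriteAllocs[id] = nil
	}
}

func main() {
	vol := &CSIVolume{}
	populateClaims(vol, []string{"alloc-1"}, []string{"alloc-2"})
	fmt.Println(vol.ReadAllocs, vol.WriteAllocs)
}
```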
* `nomad operator keyring` was missing the general options section
* `nomad operator metrics` was missing a page in the docs entirely
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
The `nomad alloc logs` command does not remove terminal escape sequences for
color from the log outputs of a task. Clarify that the standard `-no-color`
flag, which does apply to Nomad's error responses from `nomad alloc logs`,
does not apply to the log output.
The `nomad alloc status -verbose` command returns a 404 for CSI volumes
because the volume mount block in the task points back to the
`job.group.volume` block. The `Name` field there is the name as seen in
the jobspec, not the ID of the volume that we need for querying.
Show both the job-specific name and the volume ID in the resulting output,
which clarifies the difference between the two fields and is more consistent
with the web UI.
This PR adds the ability to set HTTP headers when downloading
an artifact from an `http` or `https` resource.
The implementation in `go-getter` is such that a new `HTTPGetter`
must be created for each artifact that sets headers (as opposed
to conveniently setting headers per-request). This PR maintains
the memoization of the default Getter objects, creating new ones
only for artifacts where headers are set.
Closes #9306
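A minimal sketch of constructing a per-artifact getter, assuming go-getter's HttpGetter exposes a Header field as in recent versions:

```go
package main

import (
	"net/http"

	getter "github.com/hashicorp/go-getter"
)

func main() {
	// a fresh HttpGetter per artifact that sets headers, since headers
	// live on the getter rather than on each request
	hg := &getter.HttpGetter{
		Netrc:  true,
		Header: http.Header{"Authorization": []string{"Bearer s.XYZ"}}, // hypothetical token
	}
	client := &getter.Client{
		Src:     "https://example.com/artifact.tgz",
		Dst:     "local/artifact.tgz",
		Mode:    getter.ClientModeFile,
		Getters: map[string]getter.Getter{"http": hg, "https": hg},
	}
	if err := client.Get(); err != nil {
		panic(err)
	}
}
```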
* operator debug - add client node filtering arguments
* add WaitForClient helper function
* use RPC in WaitForClient to avoid unnecessary imports
* guard against nil values
* move initialization up and shorten test duration
* cleanup nodeLookupFailCount logic
* only display max node notice if we actually tried to capture nodes
Previously, connect sidecars would be re-registered with Consul every cycle
of Nomad's reconciliation loop around Consul service registrations. This is
because part of the comparison used `reflect.DeepEqual` on []string values,
which returns false when one value is `[]string{}` and the other is `[]string(nil)`.
Unfortunately, this was always the case, so every Connect sidecar service
was re-registered on every iteration, which happens every 30 seconds.
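The underlying gotcha in a minimal, runnable form:

```go
package main

import (
	"fmt"
	"reflect"
)

func main() {
	empty := []string{} // non-nil, zero length
	var unset []string  // nil, zero length

	fmt.Println(reflect.DeepEqual(empty, unset)) // false: nil and empty differ
	fmt.Println(len(empty) == len(unset))        // true: compare contents instead
}
```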
The unpublish workflow requires that we know the mode (RW vs RO) if we want to
unpublish the node. Update the hook and the Unpublish RPC so that we mark the
claim for release in a new state but leave the mode alone. This fixes a bug
where RO claims were failing node unpublish.
The core job GC doesn't know the mode, but we don't need it for that workflow,
so add a mode specifically for GC; the volumewatcher uses this as a sentinel
to check whether claims (with their specific RW vs RO modes) still need to be
released.
The only user of monitor(evalID, true) was command/eval_status, and
eval_status had a duplicate of the prefix-handling code inside it, so in
all cases the complete evalID was being passed to monitor.
Given that, we can remove the prefix code from command/monitor, and
remove the boolean arg.
state store: call-out to generic update of job recommendations from job update method
recommendations API work, and http endpoint errors for OSS
support for scaling policies in task block of job spec
add query filters for ScalingPolicy list endpoint
command: nomad scaling policy list: added -job and -type
* Process to send events to configured sinks
This PR adds a SinkManager to the server, which is responsible for managing
managed sinks. Managed sinks subscribe to the event broker and send
events to a sink writer (webhook). When changes to the event store are
made, the SinkManager and managed sink are responsible for reloading or
starting a new managed sink.
* periodically check in sink progress to raft
Save progress on the last successfully sent index to raft. This allows a
managed sink to resume close to where it left off in the event of a lost
server or leadership change
dereference the event sink so we can accurately use the watch channel
When using a pointer to the event sink struct, it was updated in place immediately, so our reload logic would never trigger
* network sink rpc/api plumbing
state store methods and restore
upsert sink test
get sink
delete sink
event sink list and tests
go generate new msg types
validate sink on upsert
* go generate
* remove event durability
temporarily removing go-memdb event durability until a new strategy is developed on how to best handle increased durability needs
* drop events table schema and state store methods
* fix neweventbuffer invocations
* use msgtype in upsert node
adds message type to signature for upsert node, update tests, remove placeholder method
* UpsertAllocs msg type test setup
* use upsertallocs with msg type in signature
update test usage of delete node
delete placeholder msgtype method
* add msgtype to upsert evals signature, update test call sites with test setup msg type
handle snapshot upsert eval outside of FSM and ignore eval event
remove placeholder upsertevalsmsgtype
handle job plan rpc and prevent event creation for plan
msgtype cleanup upsertnodeevents
updatenodedrain msgtype
msg type 0 is a node registration event, so set the default to the ignore type (see the sketch after this list)
* fix named import
* fix signature ordering on upsertnode to match
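A sketch of why the zero value is dangerous, with constant names as in nomad/structs (values assumed):

```go
package main

import "fmt"

type MessageType uint8

const (
	// NodeRegisterRequestType is the first raft message type, so the zero
	// value of MessageType is indistinguishable from a real registration.
	NodeRegisterRequestType MessageType = 0

	// IgnoreUnknownTypeFlag is what "no event" paths such as snapshot
	// restores pass explicitly, instead of relying on the default.
	IgnoreUnknownTypeFlag MessageType = 128
)

func main() {
	var unset MessageType
	fmt.Println(unset == NodeRegisterRequestType) // true: the dangerous default
}
```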
* consul: advertise cni and multi host interface addresses
* structs: add service/check address_mode validation
* ar/groupservices: fetch networkstatus at hook runtime
* ar/groupservice: nil check network status getter before calling
* consul: comment network status can be nil
* add goroutine text profiles to nomad operator debug
* add server-id=all to nomad operator debug
* fix bug from changing metrics from string to []byte
* Add function to return MetricsSummary struct, metrics gotemplate support
* fix bug resolving 'server-id=all' when no servers are available
* add url to operator_debug tests
* removed test section which is used for future operator_debug.go changes
* separate metrics from operator, use only structs from go-metrics
* ensure parent directories are created as needed
* add suggested comments for text debug pprof
* move check down to where it is used
* add WaitForFiles helper function to wait for multiple files to exist
* compact metrics check
Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
* fix github's silly apply suggestion
Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
properly wire up durable event count
move newline responsibility
moves newline creation from NDJSON to the HTTP handler; the JSON stream only encodes and sends now
ignore snapshot restore if broker is disabled
enable dev mode to access event stream without ACL
use mapping instead of switch
use pointers for config sizes, remove unused ttl, simplify closed conn logic
Fixes #9017
The ?resources=true query parameter includes resources in the object
stub listings. Specifically:
- For `/v1/nodes?resources=true` both the `NodeResources` and
`ReservedResources` field are included.
- For `/v1/allocations?resources=true` the `AllocatedResources` field is
included.
The ?task_states=false query parameter removes TaskStates from
/v1/allocations responses. (By default TaskStates are included.)
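A hedged usage sketch against a local agent (address assumed):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// include NodeResources and ReservedResources in the node stubs
	resp, err := http.Get("http://127.0.0.1:4646/v1/nodes?resources=true")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))

	// trim TaskStates out of the allocation stubs
	resp, err = http.Get("http://127.0.0.1:4646/v1/allocations?task_states=false")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
}
```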
* Node Drain events and Node Events (#8980)
Deployment status updates
handle deployment status updates (paused, failed, resume)
deployment alloc health
generate events from apply plan result
txn err check, slim down deployment event
one ndjson line per index
* consolidate down to node event + type
* fix UpdateDeploymentAllocHealth test invocations
* fix test
This commit adds a /v1/events/stream endpoint to stream events from.
The stream framer has been updated to include a SendFull method which
does not fragment the data between multiple frames. This essentially
treats the stream framer as an envelope to adhere to the stream framer
interface in the UI.
If the `encode` query parameter is omitted, events will be streamed as
newline-delimited JSON.
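A hedged consumer sketch, using the endpoint path described above and an assumed local agent address:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// with no ?encode parameter, each line is one JSON-encoded event batch
	resp, err := http.Get("http://127.0.0.1:4646/v1/events/stream")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var batch map[string]interface{}
		if err := json.Unmarshal(scanner.Bytes(), &batch); err != nil {
			continue // skip heartbeats and empty lines
		}
		fmt.Println(batch["Index"], batch["Events"])
	}
}
```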
As newer versions of Consul are released, the minimum version of Envoy
it supports as a sidecar proxy also gets bumped. Starting with the upcoming
Consul v1.9.X series, Envoy v1.11.X will no longer be supported. Current
versions of Nomad hardcode a version of Envoy v1.11.2 to be used as the
default implementation of Connect sidecar proxy.
This PR introduces a change such that each Nomad Client will query its
local Consul for a list of Envoy proxies that it supports (https://github.com/hashicorp/consul/pull/8545)
and then launch the Connect sidecar proxy task using the latest supported version
of Envoy. If the `SupportedProxies` API component is not available from
Consul, Nomad will fall back to the old version of Envoy supported by old
versions of Consul.
Setting the meta configuration option `meta.connect.sidecar_image` or
setting the `connect.sidecar_task` stanza will take precedence as is
the current behavior for sidecar proxies.
Setting the meta configuration option `meta.connect.gateway_image`
will take precedence as is the current behavior for connect gateways.
`meta.connect.sidecar_image` and `meta.connect.gateway_image` may make
use of the special `${NOMAD_envoy_version}` variable interpolation, which
resolves to the newest version of Envoy supported by the Consul agent.
Addresses #8585 #7665
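A hypothetical sketch of the selection logic described above (names and ordering assumed):

```go
package main

import "fmt"

// pickEnvoyVersion prefers the newest Envoy version the local Consul reports
// supporting, falling back to the legacy hardcoded default for Consuls that
// predate the SupportedProxies API.
func pickEnvoyVersion(supportedProxies map[string][]string) string {
	const legacyDefault = "1.11.2"
	versions := supportedProxies["envoy"]
	if len(versions) == 0 {
		return legacyDefault
	}
	return versions[0] // assuming newest-first ordering in Consul's response
}

func main() {
	fmt.Println(pickEnvoyVersion(map[string][]string{"envoy": {"1.16.0", "1.15.2"}})) // 1.16.0
	fmt.Println(pickEnvoyVersion(nil))                                                // 1.11.2
}
```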
When we try to prefix match the `nomad volume detach` node ID argument, the
node may have been already GC'd. The volume unpublish workflow gracefully
handles this case so that we can free the claim. So make a best effort to find
a node ID among the volume's claimed allocations, or otherwise just use the
node ID we've been given by the user as-is.
Fixes a bug where the `nomad volume detach` command would not accept a node ID
prefix instead of a full node ID. The volume ID is already prefix matched
server-side.
When deregistering a service, Consul also deregisters the associated
checks. The current state keeps track of all services and all checks
separately and deregisters them in sequence, which leads, whether during
syncs or shutdowns, to check deregistrations happening twice and failing
the second time (generating errors in logs).
This fix includes:
- a fix to the sync logic that just pulls the checks *after* the
services have been synced
- a fix to the shutdown mechanism that gets an updated list of checks
after deregistering the services, so that we get a cleaner check
deregistration process (sketched below)
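A minimal sketch of that shutdown ordering against the consul/api client (helper name hypothetical):

```go
package main

import capi "github.com/hashicorp/consul/api"

// deregisterAll removes services first (which also removes their checks
// server-side), then re-fetches checks so only ones Consul still knows
// about are deregistered, avoiding the duplicate-deregistration errors.
func deregisterAll(agent *capi.Agent, serviceIDs, checkIDs []string) error {
	for _, id := range serviceIDs {
		if err := agent.ServiceDeregister(id); err != nil {
			return err
		}
	}
	remaining, err := agent.Checks()
	if err != nil {
		return err
	}
	for _, id := range checkIDs {
		if _, ok := remaining[id]; ok {
			if err := agent.CheckDeregister(id); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	client, err := capi.NewClient(capi.DefaultConfig())
	if err != nil {
		panic(err)
	}
	_ = deregisterAll(client.Agent(), nil, nil)
}
```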
The key generated from this command is used for gossip encryption, which utilizes AES-GCM. A 16-byte key enables AES-128, while a 32-byte key enables AES-256.
The underlying memberlist library supports the larger key size, which is ultimately preferable from a security standpoint. Consul also uses 32 bytes by default: 1a14b94441
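A minimal sketch of generating such a key, mirroring what `nomad operator keygen` does (implementation details assumed):

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

func main() {
	key := make([]byte, 32) // 32 bytes selects AES-256 in memberlist
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	fmt.Println(base64.StdEncoding.EncodeToString(key))
}
```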
Since CPU resources are usually a soft limit, it is desirable to allow
setting them as low as possible so that tasks can run only in "idle" time.
Setting the value to 0 is still not allowed, to avoid potential unintended
side effects of a zero value. While there may not be any side
effects, this commit attempts to minimize risk by avoiding the issue.
This does *not* change the defaults.
Volume requests can be either CSI or host volumes, so when displaying the CSI
volume info for `nomad node status -verbose` we need to filter out the host
volumes.
The initial implementation of global job stop for MRD looped over all the
regions in the CLI for expedience. This changeset includes the OSS parts of
moving this into the RPC layer so that API consumers don't have to implement
this logic themselves.
This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza.
```hcl
service {
connect {
gateway {
proxy {
// envoy proxy configuration
}
ingress {
// ingress-gateway configuration entry
}
}
}
}
```
A gateway can be run in `bridge` or `host` networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value.
Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver.
Aims to address #8294 and tangentially #8647
The `regionForJob` function in the HTTP job endpoint overrides the region for
multiregion jobs to `global`, which is used as a sentinel value in the
server's job endpoint to avoid re-registration loops. This changeset removes
an extraneous check that results in errors in the web UI and makes
round-tripping through the HTTP API cumbersome for all consumers.
* command/debug: print interval data so the operator knows it's waiting
* command/debug: use the Consul/Vault env for queries
* command/debug: capture the operator endpoints
* command/debug: capture API errors in the archive bundle
The soundness guarantees of the CSI specification leave a little to be desired
in our ability to provide a 100% reliable automated solution for managing
volumes. This changeset provides a new command to bridge this gap by providing
the operator the ability to intervene.
The command doesn't take an allocation ID so that the operator doesn't have to
keep track of alloc IDs that may have been GC'd. Handle this case in the
unpublish RPC by sending the client RPC for all the terminal/nil allocs on the
selected node.
This change adds the ability to set the fields `success_before_passing` and
`failures_before_critical` on Consul service check definitions. This is a
feature available in Consul v1.7.0 and later.
https://www.consul.io/docs/agent/checks#success-failures-before-passing-critical
Nomad doesn't do much besides pass the fields through to Consul.
Fixes #6913
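A minimal sketch through the Go API package, assuming the api.ServiceCheck field names added by this change:

```go
package main

import (
	"fmt"
	"time"

	"github.com/hashicorp/nomad/api"
)

func main() {
	check := api.ServiceCheck{
		Name:                   "api-health",
		Type:                   "http",
		Path:                   "/health",
		Interval:               10 * time.Second,
		Timeout:                2 * time.Second,
		SuccessBeforePassing:   3, // passed through to Consul (v1.7.0+)
		FailuresBeforeCritical: 2, // likewise
	}
	fmt.Printf("%+v\n", check)
}
```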
adds in oss components to support enterprise multi-vault namespace feature
upgrade specific doc on vault multi-namespaces
vault docs
update test to reflect new error
If the node API returns an attached volume that doesn't belong to an
alloc (because it's failed to clean up properly), `nomad node status`
will panic when rendering the response.
Also, avoid empty volumes output in node status
Deployments should wait until kicked off by `Job.Register` so that we can
assert that all regions have a scheduled deployment before starting any
region. This changeset includes the OSS fixes to support the ENT work.
`IsMultiregionStarter` has no more callers in OSS, so remove it here.
The call to render the output diff swapped the `diff` and `verbose` bool
parameters, resulting in dropping the diff output in multi-region plans but
not single-region plans.
* ar: support opting into binding host ports to default network IP
* fix config plumbing
* plumb node address into network resource
* struct: only handle network resource upgrade path once
* made api.Scaling.Max a pointer, so we can detect (and complain) when it is neglected (see the sketch after this list)
* added checks to HCL parsing that it is present
* when Scaling.Max is absent/invalid, don't return extraneous error messages during validation
* tweak to multiregion handling to ensure that the count is valid on the interpolated regional jobs
resolves #8355
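A minimal sketch of the pointer-for-presence pattern, using a stand-in for the api struct:

```go
package main

import (
	"errors"
	"fmt"
)

// Scaling stands in for the api struct discussed above.
type Scaling struct {
	Min *int64
	Max *int64 // pointer so "omitted" is distinguishable from an explicit 0
}

func validate(s *Scaling) error {
	if s.Max == nil {
		return errors.New("scaling policy must specify max")
	}
	if s.Min != nil && *s.Min > *s.Max {
		return errors.New("scaling policy min may not exceed max")
	}
	return nil
}

func main() {
	max := int64(5)
	fmt.Println(validate(&Scaling{}))          // max missing: error
	fmt.Println(validate(&Scaling{Max: &max})) // <nil>
}
```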