The call to render the output diff swapped the `diff` and `verbose` bool
parameters, which dropped the diff output in multi-region plans but not
single-region plans.
* ar: support opting into binding host ports to default network IP
* fix config plumbing
* plumb node address into network resource
* struct: only handle network resource upgrade path once
* made api.Scaling.Max a pointer, so we can detect (and complain) when it is neglected
* added checks to HCL parsing that it is present (see the sketch after this list)
* when Scaling.Max is absent/invalid, don't return extraneous error messages during validation
* tweak to multiregion handling to ensure that the count is valid on the interpolated regional jobs
resolves #8355
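A minimal sketch of the group-level `scaling` stanza this affects; the group
name and values here are hypothetical, not taken from the change:

```hcl
group "cache" {
  scaling {
    enabled = true
    min     = 1
    max     = 10   # now required; omitting it fails HCL validation
  }
}
```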
* command/agent/host: collect host data, multi platform
* nomad/structs/structs: new HostDataRequest/Response
* client/agent_endpoint: add RPC endpoint
* command/agent/agent_endpoint: add Host
* api/agent: add the Host endpoint
* nomad/client_agent_endpoint: add Agent Host with forwarding
* nomad/client_agent_endpoint: use findClientConn
This changes forwardMonitorClient and forwardProfileClient to use
findClientConn, which was cribbed from the common parts of those
funcs.
* command/debug: call agent hosts
* command/agent/host: eliminate calling external programs
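As a usage sketch, the new host data should be reachable through the agent
Host endpoint added above (e.g. something like
`curl "$NOMAD_ADDR/v1/agent/host"`), and the debug command gathers it into its
archive alongside the other captures.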
The `nomad volume deregister` command currently returns an error if the volume
has any claims, but in cases where the claims can't be dropped because of
plugin errors, providing a `-force` flag gives the operator an escape hatch.
If the volume has no allocations or if they are all terminal, this flag
deletes the volume from the state store, immediately and implicitly dropping
all claims without further CSI RPCs. Note that this will not also
unmount/detach the volume, which we'll make the responsibility of a separate
`nomad volume detach` command.
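As a usage sketch with a hypothetical volume ID,
`nomad volume deregister -force csi-vol0` would drop the volume and its
remaining claims from the state store even when the plugin can't release them,
provided any allocations that used the volume are terminal.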
* command/debug: build a local archive of debug data
* command/debug: query consul and vault directly
* command/debug: include pprof CPUProfile Trace and goroutine
* command/debug: trap signals and close the monitor requests
Having an active check in the sample job causes issues with testing
deployments in environments that are not integrated with Consul. This
negatively impacts some of the getting-started experiences. Commenting
out the check allows deployments to proceed successfully but leaves it
in the sample job for convenience.
Also made a drive-by fix to all of the URLs in the job file.
Add a scatter-gather for multiregion job plans. Each region's servers
interpolate the plan locally in `Job.Plan` but don't distribute the plan as
done in `Job.Run`.
Note that it's not possible to return a usable modify index from a multiregion
plan for use with `-check-index`. Even if we were to force the modify index to
be the same at the start of `Job.Run`, the index immediately drifts during each
region's deployments, depending on events local to each region. So we omit
this section of a multiregion plan.
Adds a `-global` flag for stopping multiregion jobs in all regions at
once. Warn the user if they attempt to stop a multiregion job in a single
region.
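For example, `nomad job stop -global example` (job name hypothetical) stops the
job in every region at once, while omitting `-global` stops only the local
region's copy and triggers the warning above.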
This PR adds the capability of running Connect Native Tasks on Nomad,
particularly when TLS and ACLs are enabled on Consul.
The `connect` stanza now includes a `native` parameter, which can be
set to the name of the task that backs the Connect Native Consul service.
There is a new Client configuration parameter for the `consul` stanza
called `share_ssl`. Like `allow_unauthenticated`, the default value is
true, but it is recommended to disable it in production environments. When
enabled, the Nomad Client's Consul TLS information is shared with
Connect Native tasks through the normal Consul environment variables.
This does NOT include auth or token information.
If Consul ACLs are enabled, Service Identity Tokens are automatically
injected into the Connect Native task through the CONSUL_HTTP_TOKEN
environment variable.
Any of the automatically set environment variables can be overridden by
the Connect Native task using the `env` stanza.
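A rough sketch of the shapes described above (service name, task name, and
values are hypothetical, not taken from this change):

```hcl
# Jobspec: the connect stanza names the task that backs the native service.
service {
  name = "cn-demo"

  connect {
    native = "generate"
  }
}
```

and on the Nomad Client agent:

```hcl
# Agent configuration: share_ssl defaults to true; disabling it in
# production is the recommendation above.
consul {
  share_ssl = true
}
```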
Fixes #6083
Integration points for multiregion jobs to be registered in the enterprise
version of Nomad:
* hook in `Job.Register` for enterprise to send job to peer regions
* remove monitoring from `nomad job run` and `nomad job stop` for multiregion jobs
This PR switches the Nomad repository from using govendor to Go modules
for managing dependencies. Aspects of the Nomad workflow remain pretty
much the same. The usual Makefile targets should continue to work as
they always did. The API submodule simply defers to the parent Nomad
version in the repository, keeping the semantics of API versioning that
currently exist.
* changes necessary to support oss licensing shims
revert nomad fmt changes
update test to work with enterprise changes
update tests to work with new ent enforcements
make check
update cas test to use scheduler algorithm
back out preemption changes
add comments
* remove unused method
We have been using fatih/hclfmt, which has long been abandoned. Instead, switch
to HashiCorp's own hclfmt implementation. There are some trivial changes in
behavior around whitespace.
Allow `/v1/jobs?all_namespaces=true` to list all jobs across all
namespaces. Each job in the returned list contains a `Namespace` field
indicating the job's namespace.
If ACL is enabled, the request token needs to be a management token or
have `namespace:list-jobs` capability on all existing namespaces.
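As a usage sketch, something like
`curl "$NOMAD_ADDR/v1/jobs?all_namespaces=true"` should return the combined
list, with each job entry carrying its `Namespace`.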
* jobspec, api: add stop_after_client_disconnect (see the jobspec sketch after this list)
* nomad/state/state_store: error message typo
* structs: alloc methods to support stop_after_client_disconnect
1. a global AllocStates to track status changes with timestamps. We
need this to track the time at which the alloc became lost
originally.
2. ShouldClientStop() and WaitClientStop() to actually do the math
* scheduler/reconcile_util: delayByStopAfterClientDisconnect
* scheduler/reconcile: use delayByStopAfterClientDisconnect
* scheduler/util: updateNonTerminalAllocsToLost comments
This was set up to only update allocs to lost if the DesiredStatus had
already been set by the scheduler. It seems like the intention was to
update the status from any non-terminal state, as not all lost allocs
have been marked stop or evict by this point.
* scheduler/testing: AssertEvalStatus just use require
* scheduler/generic_sched: don't create a blocked eval if delayed
* scheduler/generic_sched_test: several scheduling cases
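A rough jobspec sketch of the group-level setting referenced at the top of
this list (group name and duration are hypothetical):

```hcl
group "web" {
  # If the client running this group's allocs has been disconnected for
  # longer than this duration, stop the allocs rather than leaving them
  # running indefinitely.
  stop_after_client_disconnect = "6h"
}
```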
CSI plugins can require credentials for some publishing and
unpublishing workflow RPCs. Secrets are configured at the time of
volume registration, stored in the volume struct, and then passed
around as an opaque map by Nomad to the plugins.
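A hypothetical volume registration file showing where such secrets would be
configured (all IDs and values are made up):

```hcl
id          = "csi-vol0"
name        = "database"
type        = "csi"
plugin_id   = "example-plugin"
external_id = "vol-0123abcd"

secrets {
  # Opaque key/value pairs passed through to the plugin's publish and
  # unpublish RPCs.
  username = "admin"
  password = "supersecret"
}
```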
This changeset implements a periodic garbage collection of CSI volumes
with missing allocations. This can happen in a scenario where a node
update fails partially and the allocation updates are written to raft
but the evaluations to GC the volumes are dropped. This feature will
cover this edge case and ensure that upgrades from 0.11.0 and 0.11.1
get any stray claims cleaned up.
Ensure that an empty (`""`) Scheduler Algorithm gets explicitly set to binpack
on upgrades or in API handling when the user omits the value.
The scheduler already treats the `""` value as binpack. This PR merely
ensures that the operator API returns the effective value.
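For example, reading the scheduler configuration through the operator API
should now report `binpack` even when the value was never set explicitly,
rather than an empty string.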
This changeset implements a periodic garbage collection of unused CSI
plugins. Plugins are self-cleaning when the last allocation for a
plugin is stopped, but this feature will cover any missing edge cases
and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins
cleaned up.
Failed requests due to API client errors are to be marked as DEBUG.
The Error log level should be reserved for signaling problems with the
cluster that are actionable by nomad system operators. Logs due to
misbehaving API clients don't represent a system-level problem and seem
spurious to nomad maintainers at best. These log messages can also be
attack vectors for denial-of-service attacks by filling servers' disk
space with spurious log messages.
Pipe the http server log to hclog, so that it uses the same logging format
as the rest of nomad logs. Also, support emitting them as json logs when
json formatting is set.
The http server logs are emitted at Trace level, as they typically
represent HTTP client errors (e.g. failed tls handshakes, invalid headers,
etc).
Panic logs, though, represent server errors and are relayed at Error
level.
Shut down the http server last, after nomad client/server components
terminate.
Before this change, if the agent is taking an unexpectedly long time to
shut down, the operator cannot query the http server directly: they
cannot access agent-specific http endpoints and need to query another
agent about the troublesome agent.
Unexpectedly long shutdowns can happen in normal cases, e.g. a client
might hang if one of the allocs it is running has a long
shutdown_delay.
Here, we switch to ensuring that the http server is shut down last.
I believe this doesn't require extra care in the agent shutdown logic,
even though operators may be able to submit write http requests. We already
need to cope with operators submitting these http requests to another
agent, or with servers updating the client allocations.