open-nomad

Commit Graph

Author	SHA1	Message	Date
Sander Mol	42b338308f	add go-sockaddr templating support to nomad consul address (#12084 )	2022-02-24 09:34:54 -05:00
Seth Hoenig	de95998faa	core: switch to go.etc.io/bbolt This PR swaps the underlying BoltDB implementation from boltdb/bolt to go.etc.io/bbolt. In addition, the Server has a new configuration option for disabling NoFreelistSync on the underlying database. Freelist option: https://github.com/etcd-io/bbolt/blob/master/db.go#L81 Consul equivelent PR: https://github.com/hashicorp/consul/pull/11720	2022-02-23 14:26:41 -06:00
Michael Schurter	7494a0c4fd	core: remove all traces of unused protocol version Nomad inherited protocol version numbering configuration from Consul and Serf, but unlike those projects Nomad has never used it. Nomad's `protocol_version` has always been `1`. While the code is effectively unused and therefore poses no runtime risks to leave, I felt like removing it was best because: 1. Nomad's RPC subsystem has been able to evolve extensively without needing to increment the version number. 2. Nomad's HTTP API has evolved extensively without increment `API{Major,Minor}Version`. If we want to version the HTTP API in the future, I doubt this is the mechanism we would choose. 3. The presence of the `server.protocol_version` configuration parameter is confusing since `server.raft_protocol` is an important parameter for operators to consider. Even more confusing is that there is a distinct Serf protocol version which is included in `nomad server members` output under the heading `Protocol`. `raft_protocol` is the only protocol version relevant to Nomad developers and operators. The other protocol versions are either deadcode or have never changed (Serf). 4. If we were to need to version the RPC, HTTP API, or Serf protocols, I don't think these configuration parameters and variables are the best choice. If we come to that point we should choose a versioning scheme based on the use case and modern best practices -- not this 6+ year old dead code.	2022-02-18 16:12:36 -08:00
Luiz Aoqui	de91954582	initial base work for implementing sorting and filter across API endpoints (#12076 )	2022-02-16 14:34:36 -05:00
Luiz Aoqui	110dbeeb9d	Add `go-bexpr` filters to evals and deployment list endpoints (#12034 )	2022-02-16 11:40:30 -05:00
Seth Hoenig	ac3cd73d00	Merge pull request #12054 from hashicorp/b-creation-indexes api: return sorted results in certain list endpoints	2022-02-15 15:08:38 -06:00
Seth Hoenig	40c714a681	api: return sorted results in certain list endpoints These API endpoints now return results in chronological order. They can return results in reverse chronological order by setting the query parameter ascending=true. - Eval.List - Deployment.List	2022-02-15 13:48:28 -06:00
Alex Holyoake	3071c7d91b	config: merge ReservableCores in clientConfig (#12044 )	2022-02-15 08:36:37 -05:00
Luiz Aoqui	3bf6036487	Version 1.2.6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJiBIXqAAoJELC0QQl2hbZ2M8cP/A7LENJbFSph25M1aGItra5j BphSX//Sq/v9ZzO44rOGNYQGfTpFT8STJgj2GC50qR/ilF4KX4D0oZlDyu/6D0NG ouN9RUjnFd6IEDQrjqqqhr3F69Z95SWVfi1rfgn/pIgOYkVEXfi6DXaulVVyd2ZT J0G5w5ryl5d8PhuL7TWw4zbhZRQn0hVspZv/1s3/I9aG6Sew8SMweeOxbN9lBr7E H19Amdjh6ugRuPgU7YMpKDVrZQRv9Wt7BUP/uc0u3LiW9z3Ko8ZKnCRKErtL5Kc3 HDZsWe+t3va4Uekzd0HULNcYU4kwjogdRYRzX5kRsOyXelrZkQIqYFiKrk1wVbq/ cYM5DUak6eUQBGhgi3UY0fklBFq4GDGpiwEzn7rvQb0PRSuVyykgbZ12fzyIu8dp tWbR/WOEg9F+jva6HkR2kDIcr5mDmny3Pxi5aUT6lMk1111nCzOjDzhLkQVtfsex FDMByXxM4oWAK3ouq2OIdxDL2c742A2933C4/30KWE7Xy7twsvkGw52irw66VO3V 4PHP880cDvEDaEh15mY/8FlaAE7t/gsCUuYLxGwl33TaXSRBLc9vVNrrp89q53TD ZcvXTBpHUOWa6ZlHF/4f8LW44rowM6bU0Wili7NaWOKx86dnUJMG4sqJifNgcpS/ 7lXogv98CYLbMy4X4if0 =NY1Z -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEElFaq1Z5DKdB91i+lKfRZwNnLtXMFAmIFbbkACgkQKfRZwNnL tXOr/g/+N2ZBMK8ohEvtdXLl7WXrVhgJfUSVbdD5Kfshul9CPn3yWRxJzqtEN2Pf 55ozeWLpoziP9y9LviJ7rDidXcTmDFutbFdGJ3L+ZLdLILsNOq1A+lbuwO3fJngZ 5aiPoJLsw4sqj6uHaM6Cls2f145O92nT7GXEHCxuvGHeSf3NkcR+zRY5nPrLTIrA uxYefCOzP6C2I+W7dL4Oj5R5EZd4UDi1WiL8pGzwm24LcagZN2ctctolAeF9OlJX M58UUv9b4GObe617u8MeH0LIlyZiNwn9JqrV33dKVTyrkBIYfYxkzdzMKf1csVYk kQb13KPdPTASBAGTl+sxeXXnw/bg09JXGcvREX5lLyQqY8xGwTv2FpTmybKWLiss Bg6BbejrgtCPBik0EAHWV0+kVzhi9bPfUYwTXLDCzMtrbyCyPoWchruel2sm41U1 ezRDzlSvf6nrXf7sAv6umJICck4Bc5Gol+8W7fxvWqnY9rQ3ds2v7E5lXZMBbOmE JSi+EDWBJjBAXehE6pLxeVsvlHMRWN007Z2UeD4neGIgG7xFJLq6nKeUKoiNIpgk hKBL8iwHyuJfrBB/dcPzI9NV+jL6OZ/oI1RWxSj0MX/B4VXZp8HrqZA5JxzQolUg KIxqe4iX3WIkQv+UU4WiELvs4O7fujB4KWz3iQokhwDxqGUpffk= =5EG2 -----END PGP SIGNATURE----- Merge tag 'v1.2.6' into merge-release-1.2.6-branch Version 1.2.6	2022-02-10 14:55:34 -05:00
Luiz Aoqui	15f9d54dea	api: prevent excessice CPU load on job parse Add new namespace ACL requirement for the /v1/jobs/parse endpoint and return early if HCLv2 parsing fails. The endpoint now requires the new `parse-job` ACL capability or `submit-job`.	2022-02-09 19:51:47 -05:00
Thomas Lefebvre	3b57f3af9d	Add config command and config validate subcommand to nomad CLI (#9198 )	2022-02-08 16:52:35 -05:00
Tim Gross	7ad15b2b42	raft: default to protocol v3 (#11572 ) Many of Nomad's Autopilot features require raft protocol version 3. Set the default raft protocol to 3, and improve the upgrade documentation.	2022-02-03 15:03:12 -05:00
Seth Hoenig	db2347a86c	cleanup: prevent leaks from time.After This PR replaces use of time.After with a safe helper function that creates a time.Timer to use instead. The new function returns both a time.Timer and a Stop function that the caller must handle. Unlike time.NewTimer, the helper function does not panic if the duration set is <= 0.	2022-02-02 14:32:26 -06:00
Derek Strickland	460416e787	Update IsEmpty to check for pre-1.2.4 fields (#11930 )	2022-01-28 14:41:49 -05:00
Derek Strickland	b3c8ab9be7	Update IsEmpty to check for pre-1.2.4 fields (#11930 )	2022-01-26 11:31:37 -05:00
Tim Gross	1dad0e597e	fix integer bounds checks (#11815 ) * driver: fix integer conversion error The shared executor incorrectly parsed the user's group into int32 and then cast to uint32 without bounds checking. This is harmless because an out-of-bounds gid will throw an error later, but it triggers security and code quality scans. Parse directly to uint32 so that we get correct error handling. * helper: fix integer conversion error The autopilot flags helper incorrectly parses a uint64 to a uint which is machine specific size. Although we don't have 32-bit builds, this sets off security and code quality scaans. Parse to the machine sized uint. * driver: restrict bounds of port map The plugin server doesn't constrain the maximum integer for port maps. This could result in a user-visible misconfiguration, but it also triggers security and code quality scans. Restrict the bounds before casting to int32 and return an error. * cpuset: restrict upper bounds of cpuset values Our cpuset configuration expects values in the range of uint16 to match the expectations set by the kernel, but we don't constrain the values before downcasting. An underflow could lead to allocations failing on the client rather than being caught earlier. This also make security and code quality scanners happy. * http: fix integer downcast for per_page parameter The parser for the `per_page` query parameter downcasts to int32 without bounds checking. This could result in underflow and nonsensical paging, but there's no server-side consequences for this. Fixing this will silence some security and code quality scanners though.	2022-01-25 11:16:48 -05:00
Seth Hoenig	2a5f7c0386	deps: swap gzip handler for gorilla This has been pinned since the Go modules migration, because the nytimes gzip handler was modified in version v1.1.0 in a way that is no longer compatible. Pretty sure it is this commit: `c551b6c3b4` Instead use handler.CompressHandler from gorilla, which is a web toolkit we already make use of for other things.	2022-01-19 11:52:19 -06:00
Nomad Release bot	de3070d49a	Generate files for 1.2.4 release	2022-01-18 23:43:00 +00:00
Tim Gross	9c4864badd	freebsd: build fix for ARM7 32-bit (#11854 ) The size of `stat_t` fields is architecture dependent, which was reportedly causing a build failure on FreeBSD ARM7 32-bit systems. This changeset matches the behavior we have on Linux.	2022-01-14 12:25:32 -05:00
James Rasell	82b168bf34	Merge pull request #11403 from hashicorp/f-gh-11059 agent/docs: add better clarification when top-level data dir needs setting	2022-01-13 16:41:35 +01:00
Luiz Aoqui	d48e50da9a	Fix log level parsing from lines that include a timestamp (#11838 )	2022-01-13 09:56:35 -05:00
Michael Schurter	e6eff95769	agent: validate reserved_ports are valid Goal is to fix at least one of the causes that can cause a node to be ineligible to receive work: https://github.com/hashicorp/nomad/issues/9506#issuecomment-1002880600	2022-01-12 14:21:47 -08:00
Seth Hoenig	8c97ffd68e	cleanup: stop referencing depreceted HeaderMap field Remove reference to the deprecated ResponseRecorder.HeaderMap field, instead calling .Response.Header() to get the same data. closes #10520	2022-01-12 10:32:54 -06:00
Derek Strickland	0a8e03f0f7	Expose Consul template configuration parameters (#11606 ) This PR exposes the following existing`consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza. - `wait` It also exposes the following`consul-template` configuration to Nomad operators in the `{client.template}` stanza. - `max_stale` - `block_query_wait` - `consul_retry` - `vault_retry` - `wait` Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure. - `wait_bounds` Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-01-10 10:19:07 -05:00
Charlie Voiselle	98a240cd99	Make number of scheduler workers reloadable (#11593 ) ## Development Environment Changes * Added stringer to build deps ## New HTTP APIs * Added scheduler worker config API * Added scheduler worker info API ## New Internals * (Scheduler)Worker API refactor—Start(), Stop(), Pause(), Resume() * Update shutdown to use context * Add mutex for contended server data - `workerLock` for the `workers` slice - `workerConfigLock` for the `Server.Config.NumSchedulers` and `Server.Config.EnabledSchedulers` values ## Other * Adding docs for scheduler worker api * Add changelog message Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>	2022-01-06 11:56:13 -05:00
Tim Gross	2806dc2bd7	docs/tests for multiple HTTP address config (#11760 )	2022-01-03 10:17:13 -05:00
Kevin Schoonover	5d9a506bc0	agent: support multiple http address in addresses.http (#11582 )	2022-01-03 09:33:53 -05:00
James Rasell	45f4689f9c	chore: fixup inconsistent method receiver names. (#11704 )	2021-12-20 11:44:21 +01:00
Tim Gross	f8a133a810	cli: ensure `-stale` flag is respected by `nomad operator debug` (#11678 ) When a cluster doesn't have a leader, the `nomad operator debug` command can safely use stale queries to gracefully degrade the consistency of almost all its queries. The query parameter for these API calls was not being set by the command. Some `api` package queries do not include `QueryOptions` because they target a specific agent, but they can potentially be forwarded to other agents. If there is no leader, these forwarded queries will fail. Provide methods to call these APIs with `QueryOptions`.	2021-12-15 10:44:03 -05:00
Tim Gross	a0cf5db797	provide `-no-shutdown-delay` flag for job/alloc stop (#11596 ) Some operators use very long group/task `shutdown_delay` settings to safely drain network connections to their workloads after service deregistration. But during incident response, they may want to cause that drain to be skipped so they can quickly shed load. Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and `nomad job stop` commands that bypasses the delay. This sets a new desired transition state on the affected allocations that the allocation/task runner will identify during pre-kill on the client. Note (as documented here) that using this flag will almost always result in failed inbound network connections for workloads as the tasks will exit before clients receive updated service discovery information and won't be gracefully drained.	2021-12-13 14:54:53 -05:00
Tim Gross	624ecab901	evaluations list pagination and filtering (#11648 ) API queries can request pagination using the `NextToken` and `PerPage` fields of `QueryOptions`, when supported by the underlying API. Add a `NextToken` field to the `structs.QueryMeta` so that we have a common field across RPCs to tell the caller where to resume paging from on their next API call. Include this field on the `api.QueryMeta` as well so that it's available for future versions of List HTTP APIs that wrap the response with `QueryMeta` rather than returning a simple list of structs. In the meantime callers can get the `X-Nomad-NextToken`. Add pagination to the `Eval.List` RPC by checking for pagination token and page size in `QueryOptions`. This will allow resuming from the last ID seen so long as the query parameters and the state store itself are unchanged between requests. Add filtering by job ID or evaluation status over the results we get out of the state store. Parse the query parameters of the `Eval.List` API into the arguments expected for filtering in the RPC call.	2021-12-10 13:43:03 -05:00
Tim Gross	03e697a69d	scheduler: config option to reject job registration (#11610 ) During incident response, operators may find that automated processes elsewhere in the organization can be generating new workloads on Nomad clusters that are unable to handle the workload. This changeset adds a field to the `SchedulerConfiguration` API that causes all job registration calls to be rejected unless the request has a management ACL token.	2021-12-06 15:20:34 -05:00
Tim Gross	7770eda3f1	config: fix test-only failures in UI handler setup (#11571 ) The `TestHTTPServer_Limits_Error` test never starts the agent so it had an incomplete configuration, which caused panics in the test. Fix the configuration. The PR #11555 had a branch name like `f-ui-*` which caused CI to skip the unit tests over the HTTP handler setup, so this wasn't caught in PR review.	2021-11-24 16:19:04 -05:00
Tim Gross	fcb96de9a7	config: UI configuration block with Vault/Consul links (#11555 ) Add `ui` block to agent configuration to enable/disable the web UI and provide the web UI with links to Vault/Consul.	2021-11-24 11:20:02 -05:00
James Rasell	751c8217d1	core: allow setting and propagation of eval priority on job de/registration (#11532 ) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and override the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler removing the need to pass invalid requests to the server. The register and deregister opts functions now all for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function inline with the RegisterOpts.	2021-11-23 09:23:31 +01:00
Tim Gross	e729133134	api: return 404 for alloc FS list/stat endpoints (#11482 ) * api: return 404 for alloc FS list/stat endpoints If the alloc filesystem doesn't have a file requested by the List Files or Stat File API, we currently return a HTTP 500 error with the expected "file not found" error message. Return a HTTP 404 error instead. * update FS Handler Previously the FS handler would interpret a 500 status as a 404 in the adapter layer by checking if the response body contained the text or is the response status was 500 and then throw an error code for 404. Co-authored-by: Jai Bhagat <jaybhagat841@gmail.com>	2021-11-17 11:15:07 -05:00
Luiz Aoqui	610a8a05e6	Merge release 1.2.0 rc1 branch (#11486 )	2021-11-09 17:55:13 -05:00
Michael Schurter	ef3fc79225	Merge pull request #11334 from hashicorp/f-chroot-skip-allocdir client: never embed alloc_dir in chroot	2021-11-03 16:48:09 -07:00
Luiz Aoqui	5d204c8ced	Revert "Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799 )" (#11433 )	2021-11-02 17:42:52 -04:00
Charlie Voiselle	29e7d46dd9	Making RPC Upgrade mode reloadable. (#11144 ) - Making RPC Upgrade mode reloadable. - Add suggestions from code review - remove spurious comment - switch to require(t,...) form for test. - Add to changelog	2021-11-01 16:30:53 -04:00
James Rasell	4c92a77aac	agent: clarify error info when data dir needs setting.	2021-10-28 15:05:56 +02:00
Mahmood Ali	cdddd64a42	logging: Log the cause behind agent startup failure (#11353 ) Log the failure error when the agent fails to start. Previously, the agent startup failure error would be emitted to the command UI but not logged. So it doesn't get emitted to syslog or `log_file` if they are set, and it makes debugging much harder. Also, logging the error again before exit makes the error more visible: previously, the operator needed to scroll to the top to find the error. On a sample failure, the output will look like: ``` ==> WARNING: Bootstrap mode enabled! Potentially unsafe operation. ==> Loaded configuration from sample-configs/config-bad ==> Starting Nomad agent... ==> Error starting agent: setting up server node ID failed: mkdir /path-without-permission: read-only file system 2021-10-20T14:38:51.179-0400 [WARN] agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/path-without-permission/plugins 2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins 2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=java type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=docker type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=mock_driver type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=exec type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=qemu type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [ERROR] agent: error starting agent: error="setting up server node ID failed: mkdir /path-without-permission: read-only file system" ``` This change adds the final `ERROR` message. It's easy to miss the `==> Error starting agent` above.	2021-10-27 10:41:17 -07:00
Luiz Aoqui	b463715a98	prevent active log from being overwritten when agent starts (#11386 )	2021-10-26 20:57:07 -04:00
Michael Schurter	10c3bad652	client: never embed alloc_dir in chroot Fixes #2522 Skip embedding client.alloc_dir when building chroot. If a user configures a Nomad client agent so that the chroot_env will embed the client.alloc_dir, Nomad will happily infinitely recurse while building the chroot until something horrible happens. The best case scenario is the filesystem's path length limit is hit. The worst case scenario is disk space is exhausted. A bad agent configuration will look something like this: ```hcl data_dir = "/tmp/nomad-badagent" client { enabled = true chroot_env { # Note that the source matches the data_dir "/tmp/nomad-badagent" = "/ohno" # ... } } ``` Note that `/ohno/client` (the state_dir) will still be created but not `/ohno/alloc` (the alloc_dir). While I cannot think of a good reason why someone would want to embed Nomad's client (and possibly server) directories in chroots, there should be no cause for harm. chroots are only built when Nomad runs as root, and Nomad disables running exec jobs as root by default. Therefore even if client state is copied into chroots, it will be inaccessible to tasks. Skipping the `data_dir` and `{client,server}.state_dir` is possible, but this PR attempts to implement the minimum viable solution to reduce risk of unintended side effects or bugs. When running tests as root in a vm without the fix, the following error occurs: ``` === RUN TestAllocDir_SkipAllocDir alloc_dir_test.go:520: Error Trace: alloc_dir_test.go:520 Error: Received unexpected error: Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long Test: TestAllocDir_SkipAllocDir --- FAIL: TestAllocDir_SkipAllocDir (22.76s) ``` Also removed unused Copy methods on AllocDir and TaskDir structs. Thanks to @eveld for not letting me forget about this!	2021-10-18 09:22:01 -07:00
Luiz Aoqui	130970e12e	Merge missing commits from 1.2.0-beta1 release branch (#11319 )	2021-10-14 16:10:05 -04:00
Charlie Voiselle	cb8e52b5df	Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799 )	2021-10-13 21:23:13 -04:00
Michael Schurter	59fda1894e	Merge pull request #11167 from a-zagaevskiy/master Support configurable dynamic port range	2021-10-13 16:47:38 -07:00
Michael Schurter	e14cd34392	client: improve errors & tests for dynamic ports	2021-10-13 16:25:25 -07:00
Luiz Aoqui	3e0bad5a41	wrap `log` messages with `hclog` (#11291 )	2021-10-12 14:38:44 -04:00
Aleksandr Zagaevskiy	d92666e6a7	fixup! Support configurable dynamic port range	2021-10-11 14:13:59 +03:00
Matt Mukerjee	b56432e645	Add FailoverHeartbeatTTL to config (#11127 ) FailoverHeartbeatTTL is the amount of time to wait after a server leader failure before considering reallocating client tasks. This TTL should be fairly long as the new server leader needs to rebuild the entire heartbeat map for the cluster. In deployments with a small number of machines, the default TTL (5m) may be unnecessary long. Let's allow operators to configure this value in their config files.	2021-10-06 18:48:12 -04:00
Luiz Aoqui	0a62bdc3c5	fix panic when Connect mesh gateway doesn't have a proxy block (#11257 ) Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2021-10-04 15:52:07 -04:00
Mahmood Ali	4d90afb425	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
Michael Schurter	c6e72b6818	client: output reserved ports with min/max ports Also add a little more min/max port testing and add the consts back that had been removed: but unexported and as defaults.	2021-09-30 17:05:46 -07:00
Michael Schurter	4ad0c258b9	client: add NOMAD_LICENSE to default env deny list By default we should not expose the NOMAD_LICENSE environment variable to tasks. Also refactor where the DefaultEnvDenyList lives so we don't have to maintain 2 copies of it. Since client/config is the most obvious location, keep a reference there to its unfortunate home buried deep in command/agent/host. Since the agent uses this list as well for the /agent/host endpoint the list must be accessible from both command/agent and client.	2021-09-21 13:51:17 -07:00
James Rasell	0e926ef3fd	allow configuration of Docker hostnames in bridge mode (#11173 ) Add a new hostname string parameter to the network block which allows operators to specify the hostname of the network namespace. Changing this causes a destructive update to the allocation and it is omitted if empty from API responses. This parameter also supports interpolation. In order to have a hostname passed as a configuration param when creating an allocation network, the CreateNetwork func of the DriverNetworkManager interface needs to be updated. In order to minimize the disruption of future changes, rather than add another string func arg, the function now accepts a request struct along with the allocID param. The struct has the hostname as a field. The in-tree implementations of DriverNetworkManager.CreateNetwork have been modified to account for the function signature change. In updating for the change, the enhancement of adding hostnames to network namespaces has also been added to the Docker driver, whilst the default Linux manager does not current implement it.	2021-09-16 08:13:09 +02:00
Aleksandr Zagaevskiy	ebb87e65fe	Support configurable dynamic port range	2021-09-10 11:52:47 +03:00
James Rasell	b6813f1221	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
Luiz Aoqui	104d29e808	Don't timestamp active log file (#11070 ) * don't timestamp active log file * website: update log_file default value * changelog: add entry for #11070 * website: add upgrade instructions for log_file in v1.14 and v1.2.0	2021-08-23 11:27:34 -04:00
Mahmood Ali	c37339a8c8	Merge pull request #9160 from hashicorp/f-sysbatch core: implement system batch scheduler	2021-08-16 09:30:24 -04:00
Michael Schurter	a7aae6fa0c	Merge pull request #10848 from ggriffiths/listsnapshot_secrets CSI Listsnapshot secrets support	2021-08-10 15:59:33 -07:00
Seth Hoenig	3371214431	core: implement system batch scheduler This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to 'sysbatch'. Like the name implies, sysbatch can be thought of as a hybrid between system and batch jobs - it is for running short lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has been run on all compatible nodes until reaching a terminal state (success or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported. The update stanza is sill limited in functionality for the underlying system scheduler, and is not useful yet for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527	2021-08-03 10:30:47 -04:00
Mahmood Ali	0bc12fba7c	Only initialize task.VolumeMounts when not-nil (#10990 ) 1.1.3 had a bug where task.VolumeMounts will be an empty slice instead of nil. Eventually, it gets canonicalized and is set to `nil`, but it seems to confuse dry-run planning. The regression was introduced in https://github.com/hashicorp/nomad/pull/10855/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ecL1028-R1037 . Curiously, it's the only place where `len(apiTask.VolumeMounts)` check was dropped. I assume it was dropped accidentally. Fixes #10981	2021-08-02 13:08:10 -04:00
Nomad Release bot	b5dff8be42	Generate files for 1.1.3 release	2021-07-29 03:43:03 +00:00
Grant Griffiths	fecbbaee22	CSI ListSnapshots secrets implementation Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>	2021-07-28 11:30:29 -07:00
Mahmood Ali	996ea1fa46	Merge pull request #10875 from hashicorp/b-namespace-flag-override cli: `-namespace` should override job namespace	2021-07-14 17:28:36 -04:00
Seth Hoenig	7c3db812fd	consul/connect: remove sidecar proxy before removing parent service This PR will have Nomad de-register a sidecar proxy service before attempting to de-register the parent service. Otherwise, Consul will emit a warning and an error. Fixes #10845	2021-07-08 13:30:19 -05:00
Seth Hoenig	2607853a26	Merge pull request #10872 from hashicorp/b-cc-regex-checkids consul/connect: Avoid assumption of parent service when filtering connect proxies	2021-07-08 13:29:40 -05:00
Seth Hoenig	284cd214ec	consul/connect: improve regex from CR suggestions	2021-07-08 13:05:05 -05:00
Tim Gross	a3bc87a2eb	cli: `-namespace` should override job namespace When a jobspec doesn't include a namespace, we provide it with the default namespace, but this ends up overriding the explicit `-namespace` flag. This changeset uses the same logic as region parsing to create an order of precedence: the query string parameter (the `-namespace` flag) overrides the API request body which overrides the jobspec.	2021-07-08 13:17:27 -04:00
Seth Hoenig	868b246128	consul/connect: Avoid assumption of parent service when filtering connect proxies This PR uses regex-based matching for sidecar proxy services and checks when syncing with Consul. Previously we would check if the parent of the sidecar was still being tracked in Nomad. This is a false invariant - one which we must not depend when we make #10845 work. Fixes #10843	2021-07-08 09:43:41 -05:00
Mahmood Ali	1f34f2197b	Merge pull request #10806 from hashicorp/munda/idempotent-job-dispatch Enforce idempotency of dispatched jobs using token on dispatch request	2021-07-08 10:23:31 -04:00
Tim Gross	8f25a9d7cd	cni: respect default `cni_config_dir` and `cni_path` (#10870 ) The default agent configuration values were not set, which meant they were not being set in the client configuration and this results in fingerprints failing unless the values were set explicitly.	2021-07-08 09:56:57 -04:00
Alex Munda	02c1a4d912	Set/parse idempotency_token query param	2021-07-07 16:26:55 -05:00
Seth Hoenig	56a6a1b1df	consul: avoid extra sync operations when no action required This PR makes it so the Consul sync logic will ignore operations that do not specify an action to take (i.e. [de-]register [services\|checks]). Ideally such noops would be discarded at the callsites (i.e. users of [Create\|Update\|Remove]Workload], but we can also be defensive at the commit point. Also adds 2 trace logging statements which are helpful for diagnosing sync operations with Consul - when they happen and why. Fixes #10797	2021-07-07 11:24:56 -05:00
Tim Gross	69a7c9db7e	csi: account for nil volume_mount in API-to-structs conversion (#10855 ) Fix a nil pointer in the API struct to `nomad/structs` conversion when a `volume_mount` block is empty.	2021-07-07 08:06:39 -04:00
Seth Hoenig	0d9208f1a0	consul: set task name only for group service checks This PR fixes a bug introduced in a refactoring https://github.com/hashicorp/nomad/pull/10764/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ec where task level service checks would inherent the task name field, when they shouldn't. Fixes #10781	2021-06-18 12:16:27 -05:00
Seth Hoenig	532b898b07	consul/connect: in-place update service definition when connect upstreams are modified This PR fixes a bug where modifying the upstreams of a Connect sidecar proxy would not result Consul applying the changes, unless an additional change to the job would trigger a task replacement (thus replacing the service definition). The fix is to check if upstreams have been modified between Nomad's view of the sidecar service definition, and the service definition for the sidecar that is actually registered in Consul. Fixes #8754	2021-06-16 16:48:26 -05:00
Seth Hoenig	d75669da4a	consul: make failures_before_critical and success_before_passing work with group services This PR fixes some job submission plumbing to make sure the Consul Check parameters - failure_before_critical - success_before_passing work with group-level services. They already work with task-level services.	2021-06-15 11:20:40 -05:00
James Rasell	939b23936a	Merge pull request #10744 from hashicorp/b-remove-duplicate-imports chore: remove duplicate import statements	2021-06-11 16:42:34 +02:00
James Rasell	492e308846	tests: remove duplicate import statements.	2021-06-11 09:39:22 +02:00
Nomad Release bot	7cc7389afd	Generate files for 1.1.1 release	2021-06-10 08:04:25 -04:00
Mahmood Ali	aa77c2731b	tests: use standard library testing.TB Glint pulled in an updated version of mitchellh/go-testing-interface which broke some existing tests because the update added a Parallel() method to testing.T. This switches to the standard library testing.TB which doesn't have a Parallel() method.	2021-06-09 16:18:45 -07:00
Isabel Suchanek	dfaef2468c	cli: add monitor flag to deployment status Adding '-verbose' will print out the allocation information for the deployment. This also changes the job run command so that it now blocks until deployment is complete and adds timestamps to the output so that it's more in line with the output of node drain. This uses glint to print in place in running in a tty. Because glint doesn't yet support cmd/powershell, Windows workflows use a different library to print in place, which results in slightly different formatting: 1) different margins, and 2) no spinner indicating deployment in progress.	2021-06-09 16:18:45 -07:00
Seth Hoenig	dbdc479970	consul: move consul acl tests into ent files (cherry-pick ent back to oss) This PR moves a lot of Consul ACL token validation tests into ent files, so that we can verify correct behavior difference between OSS and ENT Nomad versions.	2021-06-09 08:38:42 -05:00
Seth Hoenig	d656777dd7	Merge pull request #10720 from hashicorp/f-cns-acl-check consul: correctly check consul acl token namespace when using consul oss	2021-06-08 15:43:42 -05:00
Seth Hoenig	87be8c4c4b	consul: correctly check consul acl token namespace when using consul oss This PR fixes the Nomad Object Namespace <-> Consul ACL Token relationship check when using Consul OSS (or Consul ENT without namespace support). Nomad v1.1.0 introduced a regression where Nomad would fail the validation when submitting Connect jobs and allow_unauthenticated set to true, with Consul OSS - because it would do the namespace check against the Consul ACL token assuming the "default" namespace, which does not work because Consul OSS does not have namespaces. Instead of making the bad assumption, expand the namespace check to handle each special case explicitly. Fixes #10718	2021-06-08 13:55:57 -05:00
Seth Hoenig	209e2d6d81	consul: pr cleanup namespace probe function signatures	2021-06-07 15:41:01 -05:00
Seth Hoenig	519429a2de	consul: probe consul namespace feature before using namespace api This PR changes Nomad's wrapper around the Consul NamespaceAPI so that it will detect if the Consul Namespaces feature is enabled before making a request to the Namespaces API. Namespaces are not enabled in Consul OSS, and require a suitable license to be used with Consul ENT. Previously Nomad would check for a 404 status code when makeing a request to the Namespaces API to "detect" if Consul OSS was being used. This does not work for Consul ENT with Namespaces disabled, which returns a 500. Now we avoid requesting the namespace API altogether if Consul is detected to be the OSS sku, or if the Namespaces feature is not licensed. Since Consul can be upgraded from OSS to ENT, or a new license applied, we cache the value for 1 minute, refreshing on demand if expired. Fixes https://github.com/hashicorp/nomad-enterprise/issues/575 Note that the ticket originally describes using attributes from https://github.com/hashicorp/nomad/issues/10688. This turns out not to be possible due to a chicken-egg situation between bootstrapping the agent and setting up the consul client. Also fun: the Consul fingerprinter creates its own Consul client, because there is no [currently] no way to pass the agent's client through the fingerprint factory.	2021-06-07 12:19:25 -05:00
Jasmine Dahilig	ca4be6857e	deployment query rate limit (#10706 )	2021-06-04 12:38:46 -07:00
Seth Hoenig	d026ff1f66	consul/connect: add support for connect mesh gateways This PR implements first-class support for Nomad running Consul Connect Mesh Gateways. Mesh gateways enable services in the Connect mesh to make cross-DC connections via gateways, where each datacenter may not have full node interconnectivity. Consul docs with more information: https://www.consul.io/docs/connect/gateways/mesh-gateway The following group level service block can be used to establish a Connect mesh gateway. service { connect { gateway { mesh { // no configuration } } } } Services can make use of a mesh gateway by configuring so in their upstream blocks, e.g. service { connect { sidecar_service { proxy { upstreams { destination_name = "<service>" local_bind_port = <port> datacenter = "<datacenter>" mesh_gateway { mode = "<mode>" } } } } } } Typical use of a mesh gateway is to create a bridge between datacenters. A mesh gateway should then be configured with a service port that is mapped from a host_network configured on a WAN interface in Nomad agent config, e.g. client { host_network "public" { interface = "eth1" } } Create a port mapping in the group.network block for use by the mesh gateway service from the public host_network, e.g. network { mode = "bridge" port "mesh_wan" { host_network = "public" } } Use this port label for the service.port of the mesh gateway, e.g. service { name = "mesh-gateway" port = "mesh_wan" connect { gateway { mesh {} } } } Currently Envoy is the only supported gateway implementation in Consul. By default Nomad client will run the latest official Envoy docker image supported by the local Consul agent. The Envoy task can be customized by setting `meta.connect.gateway_image` in agent config or by setting the `connect.sidecar_task` block. Gateways require Consul 1.8.0+, enforced by the Nomad scheduler. Closes #9446	2021-06-04 08:24:49 -05:00
Mahmood Ali	0f5539c382	exec: http: close websocket connection gracefully In this loop, we ought to close the websocket connection gracefully when the StreamErrWrapper reaches EOF. Previously, it's possible that that we drop the last few events or skip sending the websocket closure. If `handler(handlerPipe)` returns and `cancel` is called, before the loop here completes processing streaming events, the loop exits prematurely without propagating the last few events. Instead here, the loop continues until we hit `httpPipe` EOF (through `decoder.Decode`), to ensure we process the events to completion.	2021-05-24 13:37:23 -04:00
Tim Gross	084a46e0e5	agent: surface websocket errors in logs The websocket interface used for `alloc exec` has to silently drop client send errors because otherwise those errors would interleave with the streamed output. But we may be able to surface errors that cause terminated websockets a little better in the HTTP server logs.	2021-05-24 09:46:45 -04:00
Nomad Release bot	5be44af07d	Generate files for 1.1.0-rc1 release	2021-05-12 22:43:48 +00:00
Chris Baker	263ddd567c	Node Drain Metadata (#10250 )	2021-05-07 13:58:40 -04:00
Mahmood Ali	102763c979	Support disabling TCP checks for connect sidecar services	2021-05-07 12:10:26 -04:00
Mahmood Ali	4b95f6ef42	api: actually set MemoryOversubscriptionEnabled (#10493 )	2021-05-02 22:53:53 -04:00
Seth Hoenig	d54a606819	Merge pull request #10439 from hashicorp/pick-ent-acls-changes e2e: add e2e tests for consul namespaces on ent with acls	2021-04-28 08:30:08 -06:00
Tim Gross	79f81d617e	licensing: remove raft storage and sync This changeset is the OSS portion of the work to remove the raft storage and sync for Nomad Enterprise.	2021-04-28 10:28:23 -04:00
Seth Hoenig	09cd01a5f3	e2e: add e2e tests for consul namespaces on ent with acls This PR adds e2e tests for Consul Namespaces for Nomad Enterprise with Consul ACLs enabled. Needed to add support for Consul ACL tokens with `namespace` and `namespace_prefix` blocks, which Nomad parses and validates before tossing the token. These bits will need to be picked back to OSS.	2021-04-27 14:45:54 -06:00
Mahmood Ali	ed4aad458c	api: Ignore User provided ParentID (#10424 ) ParentID is an internal field that Nomad sets for dispatched or parameterized jobs. Job submitters should not be able to set it directly, as that messes up children tracking. Fixes #10422 . It specifically stops the scheduler from honoring the ParentID. The reason failure and why the scheduler didn't schedule that job once it was created is very interesting and requires follow up with a more technical issue.	2021-04-23 16:22:17 -04:00
Charlie Voiselle	ef8ca60693	Enable go-sockaddr templating for `network-interface` (#10404 ) Add templating to `network-interface` option. This PR also adds a fast-fail to in the case where an invalid interface is set or produced by the template * add tests and check for valid interface * Add documentation * Incorporate suggestions from code review Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2021-04-20 13:55:10 -04:00
Seth Hoenig	4e6dbaaec1	Merge pull request #10184 from hashicorp/f-fuzzy-search api: implement fuzzy search API	2021-04-20 09:06:40 -06:00
Seth Hoenig	509490e5d2	e2e: consul namespace tests from nomad ent (cherry-picked from ent without _ent things) This is part 2/4 of e2e tests for Consul Namespaces. Took a first pass at what the parameterized tests can look like, but only on the ENT side for this PR. Will continue to refactor in the next PRs. Also fixes 2 bugs: - Config Entries registered by Nomad Server on job registration were not getting Namespace set - Group level script checks were not getting Namespace set Those changes will need to be copied back to Nomad OSS. Nomad OSS + no ACLs (previously, needs refactor) Nomad ENT + no ACLs (this) Nomad OSS + ACLs (todo) Nomad ENT + ALCs (todo)	2021-04-19 15:35:31 -06:00
Seth Hoenig	1ee8d5ffc5	api: implement fuzzy search API This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy searching objects in Nomad. The fuzzy search endpoint routes requests to the Nomad Server leader, which implements the Search.FuzzySearch RPC method. Requests to the fuzzy search API are based on the api.FuzzySearchRequest object, e.g. { "Text": "ed", "Context": "all" } Responses from the fuzzy search API are based on the api.FuzzySearchResponse object, e.g. { "Index": 27, "KnownLeader": true, "LastContact": 0, "Matches": { "tasks": [ { "ID": "redis", "Scope": [ "default", "example", "cache" ] } ], "evals": [], "deployment": [], "volumes": [], "scaling_policy": [], "images": [ { "ID": "redis:3.2", "Scope": [ "default", "example", "cache", "redis" ] } ] }, "Truncations": { "volumes": false, "scaling_policy": false, "evals": false, "deployment": false } } The API is tunable using the new server.search stanza, e.g. server { search { fuzzy_enabled = true limit_query = 200 limit_results = 1000 min_term_length = 5 } } These values can be increased or decreased, so as to provide more search results or to reduce load on the Nomad Server. The fuzzy search API can be disabled entirely by setting `fuzzy_enabled` to `false`.	2021-04-16 16:36:07 -06:00
Nick Ethier	339c671e29	agent: add test for reserved core config mapping	2021-04-13 13:28:15 -04:00
Nick Ethier	edc0da9040	client: only fingerprint reservable cores via cgroups, allowing manual override for other platforms	2021-04-13 13:28:15 -04:00
Nick Ethier	bed4e92b61	fingerprint: implement client fingerprinting of reservable cores on Linux systems this is derived from the configure cpuset cgroup parent (defaults to /nomad) for non Linux systems and Linux systems where cgroups are not enabled, the client defaults to using all cores	2021-04-13 13:28:15 -04:00
Mahmood Ali	6bd2600cd0	Merge pull request #10370 from alrs/command-agent-errs command/agent: fix dropped test errors	2021-04-13 11:40:12 -04:00
Nick Spain	653d84ef68	Add a 'body' field to the check stanza Consul allows specifying the HTTP body to send in a health check. Nomad uses Consul for health checking so this just plumbs the value through to where the Consul API is called. There is no validation that `body` is not used with an incompatible check method like GET.	2021-04-13 09:15:35 -04:00
Lars Lehtonen	d2e7f31906	command/agent: fix dropped test errors	2021-04-13 01:51:24 -07:00
Tim Gross	0892d34ff9	CSI: capability block is required for volume registration	2021-04-08 13:02:24 -04:00
Tim Gross	7d16e49a14	CSI: fix wrong output struct for snapshot list endpoint	2021-04-07 12:00:33 -04:00
Tim Gross	e4f34a96e3	CSI: deletes with API don't have request body Our API client `delete` method doesn't include a request body, but accepts an interface for the response. We were accidentally putting the request body into the response, which doesn't get picked up in unit tests because we're not reading the (always empty) response body anyways.	2021-04-07 12:00:33 -04:00
Tim Gross	34a7b9da5c	CSI: fix wrong RPC name on ListSnapshots	2021-04-07 12:00:33 -04:00
Tim Gross	8af5bd1ad4	CSI: fix decoding error on snapshot create Consumers of the CSI HTTP API are expecting a response object and not a slice of snapshots. Fix the return value.	2021-04-07 12:00:33 -04:00
Tim Gross	69363705a8	CSI: fix HTTP routing for external volume list The HTTP router did not correctly route `/v1/volumes/external` without being explicitly added to the top-level router. Break this out into its own request handler.	2021-04-07 12:00:22 -04:00
Tim Gross	276633673d	CSI: use AccessMode/AttachmentMode from CSIVolumeClaim Registration of Nomad volumes previously allowed for a single volume capability (access mode + attachment mode pair). The recent `volume create` command requires that we pass a list of requested capabilities, but the existing workflow for claiming volumes and attaching them on the client assumed that the volume's single capability was correct and unchanging. Add `AccessMode` and `AttachmentMode` to `CSIVolumeClaim`, use these fields to set the initial claim value, and add backwards compatibility logic to handle the existing volumes that already have claims without these fields.	2021-04-07 11:24:09 -04:00
Seth Hoenig	fe8fce00d9	consul: minor CR cleanup	2021-04-05 10:10:16 -06:00
Seth Hoenig	f17ba33f61	consul: plubming for specifying consul namespace in job/group This PR adds the common OSS changes for adding support for Consul Namespaces, which is going to be a Nomad Enterprise feature. There is no new functionality provided by this changeset and hopefully no new bugs.	2021-04-05 10:03:19 -06:00
Yoan Blanc	ac0d5d8bd3	chore: bump golangci-lint from v1.24 to v1.39 Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2021-04-03 09:50:23 +02:00
Chris Baker	b11a092d2d	added missing import from command/agent	2021-04-02 13:53:28 +00:00
Chris Baker	21bc48ca29	json handles were moved to a new package in #10202 this was unecessary after refactoring, so this moves them back to their original location in package structs	2021-04-02 13:31:10 +00:00
Chris Baker	436d46bd19	Merge branch 'main' into f-node-drain-api	2021-04-01 15:22:57 -05:00
Tim Gross	466b620fa4	CSI: volume snapshot	2021-04-01 11:16:52 -04:00
Tim Gross	71b9daffb9	CSI: fix misleading HTTP test The HTTP test to create CSI volumes depends on having a controller plugin to talk to, but the test was using a node-only plugin, which allows it to silently ignore the missing controller.	2021-03-31 16:37:09 -04:00
Tim Gross	aec5337862	CSI: HTTP handlers for create/delete/list	2021-03-31 16:37:09 -04:00
Tim Gross	43622680fa	test infrastructure for mock client RPCs (#10193 ) This commit includes a new test client that allows overriding the RPC protocols. Only the RPCs that are passed in are registered, which lets you implement a mock RPC in the server tests. This commit includes an example of this for the ClientCSI RPC server.	2021-03-31 16:37:09 -04:00
Mahmood Ali	18b581656d	oversubscription: adds CLI and API support This commit updates the API to pass the MemoryMaxMB field, and the CLI to show the max set for the task. Also, start parsing the MemoryMaxMB in hcl2, as it's set by tags. A sample CLI output; note the additional `Max: ` for "task": ``` $ nomad alloc status 96fbeb0b ID = 96fbeb0b-a0b3-aa95-62bf-b8a39492fd5c [...] Task "cgroup-fetcher" is "running" Task Resources CPU Memory Disk Addresses 0/500 MHz 32 MiB/20 MiB 300 MiB Task Events: [...] Task "task" is "running" Task Resources CPU Memory Disk Addresses 0/500 MHz 176 KiB/20 MiB 300 MiB Max: 30 MiB Task Events: [...] ```	2021-03-30 16:55:58 -04:00
Nick Ethier	daecfa61e6	Merge pull request #10203 from hashicorp/f-cpu-cores Reserved Cores [1/4]: Structs and scheduler implementation	2021-03-29 14:05:54 -04:00
Tim Gross	b0d2eed932	redirect from HTTP root to UI should include query params The OTT feature relies on having a query parameter for a one-time token which gets handled by the UI. We need to make sure that query param is preserved when redirecting from the root URL to the `/ui/` URI.	2021-03-26 14:54:41 -04:00
Chris Baker	770c9cecb5	restored Node.Sanitize() for RPC endpoints multiple other updates from code review	2021-03-26 17:03:15 +00:00
Chris Baker	ff0b9a4d3e	added benchmark test for JSON encoding extensions	2021-03-23 20:23:06 +00:00
Chris Baker	cb540ed691	added tests that the API doesn't leak Node.SecretID added more documentation on JSON encoding to the contributing guide	2021-03-23 18:09:20 +00:00
Drew Bailey	74836b95b2	configuration and oss components for licensing (#10216 ) * configuration and oss components for licensing * vendor sync	2021-03-23 09:08:14 -04:00
Chris Baker	a186badf35	moved JSON handlers and extension code around a bit for proper order of initialization	2021-03-22 14:12:42 +00:00
Chris Baker	dd291e69f4	removed deprecated fields from Drain structs and API node drain: use msgtype on txn so that events are emitted wip: encoding extension to add Node.Drain field back to API responses new approach for hiding Node.SecretID in the API, using `json` tag documented this approach in the contributing guide refactored the JSON handlers with extensions modified event stream encoding to use the go-msgpack encoders with the extensions	2021-03-21 15:30:11 +00:00
Nick Ethier	26b200e8bd	api: add new 'cores' field to task resources	2021-03-18 23:13:30 -04:00
Tim Gross	fa25e048b2	CSI: unique volume per allocation Add a `PerAlloc` field to volume requests that directs the scheduler to test feasibility for volumes with a source ID that includes the allocation index suffix (ex. `[0]`), rather than the exact source ID. Read the `PerAlloc` field when making the volume claim at the client to determine if the allocation index suffix (ex. `[0]`) should be added to the volume source ID.	2021-03-18 15:35:11 -04:00
Tim Gross	9b2b580d1a	CSI: remove prefix matching from CSIVolumeByID and fix CLI prefix matching (#10158 ) Callers of `CSIVolumeByID` are generally assuming they should receive a single volume. This potentially results in feasibility checking being performed against the wrong volume if a volume's ID is a prefix substring of other volume (for example: "test" and "testing"). Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix matching in the command line client. Add the required elements for prefix matching to the commands and API.	2021-03-18 14:32:40 -04:00
Charlie Voiselle	0473f35003	Fixup uses of `sanity` (#10187 ) * Fixup uses of `sanity` * Remove unnecessary comments. These checks are better explained by earlier comments about the context of the test. Per @tgross, moved the tests together to better reinforce the overall shared context. * Update nomad/fsm_test.go	2021-03-16 18:05:08 -04:00
James Rasell	e9d81ace7b	Merge pull request #10140 from hashicorp/b-gh-10070 agent: return req error if prometheus metrics are disabled.	2021-03-10 17:11:39 +01:00
Tim Gross	75878f978e	HTTP API support for 'nomad ui -login' Endpoints for requesting and exchanging one-time tokens via the HTTP API. Includes documentation updates.	2021-03-10 08:17:56 -05:00
James Rasell	782350bd19	agent: return req error if prometheus metrics are disabled. If the user has disabled Prometheus metrics and a request is sent to the metrics endpoint requesting Prometheus formatted metrics, then the request should fail.	2021-03-09 15:28:58 +01:00
Andre Ilhicas	f45fc6c899	consul/connect: enable setting local_bind_address in upstream	2021-02-26 11:37:31 +00:00
Drew Bailey	86d9e1ff90	Merge pull request #9955 from hashicorp/on-update-services Service and Check on_update configuration option (readiness checks)	2021-02-24 10:11:05 -05:00
Seth Hoenig	d557d6bf94	consul/connect: Fix bug where connect sidecar services would be unnecessarily re-registered This PR fixes a bug where sidecar services would be re-registered into Consul every ~30 seconds, caused by the parent service having its tags field set and the sidecar_service tags unset. Nomad would directly compare the tags between its copy of the sidecar service definition and the tags of the sidecar service reported by Consul. This does not work, because Consul will under-the-hood set the sidecar service tags to inherit the parent service tags if the sidecar service tags are unset. The comparison then done by Nomad would not match, if the parent sidecar tags are set. Fixes #10025	2021-02-22 12:02:58 -06:00
AndrewChubatiuk	157f8de903	pr comments	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	cb84d8ef11	fixed typo	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	4f49614760	fixed typo	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	c4bece2b09	fixed typo	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	cb527c5d83	fixed tests	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	abb7fd1e36	fixed connect port label	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	3e11860b6d	fixed connect port label	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	cd152643fb	fixed connect port label	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	af7a16d774	tests fix	2021-02-13 02:42:14 +02:00
AndrewChubatiuk	3d0aa2ef56	allocate sidecar task port on host_network interface	2021-02-13 02:42:13 +02:00
AndrewChubatiuk	47c690479c	fixed tests	2021-02-13 02:42:13 +02:00
AndrewChubatiuk	46c639c5d6	fixed tests	2021-02-13 02:42:13 +02:00
AndrewChubatiuk	77f9b84a30	fixed tests	2021-02-13 02:42:13 +02:00
AndrewChubatiuk	99201412da	removed proxy suffix	2021-02-13 02:42:13 +02:00
AndrewChubatiuk	844ac16900	fixed variable initialization	2021-02-13 02:42:13 +02:00
AndrewChubatiuk	78465bbd23	customized default sidecar checks	2021-02-13 02:42:13 +02:00
Buck Doyle	c22d1114d8	Add handling for license requests in OSS (#9963 ) This changes the license-fetching endpoint to respond with 204 in OSS instead of 501. It closes #9827.	2021-02-08 12:53:06 -06:00
Drew Bailey	82f971f289	OnUpdate configuration for services and checks Allow for readiness type checks by configuring nomad to ignore warnings or errors reported by a service check. This allows the deployment to progress and while Consul handles introducing the sercive into a resource pool once the check passes.	2021-02-08 08:32:40 -05:00
Chris Baker	ce68ee164b	Version 1.0.3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJgEuOKAAoJEFGFLYc0j/xMxF8H/3TTU6Tu+Xm0YvcsDaYDphZ/ X7KQBV0aFiuL5VkTw4PzKEsgryIy9/sqEPyxxyKRowAmos9qhiusjNAIfqdP4TF8 tdZmTedkfWir9uPD+hyv/LXpwbQ2T8kTwS3xHTYvaOmaCxZr710FEn+imnMk1AUn Xs5itkd/CYGr0nBLm+I5GutWSDPmL7Uw8J5Z30fFyoaxoCPAbCWQQNk793SCRUc5 f/uo18V2tFInmQ+3sAdnM4gPewyStK/a5VvzWavL9fVDtYK83wlqWSchTXY5jpVz zNEzt/rYhbBzakPQQKb5zieblh2iGI8aHWpD5w4WduqO2Sg6B/5lAeNZIlW0UJg= =2g3c -----END PGP SIGNATURE----- Merge tag 'v1.0.3' into post-release-1.0.3 Version 1.0.3	2021-01-29 19:30:08 +00:00
Seth Hoenig	8b05efcf88	consul/connect: Add support for Connect terminating gateways This PR implements Nomad built-in support for running Consul Connect terminating gateways. Such a gateway can be used by services running inside the service mesh to access "legacy" services running outside the service mesh while still making use of Consul's service identity based networking and ACL policies. https://www.consul.io/docs/connect/gateways/terminating-gateway These gateways are declared as part of a task group level service definition within the connect stanza. service { connect { gateway { proxy { // envoy proxy configuration } terminating { // terminating-gateway configuration entry } } } } Currently Envoy is the only supported gateway implementation in Consul. The gateay task can be customized by configuring the connect.sidecar_task block. When the gateway.terminating field is set, Nomad will write/update the Configuration Entry into Consul on job submission. Because CEs are global in scope and there may be more than one Nomad cluster communicating with Consul, there is an assumption that any terminating gateway defined in Nomad for a particular service will be the same among Nomad clusters. Gateways require Consul 1.8.0+, checked by a node constraint. Closes #9445	2021-01-25 10:36:04 -06:00
Drew Bailey	630babb886	prevent double job status update (#9768 ) * Prevent Job Statuses from being calculated twice https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval insertion iwth job (de-)registration. This change removes a now obsolete guard which checked if the index was equal to the job.CreateIndex, which would empty the status. Now that the job regisration eval insetion is atomic with the registration this check is no longer necessary to set the job statuses correctly. * test to ensure only single job event for job register * periodic e2e * separate job update summary step * fix updatejobstability to use copy instead of modified reference of job * update envoygatewaybindaddresses copy to prevent job diff on null vs empty * set ConsulGatewayBindAddress to empty map instead of nil fix nil assertions for empty map rm unnecessary guard	2021-01-22 09:18:17 -05:00
Dennis Schön	3eaf1432aa	validate connect block allowed only within group.service	2021-01-20 14:34:23 -05:00
Dennis Schön	c376545f49	don't prefix json logging	2021-01-20 09:09:31 -05:00
Nomad Release bot	f36d983863	Generate files for 1.0.2 release	2021-01-13 16:52:51 +00:00
Nick Ethier	6705f845f2	Merge pull request #9739 from hashicorp/b-alloc-netmode-ports Use port's to value when building service address under 'alloc' addr_mode	2021-01-07 09:16:27 -05:00
Nick Ethier	bb060bd46b	command/agent/consul: remove duplicated tests	2021-01-06 14:11:31 -05:00
Kris Hicks	868ba0cea5	consul: Refactor parts of UpdateWorkload (#9737 ) This removes modification of ops in methods that UpdateWorkload calls, keeping them local to UpdateWorkload. It also includes some rewrites of checkRegs for clarity.	2021-01-06 11:11:28 -08:00
Nick Ethier	ab01e19df3	command/agent/consul: use port's to value when building service address under 'alloc' addr_mode	2021-01-06 13:52:48 -05:00
Seth Hoenig	6c9366986b	consul/connect: avoid NPE from unset connect gateway proxy Submitting a job with an ingress gateway in host networking mode with an absent gateway.proxy block would cause the Nomad client to panic on NPE. The consul registration bits would assume the proxy stanza was not nil, but it could be if the user does not supply any manually configured envoy proxy settings. Check the proxy field is not nil before using it. Fixes #9669	2021-01-05 09:27:01 -06:00
Seth Hoenig	40d36fc0ec	agent: revert use of http connlimit https://github.com/hashicorp/nomad/pull/9608 introduced the use of the built-in HTTP 429 response handler provided by go-connlimit. There is concern though around plausible DOS attacks that need to be addressed, so this PR reverts that functionality. It keeps a fix in the tests around the use of an HTTPS enabled client for when the server is listening on HTTPS. Previously, the tests would fail deterministically with io.EOF because that's how the TLS server terminates invalid connections. Now, the result is much less deterministic. The state of the client connection and the server socket depends on when the connection is closed and how far along the handshake was.	2020-12-14 14:40:14 -06:00
Seth Hoenig	0091325721	command: give flag-helpers a better name	2020-12-14 10:07:27 -06:00
Seth Hoenig	d3f1c3adcf	Merge pull request #9608 from hashicorp/f-go-connlimit Use go-connlimit to ratelimit with 429 responses	2020-12-10 11:05:07 -06:00
Seth Hoenig	a28cd45988	client: fix plumbing of testing object into helper	2020-12-10 11:04:38 -06:00
Kris Hicks	0cf9cae656	Apply some suggested fixes from staticcheck (#9598 )	2020-12-10 07:29:18 -08:00
Seth Hoenig	2cc5787f97	client: fix https test cases in client rate limits	2020-12-10 09:20:28 -06:00
Ben Buzbee	a8e4aa76c6	Use new go-connlimit with HTTP 429 response This is essentially a port of Consul's similar fix Changes are: go get -u github.com/hashicorp/go-connlimit go mod vendor Use new HTTP429 handler `20d1ea7d2d`	2020-12-09 17:57:16 -06:00
Kris Hicks	0a3a748053	Add gosimple linter (#9590 )	2020-12-09 11:05:18 -08:00
Kris Hicks	93155ba3da	Add gocritic to golangci-lint config (#9556 )	2020-12-08 12:47:04 -08:00
Dennis Schön	038f2cce57	return 405 on non-GET requests to /v1/event/stream (fixes #9526 ) (#9564 )	2020-12-08 13:09:20 -05:00
Dave May	be0a14d70b	fix AgentHostRequest panic found in GH-9546 (#9554 ) * debug: refactor nodeclass test * debug: add case to track down SIGSEGV on client to server Agent.Host RPC * verify server to avoid panic on AgentHostRequest RPC call, fixes GH-9546 * simplify Agent.Host RPC lookup logic	2020-12-07 17:34:40 -05:00
Dennis Schön	a9c97d9257	use os.ErrDeadlineExceeded in tests	2020-12-07 10:40:28 -05:00
Drew Bailey	9adca240f8	Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447 ) * upsertaclpolicies * delete acl policies msgtype * upsert acl policies msgtype * delete acl tokens msgtype * acl bootstrap msgtype wip unsubscribe on token delete test that subscriptions are closed after an ACL token has been deleted Start writing policyupdated test * update test to use before/after policy * add SubscribeWithACLCheck to run acl checks on subscribe * update rpc endpoint to use broker acl check * Add and use subscriptions.closeSubscriptionFunc This fixes the issue of not being able to defer unlocking the mutex on the event broker in the for loop. handle acl policy updates * rpc endpoint test for terminating acl change * add comments Co-authored-by: Kris Hicks <khicks@hashicorp.com>	2020-12-01 11:11:34 -05:00
Drew Bailey	a0b7f05a7b	Remove Managed Sinks from Nomad (#9470 ) * Remove Managed Sinks from Nomad Managed Sinks were a beta feature in Nomad 1.0-beta2. During the beta period it was determined that this was not a scalable approach to support community and third party sinks. * update comment * changelog	2020-11-30 14:00:31 -05:00
Seth Hoenig	e81e9223ef	consul/connect: enable setting datacenter in connect upstream Before, upstreams could only be defined using the default datacenter. Now, the `datacenter` field can be set in a connect upstream definition, informing consul of the desire for an instance of the upstream service in the specified datacenter. The field is optional and continues to default to the local datacenter. Closes #8964	2020-11-30 10:38:30 -06:00
Tim Gross	4e79ddea45	csi/api: populate ReadAllocs/WriteAllocs fields (#9377 ) The API is missing values for `ReadAllocs` and `WriteAllocs` fields, resulting in allocation claims not being populated in the web UI. These fields mirror the fields in `nomad/structs.CSIVolume`. Returning a separate list of stubs for read and write would be ideal, but this can't be done without either bloating the API response with repeated full `Allocation` data, or causing a panic in previous versions of the CLI. The `nomad/structs` fields are persisted with nil values and are populated during RPC, so we'll do the same in the HTTP API and populate the `ReadAllocs` and `WriteAllocs` fields with a map of allocation IDs, but with null values. The web UI will then create its `ReadAllocations` and `WriteAllocations` fields by mapping from those IDs to the values in `Allocations`, instead of flattening the map into a list.	2020-11-25 16:44:06 -05:00
Seth Hoenig	3c17dc2ecc	api: safely access legacy MBits field	2020-11-23 10:36:10 -06:00
Nick Ethier	9471892df4	mock: add default host network	2020-11-23 10:11:00 -06:00
Chris Baker	7c6071e6c4	updated alloc_endpoint to mutate a copy of the returned allocation, instead of the instance in the state store	2020-11-15 17:52:50 +00:00
Chris Baker	b244d5e949	documenting test for #9367	2020-11-15 17:47:50 +00:00
Seth Hoenig	bb8a5816a0	jobspec: add support for headers in artifact stanza This PR adds the ability to set HTTP headers when downloading an artifact from an `http` or `https` resource. The implementation in `go-getter` is such that a new `HTTPGetter` must be created for each artifact that sets headers (as opposed to conveniently setting headers per-request). This PR maintains the memoization of the default Getter objects, creating new ones only for artifacts where headers are set. Closes #9306	2020-11-13 12:03:54 -06:00
Seth Hoenig	b19bc6be2b	consul: prevent re-registration churn by correctly comparing sidecar tags Previously, connect sidecars would be re-registered with consul every cycle of Nomad's reconciliation loop around Consul service registrations. This is because part of the comparison used `reflect.DeepEqual` on []string objects, which returns false when one object is `[]string{}` and the other is `[]string{}(nil)`. Unforunately, this was always the case, and every Connect sidecar service would be re-registered on every iteration, which happens every 30 seconds.	2020-11-11 18:01:17 -06:00
Nick Ethier	5e1634eda1	structs: canonicalize allocatedtaskresources to populate shared ports (#9309 )	2020-11-11 16:21:47 -05:00
Tim Gross	60874ebe25	csi: Postrun hook should not change mode (#9323 ) The unpublish workflow requires that we know the mode (RW vs RO) if we want to unpublish the node. Update the hook and the Unpublish RPC so that we mark the claim for release in a new state but leave the mode alone. This fixes a bug where RO claims were failing node unpublish. The core job GC doesn't know the mode, but we don't need it for that workflow, so add a mode specifically for GC; the volumewatcher uses this as a sentinel to check whether claims (with their specific RW vs RO modes) need to be claimed.	2020-11-11 13:06:30 -05:00
Mahmood Ali	69849a42a5	Merge pull request #9298 from hashicorp/f-hcl2-localsvars HCL2: Variables and Locals	2020-11-09 16:44:37 -05:00
Luiz Aoqui	ea81ac5d3d	Merge pull request #9296 from hashicorp/b-remove-namespace-from-scale-request Remove Namespace field from JobScaleRequest	2020-11-09 15:13:33 -05:00
Mahmood Ali	1ae3e8a8eb	Start using the new jobspec2 API	2020-11-09 15:01:31 -05:00
Luiz Aoqui	c536286c7a	remove Namespace field from JobScaleRequest	2020-11-09 13:02:05 -05:00
Nick Ethier	04f5c4ee5f	ar/groupservice: remove drivernetwork (#9233 ) * ar/groupservice: remove drivernetwork * consul: allow host address_mode to accept raw port numbers * consul: fix logic for blank address	2020-11-05 15:00:22 -05:00
Kris Hicks	1da9e7fc67	Add event sink API and CLI commands (#9226 ) Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>	2020-11-02 09:57:35 -08:00
Chris Baker	719077a26d	added new policy capabilities for recommendations API state store: call-out to generic update of job recommendations from job update method recommendations API work, and http endpoint errors for OSS support for scaling polices in task block of job spec add query filters for ScalingPolicy list endpoint command: nomad scaling policy list: added -job and -type	2020-10-28 14:32:16 +00:00
Drew Bailey	86080e25a9	Send events to EventSinks (#9171 ) * Process to send events to configured sinks This PR adds a SinkManager to a server which is responsible for managing managed sinks. Managed sinks subscribe to the event broker and send events to a sink writer (webhook). When changes to the eventstore are made the sinkmanager and managed sink are responsible for reloading or starting a new managed sink. * periodically check in sink progress to raft Save progress on the last successfully sent index to raft. This allows a managed sink to resume close to where it left off in the event of a lost server or leadership change dereference eventsink so we can accurately use the watchch When using a pointer to eventsink struct it was updated immediately and our reload logic would not trigger	2020-10-26 17:27:54 -04:00
Drew Bailey	1ae39a9ed9	event sink crud operation api (#9155 ) * network sink rpc/api plumbing state store methods and restore upsert sink test get sink delete sink event sink list and tests go generate new msg types validate sink on upsert * go generate	2020-10-23 14:23:00 -04:00
Michael Schurter	c2dd9bc996	core: open source namespaces	2020-10-22 15:26:32 -07:00
Mahmood Ali	059e87c862	Merge pull request #9142 from hashicorp/f-hclv2-2.3 Support HCLv2 for Nomad jobs	2020-10-22 12:26:28 -05:00
Drew Bailey	f3dcefe5a9	remove event durability (#9147 ) * remove event durability temporarily removing go-memdb event durability until a new strategy is developed on how to best handled increased durability needs * drop events table schema and state store methods * fix neweventbuffer invocations	2020-10-22 12:21:03 -04:00
Mahmood Ali	f52bda4c30	api: update /render api to parse hclv2	2020-10-21 15:46:57 -04:00
Drew Bailey	6c788fdccd	Events/msgtype cleanup (#9117 ) * use msgtype in upsert node adds message type to signature for upsert node, update tests, remove placeholder method * UpsertAllocs msg type test setup * use upsertallocs with msg type in signature update test usage of delete node delete placeholder msgtype method * add msgtype to upsert evals signature, update test call sites with test setup msg type handle snapshot upsert eval outside of FSM and ignore eval event remove placeholder upsertevalsmsgtype handle job plan rpc and prevent event creation for plan msgtype cleanup upsertnodeevents updatenodedrain msgtype msg type 0 is a node registration event, so set the default to the ignore type * fix named import * fix signature ordering on upsertnode to match	2020-10-19 09:30:15 -04:00
Drew Bailey	fba0d6dc6a	event buffer size and durable count must be non negative	2020-10-15 16:34:33 -04:00
Nick Ethier	4903e5b114	Consul with CNI and host_network addresses (#9095 ) * consul: advertise cni and multi host interface addresses * structs: add service/check address_mode validation * ar/groupservices: fetch networkstatus at hook runtime * ar/groupservice: nil check network status getter before calling * consul: comment network status can be nil	2020-10-15 15:32:21 -04:00
Michael Schurter	ea55c497b7	Merge pull request #9094 from hashicorp/f-1.0 s/0.13/1.0/g	2020-10-15 08:53:33 -07:00
James Rasell	42a6e7140f	Merge pull request #9083 from hashicorp/b-fix-enterprise-config-merge agent: fix enterprise config overlay merging.	2020-10-15 08:40:49 +02:00
Michael Schurter	9c3972937b	s/0.13/1.0/g 1.0 here we come!	2020-10-14 15:17:47 -07:00
Michael Schurter	dd09fa1a4a	Merge pull request #9055 from hashicorp/f-9017-resources api: add field filters to /v1/{allocations,nodes}	2020-10-14 14:49:39 -07:00
Michael Schurter	6890cffd7a	unify boolean parameter parsing	2020-10-14 12:23:25 -07:00
Drew Bailey	c463479848	filter on additional filter keys, remove switch statement duplication properly wire up durable event count move newline responsibility moves newline creation from NDJson to the http handler, json stream only encodes and sends now ignore snapshot restore if broker is disabled enable dev mode to access event steam without acl use mapping instead of switch use pointers for config sizes, remove unused ttl, simplify closed conn logic	2020-10-14 14:14:33 -04:00
Michael Schurter	8ccbd92cb6	api: add field filters to /v1/{allocations,nodes} Fixes #9017 The ?resources=true query parameter includes resources in the object stub listings. Specifically: - For `/v1/nodes?resources=true` both the `NodeResources` and `ReservedResources` field are included. - For `/v1/allocations?resources=true` the `AllocatedResources` field is included. The ?task_states=false query parameter removes TaskStates from /v1/allocations responses. (By default TaskStates are included.)	2020-10-14 10:35:22 -07:00
Drew Bailey	684807bddb	namespace filtering	2020-10-14 12:44:43 -04:00
Drew Bailey	df96b89958	Add EvictCallbackFn to handle removing entries from go-memdb when they are removed from the event buffer. Wire up event buffer size config, use pointers for structs.Events instead of copying.	2020-10-14 12:44:42 -04:00
Drew Bailey	315f77a301	rehydrate event publisher on snapshot restore address pr feedback	2020-10-14 12:44:41 -04:00
Drew Bailey	d793529d61	event durability count and cfg	2020-10-14 12:44:40 -04:00
Drew Bailey	b4c135358d	use Events to wrap index and events, store in events table	2020-10-14 12:44:39 -04:00
Drew Bailey	559517455a	wire up enable_event_publisher	2020-10-14 12:44:38 -04:00
Drew Bailey	4793bb4e01	Events/deployment events (#9004 ) * Node Drain events and Node Events (#8980) Deployment status updates handle deployment status updates (paused, failed, resume) deployment alloc health generate events from apply plan result txn err check, slim down deployment event one ndjson line per index * consolidate down to node event + type * fix UpdateDeploymentAllocHealth test invocations * fix test	2020-10-14 12:44:37 -04:00
Drew Bailey	a4a2975edf	Event Stream API/RPC (#8947 ) This Commit adds an /v1/events/stream endpoint to stream events from. The stream framer has been updated to include a SendFull method which does not fragment the data between multiple frames. This essentially treats the stream framer as a envelope to adhere to the stream framer interface in the UI. If the `encode` query parameter is omitted events will be streamed as newline delimted JSON.	2020-10-14 12:44:36 -04:00
James Rasell	e0734bed77	agent: fix enterprise config overlay merging.	2020-10-14 09:35:16 +02:00
Chris Baker	1d35578bed	removed backwards-compatible/untagged metrics deprecated in 0.7	2020-10-13 20:18:39 +00:00
Seth Hoenig	ed13e5723f	consul/connect: dynamically select envoy sidecar at runtime As newer versions of Consul are released, the minimum version of Envoy it supports as a sidecar proxy also gets bumped. Starting with the upcoming Consul v1.9.X series, Envoy v1.11.X will no longer be supported. Current versions of Nomad hardcode a version of Envoy v1.11.2 to be used as the default implementation of Connect sidecar proxy. This PR introduces a change such that each Nomad Client will query its local Consul for a list of Envoy proxies that it supports (https://github.com/hashicorp/consul/pull/8545) and then launch the Connect sidecar proxy task using the latest supported version of Envoy. If the `SupportedProxies` API component is not available from Consul, Nomad will fallback to the old version of Envoy supported by old versions of Consul. Setting the meta configuration option `meta.connect.sidecar_image` or setting the `connect.sidecar_task` stanza will take precedence as is the current behavior for sidecar proxies. Setting the meta configuration option `meta.connect.gateway_image` will take precedence as is the current behavior for connect gateways. `meta.connect.sidecar_image` and `meta.connect.gateway_image` may make use of the special `${NOMAD_envoy_version}` variable interpolation, which resolves to the newest version of Envoy supported by the Consul agent. Addresses #8585 #7665	2020-10-13 09:14:12 -05:00
Yoan Blanc	891accb89a	use allow/deny instead of the colored alternatives (#9019 ) Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-10-12 08:47:05 -04:00
Seth Hoenig	6cffbecb3a	Merge pull request #9033 from pierreca/verify-remove-checks Do not double-remove checks removed by Consul	2020-10-06 10:16:13 -05:00
James Rasell	ffe6533ad1	Merge pull request #9027 from hashicorp/f-gh-9026 cli: move tests to use NewMockUi func.	2020-10-06 08:28:18 +02:00
Pierre Cauchois	1efe05f516	Do not double-remove checks removed by Consul When deregistering a service, consul also deregisters the associated checks. The current state keeps track of all services and all checks separately and deregisters them in sequence, which leads, whether during syncs or shutdowns, to check deregistrations happening twice and failing the second time (generating errors in logs) This fix includes: - a fix to the sync logic that just pulls the checks after the services have been synced - a fix to the shutdown mechanism that gets an updated list of checks after deregistering the services, so that we get a cleaner check deregistration process.	2020-10-06 00:30:29 +00:00
Chris Baker	7f701fddd0	updated docs and validation to further prohibit null chars in region, datacenter, and job name	2020-10-05 18:01:50 +00:00
James Rasell	2ed78b8a7e	cli: move tests to use NewMockUi func.	2020-10-05 16:07:41 +02:00
Kent 'picat' Gruber	5e1c716835	Merge pull request #8998 from hashicorp/keygen-32-bytes Use 32-byte key for gossip encryption to enable AES-256	2020-10-02 17:17:55 -04:00
Kent 'picat' Gruber	b03f79700c	Fix panic in test due to the agent's logger not being initialized yet So a null logger is used to avoid the problem.	2020-10-02 11:10:27 -04:00
Fredrik Hoem Grelland	953d4de8dd	update consul-template to v0.25.1 (#8988 )	2020-10-01 14:08:49 -04:00
Kent 'picat' Gruber	90e85f9add	Fix other usages of initKeyring func to use logger as third argument	2020-10-01 11:13:06 -04:00
Kent 'picat' Gruber	b98bb99dfe	Log AES-128 and AES-192 key sizes during keyring initialization	2020-10-01 11:12:14 -04:00
Michael Schurter	765473e8b0	jobspec: lower min cpu resources from 10->1 Since CPU resources are usually a soft limit it is desirable to allow setting it as low as possible to allow tasks to run only in "idle" time. Setting it to 0 is still not allowed to avoid potential unintentional side effects with allowing a zero value. While there may not be any side effects this commit attempts to minimize risk by avoiding the issue. This does not change the defaults.	2020-09-30 12:15:13 -07:00
Michael Schurter	1544341f09	Merge pull request #8862 from hashicorp/release-0.12.4 Prepare for 0.13 development cycle	2020-09-10 09:14:44 -07:00
Mahmood Ali	d4f385d6e1	Upgrade to golang 1.15 (#8858 ) Upgrade to golang 1.15 Starting with golang 1.5, setting Ctty value result in `Setctty set but Ctty not valid in child` error, as part of https://github.com/golang/go/issues/29458 . This commit lifts the fix in https://github.com/creack/pty/pull/97 .	2020-09-09 15:59:29 -04:00
Nomad Release bot	3b8a2f22dc	Generate files for 0.12.4-rc1 release	2020-09-03 02:59:23 +00:00
Tim Gross	b77fe023b5	MRD: move 'job stop -global' handling into RPC (#8776 ) The initial implementation of global job stop for MRD looped over all the regions in the CLI for expedience. This changeset includes the OSS parts of moving this into the RPC layer so that API consumers don't have to implement this logic themselves.	2020-08-28 14:28:13 -04:00

... 3 4 5 6 7 ...

2108 Commits