open-nomad

Commit Graph

Author	SHA1	Message	Date
Shantanu Gadgil	6cb8c95534	fingerprint kernel architecture name (#13182 )	2022-06-02 15:51:00 -04:00
Seth Hoenig	0399b7e4c5	Merge pull request #12951 from jorgemarey/f-srv-tagged-addresses Allow setting tagged addresses on services	2022-06-01 10:51:49 -05:00
Anthony	d5e084b297	docs: added note about vault -period flag (#13185 )	2022-05-31 14:26:03 -07:00
Seth Hoenig	54efec5dfe	docs: add docs and tests for tagged_addresses	2022-05-31 13:02:48 -05:00
Toyam Cox	06b934ccf6	docs: make the example for 'load' work (#13102 )	2022-05-27 08:48:58 -04:00
Seth Hoenig	c2ba1e2e29	Merge pull request #13125 from hashicorp/b-connect-upstream-namespace connect: enable setting connect upstream destination namespace	2022-05-26 10:29:11 -05:00
Seth Hoenig	4631045d83	connect: enable setting connect upstream destination namespace	2022-05-26 09:39:36 -05:00
Amier Chery	05274c9c9f	Merge pull request #13083 from josegonzalez/patch-1 Update service.check.task definition to match code	2022-05-26 10:38:49 -04:00
Michael Schurter	2965dc6a1a	artifact: fix numerous go-getter security issues Fix numerous go-getter security issues: - Add timeouts to http, git, and hg operations to prevent DoS - Add size limit to http to prevent resource exhaustion - Disable following symlinks in both artifacts and `job run` - Stop performing initial HEAD request to avoid file corruption on retries and DoS opportunities. Approach Since Nomad has no ability to differentiate a DoS-via-large-artifact vs a legitimate workload, all of the new limits are configurable at the client agent level. The max size of HTTP downloads is also exposed as a node attribute so that if some workloads have large artifacts they can specify a high limit in their jobspecs. In the future all of this plumbing could be extended to enable/disable specific getters or artifact downloading entirely on a per-node basis.	2022-05-24 16:29:39 -04:00
PinkLolicorn	83dd9e801e	docs: `mount_flags` takes a slice of strings (#13087 ) The description of `mount_flags` provides incorrect example of the accepted value format. This fixes the issue by changing the example from a string `ro,noatime` to a slice of strings `["ro", "noatime"]`.	2022-05-20 09:16:17 -04:00
Jose Diaz-Gonzalez	fa1077fbcd	docs: correct where task cannot be defined	2022-05-19 21:24:58 -04:00
Jose Diaz-Gonzalez	ea01fe398f	Update service.check.task definition to match code Nomad errors out when attempting to specify a task for a service that uses consul connect but does not have script or gRPC checks. See `304d0cf595/nomad/structs/structs.go (L6643)` for details.	2022-05-19 20:54:49 -04:00
Seth Hoenig	fc58f4972c	cli: correctly use and validate job with vault token set This PR fixes `job validate` to respect '-vault-token', '$VAULT_TOKEN', '-vault-namespace' if set.	2022-05-19 12:13:34 -05:00
Seth Hoenig	65f7abf2f4	cli: update default redis and use nomad service discovery Closes #12927 Closes #12958 This PR updates the version of redis used in our examples from 3.2 to 7. The old version is very not supported anymore, and we should be setting a good example by using a supported version. The long-form example job is now fixed so that the service stanza uses nomad as the service discovery provider, and so now the job runs without a requirement of having Consul running and configured.	2022-05-17 10:24:19 -05:00
Luiz Aoqui	fea13f39b3	docs: add Consul 1.12.0 upgrade notice	2022-05-16 18:44:26 -04:00
Karan Sharma	e0be868b79	docs: Fix typo in sidecar_service (#13021 )	2022-05-16 09:35:42 +02:00
Tim Gross	6e5d6eb3b5	docs: note that already-dispatched jobs cannot be updated (#12973 )	2022-05-12 16:18:42 -04:00
Michael Schurter	5a43d3c675	docs: add `sysbatch` to scheduling internals (#12954 )	2022-05-11 17:06:17 -07:00
Chetan Sarva	14752cd2c0	docs: add version note to nomad services template (#12910 )	2022-05-06 17:39:27 +02:00
Tim Gross	26b9f88ef3	docs: add missing `set_contains_any` constraint docs (#12886 ) This constraint and affinity was added in 0.9.x but was only documented for affinities. Close that documentation gap.	2022-05-05 11:11:05 -04:00
Tim Gross	45b238ec82	CSI: node drain should end once only plugins remain (#12846 ) In #12324 we made it so that plugins wait until the node drain is complete, as we do for system jobs. But we neglected to mark the node drain as complete once only plugins (or system jobs) remaining, which means that the node drain is left in a draining state until the `deadline` time expires. This was incorrectly documented as expected behavior in #12324.	2022-05-03 10:20:22 -04:00
Matus Goljer	a741cc76b5	nomad can also install autocomplete for fish shell (#12834 )	2022-05-02 09:26:55 -04:00
Tim Gross	d06ad50538	docs: clarify `capacity_min/max` for volumes (#12825 ) The capacity fields for `create volume` set bounds on the resulting size of the volume, but the ultimate size of the volume will be determined by the storage provider (between the min and max). Clarify this in the documentation and provide a suggestion for how to set a exact size.	2022-04-29 13:38:30 -04:00
Derek Strickland	584bf0162f	docs: Add known limitations callouts to Max Client Disconnect section (#12801 ) * docs: Add known limitations callouts to Max Client Disconnect section	2022-04-28 16:17:14 -04:00
Tim Gross	c763c4cb96	remove pre-0.9 driver code and related E2E test (#12791 ) This test exercises upgrades between 0.8 and Nomad versions greater than 0.9. We have not supported 0.8.x in a very long time and in any case the test has been marked to skip because the downloader doesn't work.	2022-04-27 09:53:37 -04:00
Michael Schurter	1256c8ef66	docs: update json jobs docs (#12766 ) * docs: update json jobs docs Did you know that Nomad has not 1 but 2 JSON formats for jobs? 2½ if you want to acknowledge that sometimes our JSON job representations have a Job top-level wrapper and sometimes do not. The 2½ formats are: ``` 1. HCL JSON 2. Input API JSON (top-level Job field) 2.5. Output API JSON (lacks top-level Job field) ``` `#2` is what our docs consider our API JSON. `#2.5` seems to be an accident of history we can't fix with breaking API compatibility. `#1` is an even more interesting accident of history: the `jobspec2` package automatically detects if the input to Parse is JSON and switches to a JSON parser. This behavior is undocumented, the format is unspecified, and there is no official HashiCorp tooling to produce this JSON from HCL. The plot thickens when you discover popular third party tools like hcl2json.com and https://github.com/tmccombs/hcl2json seem to produce JSON that `nomad run` accepts! Since we have no telemetry around whether or not anyone passes HCL JSON to `nomad run`, and people don't file bugs around features that Just Work, I'm choosing to leave that code path in place and acknowledged but not suggested in documentation. See https://github.com/hashicorp/hcl/issues/498 for a more comprehensive discussion of what officially supporting HCL JSON in Nomad would look like. (I also added some of the missing fields to the (Input API flavor) JSON Job documentation, but it still needs a lot of work to be comprehensive.) Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-04-22 15:57:27 -07:00
Luiz Aoqui	a8cc633156	vault: revert support for entity aliases (#12723 ) After a more detailed analysis of this feature, the approach taken in PR #12449 was found to be not ideal due to poor UX (users are responsible for setting the entity alias they would like to use) and issues around jobs potentially masquerading itself as another Vault entity.	2022-04-22 10:46:34 -04:00
Seth Hoenig	c4aab10e53	services: cr followup	2022-04-22 09:14:29 -05:00
Seth Hoenig	3fcac242c6	services: enable setting arbitrary address value in service registrations This PR introduces the `address` field in the `service` block so that Nomad or Consul services can be registered with a custom `.Address.` to advertise. The address can be an IP address or domain name. If the `address` field is set, the `service.address_mode` must be set in `auto` mode.	2022-04-22 09:14:29 -05:00
James Rasell	b5d10bcece	docs: add upgrade note for Consul implicit constraint. (#12749 )	2022-04-22 15:53:27 +02:00
James Rasell	046831466c	cli: add pagination flags to service info command. (#12730 )	2022-04-22 10:32:40 +02:00
Michael Schurter	5db3a671db	cli: add -json flag to support job commands (#12591 ) * cli: add -json flag to support job commands While the CLI has always supported running JSON jobs, its support has been via HCLv2's JSON parsing. I have no idea what format it expects the job to be in, but it's absolutely not in the same format as the API expects. So I ignored that and added a new -json flag to explicitly support API style JSON jobspecs. The jobspecs can even have the wrapping {"Job": {...}} envelope or not! * docs: fix example for `nomad job validate` We haven't been able to validate inside driver config stanzas ever since the move to task driver plugins. 😭	2022-04-21 13:20:36 -07:00
Tim Gross	f4287c870d	cli: detect directory when applying namespace spec file (#12738 ) The new `namespace apply` feature that allows for passing a namespace specification file detects the difference between an empty namespace and a namespace specification by checking if the file exists. For most cases, the file will have an extension like `.hcl` and so there's little danger that a user will apply a file spec when they intended to apply a file name. But because directory names typically don't include an extension, you're much more likely to collide when trying to `namespace apply` by name only, and then you get a confusing error message of the form: Failed to read file: read $namespace: is a directory Detect the case where the namespace name collides with a directory in the current working directory, and skip trying to load the directory.	2022-04-21 14:53:45 -04:00
James Rasell	716b8e658b	api: Add support for filtering and pagination to the node list endpoint (#12727 )	2022-04-21 17:04:33 +02:00
Tim Gross	79a9d788d2	docs: fix broken link from `template` to client config (#12733 )	2022-04-21 11:04:04 -04:00
James Rasell	c4195c452a	docs: update HCL2 dynamic example to use block with label. (#12715 )	2022-04-21 10:18:04 +02:00
Seth Hoenig	df587d8263	docs: update documentation with connect acls changes This PR updates the changelog, adds notes the 1.3 upgrade guide, and updates the connect integration docs with documentation about the new requirement on Consul ACL policies of Consul agent default anonymous ACL tokens.	2022-04-18 08:22:33 -05:00
Michael Schurter	70a04dd106	docs: add plan for node rejected details and more (#12564 ) - Moved federation docs to the bottom since everyone is potentially affected by the other sections on the page, but only users of federation are affected by it. - Added section on the plan for node rejected bug since it is fairly easy to diagnose and removing affected nodes is a fairly reliable workaround. - Mention 5s cliff for wait_for_index. - Remove the lie that we do not have job status metrics! How old was that?! - Reinforce the importance of monitoring basic system resources	2022-04-14 16:09:33 -07:00
Seth Hoenig	a1c4f16cf1	connect: prefix tag with nomad.; merge into envoy_stats_tags; update docs This PR expands on the work done in #12543 to - prefix the tag, so it is now "nomad.alloc_id" to be more consistent with Consul tags - merge into pre-existing envoy_stats_tags fields - update the upgrade guide docs - update changelog	2022-04-14 12:52:52 -05:00
Seth Hoenig	a75bc27601	docs: fixup title formatting in upgrade guide	2022-04-08 11:50:54 -05:00
Luiz Aoqui	0190f378a7	docs: fix upgrade specific broken link and conflict tag (#12521 )	2022-04-08 12:36:47 -04:00
James Rasell	6ac5fd9768	docs: add nomad services template jobspec example. (#12514 )	2022-04-08 17:29:19 +02:00
Seth Hoenig	e7aa81d3cb	docs: tweak hcl2 validation example	2022-04-08 08:43:42 -05:00
Thomas Wunderlich	3f6465f078	Add custom variable validation to docs Custom variable validation is a useful feature that is supported by Nomad and not just Terraform. As such it should be documented on the input variable page. I've cribbed the content from the terraform docs so this should be consistent across projects	2022-04-07 19:06:06 -04:00
Jasmine Dahilig	386f2fac3a	docs: add token_last_renewal and token_next_renewal to server metrics and key metrics #12435 (#12505 )	2022-04-07 15:12:41 -07:00
Tim Gross	09b5e8d388	Fix flaky `operator debug` test (#12501 ) We introduced a `pprof-interval` argument to `operator debug` in #11938, and unfortunately this has resulted in a lot of test flakes. The actual command in use is mostly fine (although I've fixed some quirks here), so what's really happened is that the change has revealed some existing issues in the tests. Summary of changes: * Make first pprof collection synchronous to preserve the existing behavior for the common case where the pprof interval matches the duration. * Clamp `operator debug` pprof timing to that of the command. The `pprof-duration` should be no more than `duration` and the `pprof-interval` should be no more than `pprof-duration`. Clamp the values rather than throwing errors, which could change the commands that existing users might already have in debugging scripts * Testing: remove test parallelism The `operator debug` tests that stand up servers can't be run in parallel, because we don't have a way of canceling the API calls for pprof. The agent will still be running the last pprof when we exit, and that breaks the next test that talks to that same agent. (Because you can only run one pprof at a time on any process!) We could split off each subtest into its own server, but this test suite is already very slow. In future work we should fix this "for real" by making the API call cancelable. * Testing: assert against unexpected errors in `operator debug` tests. If we assert there are no unexpected error outputs, it's easier for the developer to debug when something is going wrong with the tests because the error output will be presented as a failing test, rather than just a failing exit code check. Or worse, no failing exit code check! This also forces us to be explicit about which tests will return 0 exit codes but still emit (presumably ignorable) error outputs. Additional minor bug fixes (mostly in tests) and test refactorings: * Fix text alignment on pprof Duration in `operator debug` output * Remove "done" channel from `operator debug` event stream test. The goroutine we're blocking for here already tells us it's done by sending a value, so block on that instead of an extraneous channel * Event stream test timer should start at current time, not zero * Remove noise from `operator debug` test log output. The `t.Logf` calls already are picked out from the rest of the test output by being prefixed with the filename. * Remove explicit pprof args so we use the defaults clamped from duration/interval	2022-04-07 15:00:07 -04:00
Jasmine Dahilig	f67b108f9f	docs: update vault-token note in job run command #8040 (#12385 )	2022-04-06 10:01:38 -07:00
James Rasell	7096fecd10	website: add initial website docs for Nomad service discovery. (#12456 )	2022-04-06 18:51:14 +02:00
Derek Strickland	0ab89b1728	Merge pull request #12476 from hashicorp/f-disconnected-client-allocation-handling disconnected clients: Feature branch merge	2022-04-06 10:11:57 -04:00
Mike Nomitch	7405ebbad1	Add max client disconnect docs (#12467 ) Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>	2022-04-06 08:54:14 -04:00
Tim Gross	5b9772e68f	docs: updates for CSI plugin improvements for 1.3.0 (#12466 )	2022-04-05 17:13:51 -04:00
Derek Strickland	d7f44448e1	disconnected clients: Observability plumbing (#12141 ) * Add disconnects/reconnect to log output and emit reschedule metrics * TaskGroupSummary: Add Unknown, update StateStore logic, add to metrics	2022-04-05 17:12:23 -04:00
Shishir	a6801f73d1	cli: add -quiet to nomad node status command. (#12426 )	2022-04-05 15:53:43 -04:00
Luiz Aoqui	ab7eb5de6e	Support Vault entity aliases (#12449 ) Move some common Vault API data struct decoding out of the Vault client so it can be reused in other situations. Make Vault job validation its own function so it's easier to expand it. Rename the `Job.VaultPolicies` method to just `Job.Vault` since it returns the full Vault block, not just their policies. Set `ChangeMode` on `Vault.Canonicalize`. Add some missing tests. Allows specifying an entity alias that will be used by Nomad when deriving the task Vault token. An entity alias assigns an indentity to a token, allowing better control and management of Vault clients since all tokens with the same indentity alias will now be considered the same client. This helps track Nomad activity in Vault's audit logs and better control over Vault billing. Add support for a new Nomad server configuration to define a default entity alias to be used when deriving Vault tokens. This default value will be used if the task doesn't have an entity alias defined.	2022-04-05 14:18:10 -04:00
Grant Griffiths	18a0a2c9a4	CSI: Add secrets flag support for delete volume (#11245 )	2022-04-05 08:59:11 -04:00
Danish Prakash	e7e8ce212e	command/operator_debug: add pprof interval (#11938 )	2022-04-04 15:24:12 -04:00
Seth Hoenig	f9b0ffafde	Merge pull request #12431 from hashicorp/docs-sysbatch-exists-typo docs: fix typo in system batch description	2022-04-01 09:58:06 -05:00
Seth Hoenig	e9eacb1153	docs: fix typo in system batch description	2022-04-01 09:46:03 -05:00
Tim Gross	8dccc43c2f	docs: remove deprecated client options parameters docs (#12416 ) The client configuration options for drivers have been deprecated since 0.9. We haven't torn them out completely but because they're deprecated it's been hard to guarantee correct behavior. Remove the documentation so that users aren't misled about their viability.	2022-03-31 11:45:51 -04:00
Michael Schurter	cae69ba8ce	Merge pull request #12312 from hashicorp/f-writeToFile template: disallow `writeToFile` by default	2022-03-29 13:41:59 -07:00
Tim Gross	03c1904112	csi: allow `namespace` field to be passed in volume spec (#12400 ) Use the volume spec's `namespace` field to override the value of the `-namespace` and `NOMAD_NAMESPACE` field, just as we do with job spec.	2022-03-29 14:46:39 -04:00
Michael Schurter	33fe04ff6a	template: fix comments and docs Review notes from @lgfa29 Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-03-29 09:25:23 -07:00
Michael Schurter	7a28fcb8af	template: disallow `writeToFile` by default Resolves #12095 by WONTFIXing it. This approach disables `writeToFile` as it allows arbitrary host filesystem writes and is only a small quality of life improvement over multiple `template` stanzas. This approach has the significant downside of leaving people who have altered their `template.function_denylist` still vulnerable! I added an upgrade note, but we should have implemented the denylist as a `map[string]bool` so that new funcs could be denied without overriding custom configurations. This PR also includes a bug fix that broke enabling all consul-template funcs. We repeatedly failed to differentiate between a nil (unset) denylist and an empty (allow all) one.	2022-03-28 17:05:42 -07:00
Shishir	afcce3eea5	Display OS name in nomad node status command. (#12388 ) Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-03-28 09:28:14 -04:00
Hunter Morris	dcaf99dcc1	client: Add AWS EC2 instance-life-cycle from metadata to client fingerprint (#12371 )	2022-03-25 11:50:52 -04:00
Luiz Aoqui	848a3b271f	docs: fix link and add note about Nomad v1.3.0 on raft v3 upgrade (#12378 )	2022-03-25 10:11:46 -04:00
dgotlieb	f53f61c6ce	Add grpc and http2 listeners to gateway docs (#12367 ) Stating at Nomad version 1.2.0 `grpc` and `http2` [protocols are supported](https://github.com/hashicorp/nomad/pull/11187)	2022-03-24 17:09:19 -04:00
Seth Hoenig	987dda3092	Merge pull request #12274 from hashicorp/f-cgroupsv2 client: enable cpuset support for cgroups.v2	2022-03-24 14:22:54 -05:00
Seth Hoenig	113b7eb727	client: cgroups v2 code review followup	2022-03-24 13:40:42 -05:00
Tim Gross	ff1bed38cd	csi: add `-secret` and `-parameter` flag to `volume snapshot create` (#12360 ) Pass-through the `-secret` and `-parameter` flags to allow setting parameters for the snapshot and overriding the secrets we've stored on the CSI volume in the state store.	2022-03-24 10:29:50 -04:00
Seth Hoenig	2e5c6de820	client: enable support for cgroups v2 This PR introduces support for using Nomad on systems with cgroups v2 [1] enabled as the cgroups controller mounted on /sys/fs/cgroups. Newer Linux distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems for Nomad users. Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer, but not so for managing cpuset cgroups. Before, Nomad has been making use of a feature in v1 where a PID could be a member of more than one cgroup. In v2 this is no longer possible, and so the logic around computing cpuset values must be modified. When Nomad detects v2, it manages cpuset values in-process, rather than making use of cgroup heirarchy inheritence via shared/reserved parents. Nomad will only activate the v2 logic when it detects cgroups2 is mounted at /sys/fs/cgroups. This means on systems running in hybrid mode with cgroups2 mounted at /sys/fs/cgroups/unified (as is typical) Nomad will continue to use the v1 logic, and should operate as before. Systems that do not support cgroups v2 are also not affected. When v2 is activated, Nomad will create a parent called nomad.slice (unless otherwise configured in Client conifg), and create cgroups for tasks using naming convention <allocID>-<task>.scope. These follow the naming convention set by systemd and also used by Docker when cgroups v2 is detected. Client nodes now export a new fingerprint attribute, unique.cgroups.version which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by Nomad. The new cpuset management strategy fixes #11705, where docker tasks that spawned processes on startup would "leak". In cgroups v2, the PIDs are started in the cgroup they will always live in, and thus the cause of the leak is eliminated. [1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html Closes #11289 Fixes #11705 #11773 #11933	2022-03-23 11:35:27 -05:00
Tim Gross	60cfeacd76	drainer: defer CSI plugins until last (#12324 ) When a node is drained, system jobs are left until last so that operators can rely on things like log shippers running even as their applications are getting drained off. Include CSI plugins in this set so that Controller plugins deployed as services can be handled as gracefully as Node plugins that are running as system jobs.	2022-03-22 10:26:56 -04:00
Luiz Aoqui	68e5b58007	cli: display Raft version in `server members` (#12317 ) The previous output of the `nomad server members` command would output a column named `Protocol` that displayed the Serf protocol being currently used by servers. This is not a configurable option, so it holds very little value to operators. It is also easy to confuse it with the Raft Protocol version, which is configurable and highly relevant to operators. This commit replaces the previous `Protocol` column with the new `Raft Version`. It also updates the `-detailed` flag to be called `-verbose` so it matches other commands. The detailed output now also outputs the same information as the standard output with the addition of the previous `Protocol` column and `Tags`.	2022-03-17 14:15:10 -04:00
Luiz Aoqui	8db12c2a17	server: transfer leadership in case of error (#12293 ) When a Nomad server becomes the Raft leader, it must perform several actions defined in the establishLeadership function. If any of these actions fail, Raft will think the node is the leader, but it will not actually be able to act as a Nomad leader. In this scenario, leadership must be revoked and transferred to another server if possible, or the node should retry the establishLeadership steps.	2022-03-17 11:10:57 -04:00
Tim Gross	3bf948dc00	docs: clarify `restart` inheritance and add examples (#12275 ) Clarify the behavior of `restart` inheritance with respect to Connect sidecar tasks. Remove incorrect language about the scheduler being involved in restart decisions. Try to make the `delay` mode documentation more clear, and provide examples of delay vs fail.	2022-03-14 15:49:08 -04:00
Tim Gross	5ae30849a9	docs: add note about docker DNS config when using bridge mode (#12229 ) The Docker DNS configuration options are not compatible with a group-level network in `bridge` mode. Warn users about this in the Docker task configuration docs.	2022-03-08 11:59:20 -05:00
Merlin Scholz	68457be72c	docs: elaborate on networking issues with firewalld (#12214 )	2022-03-08 09:49:29 -05:00
Ignacio Torres Masdeu	2793054147	docs: fix examples for set_contains_all and set_contains_any (#12093 )	2022-03-07 13:55:57 -05:00
Michael Schurter	7bb8de68e5	Merge pull request #12138 from jorgemarey/f-ns-meta Add metadata to namespaces	2022-03-07 10:19:33 -08:00
Tim Gross	b94837a2b8	csi: add pagination args to `volume snapshot list` (#12193 ) The snapshot list API supports pagination as part of the CSI specification, but we didn't have it plumbed through to the command line.	2022-03-07 12:19:28 -05:00
Tim Gross	09a7612150	csi: volume snapshot list plugin option is required (#12197 ) The RPC for listing volume snapshots requires a plugin ID. Update the `volume snapshot list` command to find the specific plugin from the provided prefix.	2022-03-07 09:58:29 -05:00
Michael Schurter	69913d6ac5	docs: add meta to namespace docs	2022-03-04 14:18:57 -08:00
James Rasell	6aa741dd16	docs: add note regarding HCLv2 func and interpolation.	2022-03-04 12:06:25 +01:00
Michael Schurter	0f6923c750	Merge pull request #10808 from hashicorp/f-curl cli: add operator api command	2022-03-02 10:12:16 -08:00
Michael Schurter	a8833b7d86	docs: add op api examples	2022-03-01 17:15:26 -08:00
Michael Schurter	72134ef5a7	docs: add op api examples	2022-03-01 17:12:58 -08:00
Michael Schurter	fcf4515875	docs: add op api options	2022-03-01 16:43:53 -08:00
Ashlee M Boyer	c3691a44df	docs: Fixing path for autoscaling/agent/source nav item (#12166 )	2022-03-01 17:24:12 -05:00
Tim Gross	f2a4ad0949	CSI: implement support for topology (#12129 )	2022-03-01 10:15:46 -05:00
Tim Gross	c90e674918	CSI: use HTTP headers for passing CSI secrets (#12144 )	2022-03-01 08:47:01 -05:00
Tim Gross	ca06f6153a	docs: clarify that plugin commands are for CSI only (#12151 )	2022-03-01 07:57:41 -05:00
Jorge Marey	a466f01120	Add metadata to namespaces	2022-02-27 09:09:10 +01:00
Seth Hoenig	5269b2e02f	docs: clairfy advertise.rpc effect The advertise.rpc config option is not intuitive. At first glance you'd assume it works like advertise.http or advertise.serf, but it does not. The current behavior is working as intended, but the documentation is very hard to parse and doesn't draw a clear picture of what the setting actually does. Closes https://github.com/hashicorp/nomad/issues/11075	2022-02-25 16:02:29 -06:00
Michael Schurter	bb3daac628	rename `nomad curl` to `nomad operator api`	2022-02-24 15:52:54 -08:00
Michael Schurter	141db0c562	cli: add curl command Just a hackweek project at this point.	2022-02-24 15:52:54 -08:00
Sander Mol	42b338308f	add go-sockaddr templating support to nomad consul address (#12084 )	2022-02-24 09:34:54 -05:00
Florian Apolloner	3bced8f558	namespaces: allow enabling/disabling allowed drivers per namespace	2022-02-24 09:27:32 -05:00
Seth Hoenig	8e6d97744b	docs: emphasize snapshot before upgrading	2022-02-24 08:22:41 -06:00
Seth Hoenig	de95998faa	core: switch to go.etc.io/bbolt This PR swaps the underlying BoltDB implementation from boltdb/bolt to go.etc.io/bbolt. In addition, the Server has a new configuration option for disabling NoFreelistSync on the underlying database. Freelist option: https://github.com/etcd-io/bbolt/blob/master/db.go#L81 Consul equivelent PR: https://github.com/hashicorp/consul/pull/11720	2022-02-23 14:26:41 -06:00
Tim Gross	246db87a74	CSI: allow for concurrent plugin allocations (#12078 ) The dynamic plugin registry assumes that plugins are singletons, which matches the behavior of other Nomad plugins. But because dynamic plugins like CSI are implemented by allocations, we need to handle the possibility of multiple allocations for a given plugin type + ID, as well as behaviors around interleaved allocation starts and stops. Update the data structure for the dynamic registry so that more recent allocations take over as the instance manager singleton, but we still preserve the previous running allocations so that restores work without racing. Multiple allocations can run on a client for the same plugin, even if only during updates. Provide each plugin task a unique path for the control socket so that the tasks don't interfere with each other.	2022-02-23 15:23:07 -05:00
Mike Nomitch	f3d1cf4dbd	Merge pull request #12065 from hashicorp/docs-add-form-link Adding link to interview form	2022-02-22 11:05:20 -08:00
Luiz Aoqui	02ee075506	docs: update link to `mount` in Docker task driver (#12101 )	2022-02-22 13:39:49 -05:00
Michael Schurter	7494a0c4fd	core: remove all traces of unused protocol version Nomad inherited protocol version numbering configuration from Consul and Serf, but unlike those projects Nomad has never used it. Nomad's `protocol_version` has always been `1`. While the code is effectively unused and therefore poses no runtime risks to leave, I felt like removing it was best because: 1. Nomad's RPC subsystem has been able to evolve extensively without needing to increment the version number. 2. Nomad's HTTP API has evolved extensively without increment `API{Major,Minor}Version`. If we want to version the HTTP API in the future, I doubt this is the mechanism we would choose. 3. The presence of the `server.protocol_version` configuration parameter is confusing since `server.raft_protocol` is an important parameter for operators to consider. Even more confusing is that there is a distinct Serf protocol version which is included in `nomad server members` output under the heading `Protocol`. `raft_protocol` is the only protocol version relevant to Nomad developers and operators. The other protocol versions are either deadcode or have never changed (Serf). 4. If we were to need to version the RPC, HTTP API, or Serf protocols, I don't think these configuration parameters and variables are the best choice. If we come to that point we should choose a versioning scheme based on the use case and modern best practices -- not this 6+ year old dead code.	2022-02-18 16:12:36 -08:00
Luiz Aoqui	110dbeeb9d	Add `go-bexpr` filters to evals and deployment list endpoints (#12034 )	2022-02-16 11:40:30 -05:00
Tiernan	c30b4617aa	interpolate network.dns block on client (#12021 )	2022-02-16 08:39:44 -05:00
Mike Nomitch	8377f5cfe3	Adding link to interview form	2022-02-14 12:38:26 -08:00
James Rasell	926458c5b2	Merge pull request #12053 from marcaurele/fix-typo doc(typo): technical typo in advertised example	2022-02-11 14:27:12 +01:00
Luiz Aoqui	d976e4a19b	docs: add upgrade note and ACL requirements for the job submit endpoint (#12046 )	2022-02-10 15:35:16 -05:00
Marc-Aurèle Brothier	fb80dc57a1	small typo in advertised example	2022-02-10 13:53:05 +01:00
Tim Gross	59c8558969	docs and changelog for `nomad config validate` (#12031 )	2022-02-09 10:20:45 -05:00
Tim Gross	7ad15b2b42	raft: default to protocol v3 (#11572 ) Many of Nomad's Autopilot features require raft protocol version 3. Set the default raft protocol to 3, and improve the upgrade documentation.	2022-02-03 15:03:12 -05:00
James Rasell	a7f569d0e1	docs: add `cores` to client reserved config block.	2022-01-26 15:56:16 +01:00
Dan Norris	160682cf2b	docs: Update volume create/register mount options to use []string example (#11912 ) The examples for `nomad volume create` and `nomad volume register` are not setting `mount_flags` using an array of strings. This fixes the issue by changing the example to be `mount_flags = ["noatime"]`.	2022-01-24 11:34:21 -05:00
Luiz Aoqui	626e633b41	docs: add `nomad.plan.node_rejected` metric (#11860 )	2022-01-18 13:47:20 -05:00
Dave May	330d24a873	cli: Add event stream capture to nomad operator debug (#11865 )	2022-01-17 21:35:51 -05:00
Luiz Aoqui	ed9f277925	docs: update 1.2.0 upgrade note now that the UI ACL is fixed (#11840 )	2022-01-17 11:09:08 -05:00
James Rasell	82b168bf34	Merge pull request #11403 from hashicorp/f-gh-11059 agent/docs: add better clarification when top-level data dir needs setting	2022-01-13 16:41:35 +01:00
Derek Strickland	0a8e03f0f7	Expose Consul template configuration parameters (#11606 ) This PR exposes the following existing`consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza. - `wait` It also exposes the following`consul-template` configuration to Nomad operators in the `{client.template}` stanza. - `max_stale` - `block_query_wait` - `consul_retry` - `vault_retry` - `wait` Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure. - `wait_bounds` Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-01-10 10:19:07 -05:00
Tim Gross	fa64822e49	docs: note that clients need to have ACLs enabled (#11799 ) Client endpoints such as `alloc exec` are enforced on the client if the API client or CLI has "line of sight" to the client. This is already in the Learn guide but having it in the ACL configuration docs would be helpful.	2022-01-07 16:18:41 -05:00
Tim Gross	32f150d469	docs: new scheduler metrics (#11790 ) * Fixed name of `nomad.scheduler.allocs.reschedule` metric * Added new metrics to metrics reference documentation * Expanded definitions of "waiting" metrics * Changelog entry for #10236 and #10237	2022-01-07 09:51:15 -05:00
James Rasell	1f4e100edc	Merge pull request #11762 from hashicorp/b-gh-11681 docs: add 1.2.0 HCLv2 strict parsing upgrade note.	2022-01-04 09:30:09 +01:00
Tim Gross	6b1b3e7ef8	docs: fix attribute name for java version detection (#11764 )	2022-01-03 16:50:25 -05:00
James Rasell	117c79117e	docs: add 1.2.0 HCLv2 strict parsing upgrade note.	2022-01-03 15:41:18 +00:00
Tim Gross	2806dc2bd7	docs/tests for multiple HTTP address config (#11760 )	2022-01-03 10:17:13 -05:00
Kevin Schoonover	5d9a506bc0	agent: support multiple http address in addresses.http (#11582 )	2022-01-03 09:33:53 -05:00
Shishir	65eab35412	Add support for setting pids_limit in docker plugin config. (#11526 )	2021-12-21 13:31:34 -05:00
Tim Gross	b0c3b99b03	scheduler: fix quadratic performance with spread blocks (#11712 ) When the scheduler picks a node for each evaluation, the `LimitIterator` provides at most 2 eligible nodes for the `MaxScoreIterator` to choose from. This keeps scheduling fast while producing acceptable results because the results are binpacked. Jobs with a `spread` block (or node affinity) remove this limit in order to produce correct spread scoring. This means that every allocation within a job with a `spread` block is evaluated against _all_ eligible nodes. Operators of large clusters have reported that jobs with `spread` blocks that are eligible on a large number of nodes can take longer than the nack timeout to evaluate (60s). Typical evaluations are processed in milliseconds. In practice, it's not necessary to evaluate every eligible node for every allocation on large clusters, because the `RandomIterator` at the base of the scheduler stack produces enough variation in each pass that the likelihood of an uneven spread is negligible. Note that feasibility is checked before the limit, so this only impacts the number of _eligible_ nodes available for scoring, not the total number of nodes. This changeset sets the iterator limit for "large" `spread` block and node affinity jobs to be equal to the number of desired allocations. This brings an example problematic job evaluation down from ~3min to ~10s. The included tests ensure that we have acceptable spread results across a variety of large cluster topologies.	2021-12-21 10:10:01 -05:00
Andy Assareh	8ba4e063e2	Mesh Gateway doc enhancements (#11354 ) * Mesh Gateway doc enhancements 1. I believe this line should be corrected to add mesh as one of the choices 2. I found that we are not setting this meta, and it is a required element for wan federation. I believe it would be helpful and potentially time saving to note that right here.	2021-12-20 17:10:44 -05:00
Luiz Aoqui	a46d799f2a	docs: add v1.2.0 upgrade guide about Nomad UI ACL change for job details page (#11689 )	2021-12-16 14:32:20 -05:00
Luiz Aoqui	4b39494cd1	docs: add more references and examples to the `template` block (#11691 )	2021-12-16 14:14:01 -05:00
Tim Gross	f2615992a4	cli: unhide advanced operator raft debugging commands (#11682 ) The `nomad operator raft` and `nomad operator snapshot state` subcommands for inspecting on-disk raft state were hidden and undocumented. Expose and document these so that advanced operators have support for these tools.	2021-12-16 10:32:11 -05:00
Tim Gross	536e3c5282	`nomad eval list` command (#11675 ) Use the new filtering and pagination capabilities of the `Eval.List` RPC to provide filtering and pagination at the command line. Also includes note that `nomad eval status -json` is deprecated and will be replaced with a single evaluation view in a future version of Nomad.	2021-12-15 11:58:38 -05:00
Noel Quiles	235a778a56	website: Copy updates (#11677 )	2021-12-14 16:35:21 -05:00
Tim Gross	a0cf5db797	provide `-no-shutdown-delay` flag for job/alloc stop (#11596 ) Some operators use very long group/task `shutdown_delay` settings to safely drain network connections to their workloads after service deregistration. But during incident response, they may want to cause that drain to be skipped so they can quickly shed load. Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and `nomad job stop` commands that bypasses the delay. This sets a new desired transition state on the affected allocations that the allocation/task runner will identify during pre-kill on the client. Note (as documented here) that using this flag will almost always result in failed inbound network connections for workloads as the tasks will exit before clients receive updated service discovery information and won't be gracefully drained.	2021-12-13 14:54:53 -05:00
Kevin Wang	3e6757f211	feat(website): extract `/plugins` `/tools` docs (#11584 ) Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> Co-authored-by: Mike Nomitch <mnomitch@hashicorp.com>	2021-12-09 14:25:18 -05:00
Lukas W	0e5958d671	CLI: Return non-zero exit code when deployment fails in `nomad run` (#11550 ) * Exit non-zero from run command if deployment fails * Fix typo in deployment monitor introduced in 0edda11	2021-12-09 09:09:28 -05:00
Tim Gross	348f482c94	docs: improve docs for troubleshooting and monitoring scheduler (#11623 ) This changeset adds more specific recommendations as to what metrics to monitor, and what resources should be examined during incident response. It also renames the "Telemetry" section to "Monitoring Nomad" to surface the material better and distinguish it from the "Metric Reference". Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>	2021-12-07 15:52:13 -05:00
James Rasell	d44e5620dd	docs: add license expiry metric to metrics website doc.	2021-12-07 10:31:51 +00:00
Shantanu Gadgil	0838678609	mention `sysbatch` in addition to `batch` (#11587 )	2021-12-06 19:12:03 -05:00
Tim Gross	03e697a69d	scheduler: config option to reject job registration (#11610 ) During incident response, operators may find that automated processes elsewhere in the organization can be generating new workloads on Nomad clusters that are unable to handle the workload. This changeset adds a field to the `SchedulerConfiguration` API that causes all job registration calls to be rejected unless the request has a management ACL token.	2021-12-06 15:20:34 -05:00
Tim Gross	39acac33a0	ui: change Consul/Vault base URL field name (#11589 ) Give ourselves some room for extension in the UI configuration block by naming the field `ui_url`, which will let us have an `api_url`. Fix the template path to ensure we're getting the right value from the API.	2021-11-30 13:20:29 -05:00
Tim Gross	ba038a1ebc	docs: `mount_flags` takes a slice of strings (#11583 ) The `mount_flags` option takes a slice of strings, not a comma-separated string like the flags passed to `mount(8)`.	2021-11-29 10:07:34 -05:00
Luiz Aoqui	0b82d62bc6	docs: document new Prometheus configuration for the Autoscaler APM plugin (#11562 )	2021-11-24 17:37:35 -05:00
Luiz Aoqui	0859eac724	docs: add CLI and config docs for the Autoscaler policy source config (#11559 )	2021-11-24 16:17:37 -05:00
Luiz Aoqui	fa23106612	docs: add upgrade guide notes for Nomad 1.2.2 (#11567 )	2021-11-24 14:24:20 -05:00
Tim Gross	fcb96de9a7	config: UI configuration block with Vault/Consul links (#11555 ) Add `ui` block to agent configuration to enable/disable the web UI and provide the web UI with links to Vault/Consul.	2021-11-24 11:20:02 -05:00
James Rasell	6dddf9a1fb	Merge pull request #11535 from hashicorp/docs-vault-token docs: clarify vault.token only required on servers	2021-11-23 09:26:06 +01:00
James Rasell	751c8217d1	core: allow setting and propagation of eval priority on job de/registration (#11532 ) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and override the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler removing the need to pass invalid requests to the server. The register and deregister opts functions now all for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function inline with the RegisterOpts.	2021-11-23 09:23:31 +01:00
Luiz Aoqui	d3c1a03edd	Version 1.2.1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJhl94SAAoJELC0QQl2hbZ2pqoP/R7HyOxvealo5MBJcG4mGiWT Hsu9VXpYKDWn0GSXd3JmqYWH7tIwFMXispZ7pMlDLieypW3UpMYIbIquaePxOaRL yhlc0CLT7JDsFPx8Puv1fgKXaS3EfFyJlYx437bhCQ+K0k2+1n3EOhrzU/DQ4j8V D5qxlkZh6IK6brIJ54NivGzTxtzGGvIGXCrDPolX3cwoBtyO/pbecfEkRlN2xwxl P68l52+Jit3lK2Cljh4Kr1qFj8voHPjYUTXGas8ZkIVrx9l4fb6CHib2y3hy4bRR qwXT4keWc8bxtLQ7vtetGBAXp4UKJigziE4imhHAttBN9th2/Oy0qSQCNX3xELJC Jwgc+N+ON63QI2sP/8FWvmeUrJpASRITYl/Gr8uOR6n1PacrBhFT9OV4VMkte1ua jS/WF/7k21NZYqZca+thvN12wmw/gSEAEeCHH5kR3vPLeV6FdanhKLjufMNuMShc UKJCEZw1/Lyux1XkLqMPoZ4DCak8/HskupQoLNsekF1Uki8ObU4as7GERedxqkj6 i2+1QIQMqvviskOwT0QOWm4RFXjRQsIK8uUfXzHHWDMzDhvnGjB0eWVMLAj4/rTe 46yUP4kdarFkxwkDmLEyoogdD35wC4Xc8Y8IynzUTN77pOWID5QEyFZVaaBB4NR3 wNowUJGrNkxEYXwGSkjh =Zuw2 -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEElFaq1Z5DKdB91i+lKfRZwNnLtXMFAmGbu3sACgkQKfRZwNnL tXMx4BAAksQ07tSoOku8zDwx2JpoiNApoYhMLlfJ4S3Mw+RYtbayAMRyA08GG56I U85XJB/Z2CzliYL/Nya1e3z6Gyn92V0iD9u7N1xEAPt8PdyiXqIBZn1rWoiCcnMO C3f2aRGhLZMVOZG0v7fgbh1PkhJt4MLcRQE9nn5ojPvFzW9bL0Iz7lc9IxHQtaU0 rANDcXdj3IhiOdEgjtO++Qhdeu3t2SBhT2xFnlJ3gXC2q/aY1a2C7BYdlSxtw0JU nKpxvBTsB7rINGcYxhXZlckui5YLL4BX11XqsYhUTMC+33vxE5HNty1ANc1+SNyO 0iHp0yc5J6MCLuiZ/2sBek2tC+KHCufb+qEIqPmBpcWPJRT8HjginLxj/HyL2TQc pLF9XxhYKvv0sm3Zr3Ima5kqWgayph3XhQ73hKs9f7SLfErr6qr4XaI8egZA4OTG 0QGmY/61UlAdsz5tUvIGRWYD5rqXyXIYnUprldPSQdeZ0o2GjX7T0GZ934O5uHfE Ne73GafGn8JaGxH9+AEHMJAVpkrzWR1wrExL3kGJ8NF40HlsYofIuhTkZqMKX3EH 7KfefSJW1NQAGeAEwjtvzhmUiM0cVoCWGd4COxX1G3oJ0o8gZ3RklDEA4Pa9C0rO pBW/KIckPpGieGvPaA3mqmXDjx6oOaxPi9wd5TniBHh43pgrASo= =KVce -----END PGP SIGNATURE----- Merge tag 'v1.2.1' into merge-release-1.2.1-branch Version 1.2.1	2021-11-22 10:47:04 -05:00
Tim Gross	fc1d4814d9	qemu: add `args_allowlist` to sandbox VM command line inputs The QEMU driver allows arbitrary command line options, but many of these options give access to host resources that operators may not want to expose such as devices. Add an optional allowlist to the plugin configuration so that operators can limit the resources for QEMU.	2021-11-19 11:11:52 -05:00

1 2 3 4 5 ...

520 Commits