open-nomad

Author	SHA1	Message	Date
James Rasell	c55efdd928	docs: add OIDC login API and CLI docs. (#15818 )	2023-01-20 10:07:26 +01:00
Ashlee M Boyer	c75ea79f25	Fixing yaml syntax in frontmatter (#15781 )	2023-01-13 14:06:46 -05:00
Dao Thanh Tung	09b25d71b8	cli: Add a nomad operator client state command (#15469 ) Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>	2023-01-11 10:03:31 -05:00
dgotlieb	a991342f8d	docs: nomad `eval delete` typo fix (#15667 ) Status instead of Stauts	2023-01-03 14:18:03 -05:00
Piotr Kazmierczak	f1450d25d2	ACL Binding Rules CLI documentation (#15584 )	2022-12-22 16:36:25 +01:00
Danish Prakash	dc81568f93	command/job_stop: accept multiple jobs, stop concurrently (#12582 ) * command/job_stop: accept multiple jobs, stop concurrently Signed-off-by: danishprakash <grafitykoncept@gmail.com> * command/job_stop_test: add test for multiple job stops Signed-off-by: danishprakash <grafitykoncept@gmail.com> * improve output, add changelog and docs Signed-off-by: danishprakash <grafitykoncept@gmail.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-12-16 15:46:58 -08:00
Piotr Kazmierczak	f91ab03920	acl: SSO auth methods CLI documentation (#15538 ) This PR provides documentation for the ACL Auth Methods CLI commands. Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2022-12-14 13:35:26 +01:00
Tim Gross	7404ef46e9	docs: update `plugin status` docs with capabilities and topology (#15448 ) The `plugin status` command supports displaying CSI capabilities and topology accessibility, but this was missing from the documentation. Extend the `-verbose` example to show that info.	2022-12-01 12:18:56 -05:00
Jack	62f7de7ed5	cli: `wait` flag for use with `deployment status -monitor` (#15262 )	2022-11-23 16:36:13 -05:00
Lance Haig	0263e7af34	Add command "nomad tls" (#14296 )	2022-11-22 14:12:07 -05:00
Tim Gross	37134a4a37	eval delete: move batching of deletes into RPC handler and state (#15117 ) During unusual outage recovery scenarios on large clusters, a backlog of millions of evaluations can appear. In these cases, the `eval delete` command can put excessive load on the cluster by listing large sets of evals to extract the IDs and then sending larges batches of IDs. Although the command's batch size was carefully tuned, we still need to be JSON deserialize, re-serialize to MessagePack, send the log entries through raft, and get the FSM applied. To improve performance of this recovery case, move the batching process into the RPC handler and the state store. The design here is a little weird, so let's look a the failed options first: * A naive solution here would be to just send the filter as the raft request and let the FSM apply delete the whole set in a single operation. Benchmarking with 1M evals on a 3 node cluster demonstrated this can block the FSM apply for several minutes, which puts the cluster at risk if there's a leadership failover (the barrier write can't be made while this apply is in-flight). * A less naive but still bad solution would be to have the RPC handler filter and paginate, and then hand a list of IDs to the existing raft log entry. Benchmarks showed this blocked the FSM apply for 20-30s at a time and took roughly an hour to complete. Instead, we're filtering and paginating in the RPC handler to find a page token, and then passing both the filter and page token in the raft log. The FSM apply recreates the paginator using the filter and page token to get roughly the same page of evaluations, which it then deletes. The pagination process is fairly cheap (only abut 5% of the total FSM apply time), so counter-intuitively this rework ends up being much faster. A benchmark of 1M evaluations showed this blocked the FSM apply for 20-30ms at a time (typical for normal operations) and completes in less than 4 minutes. Note that, as with the existing design, this delete is not consistent: a new evaluation inserted "behind" the cursor of the pagination will fail to be deleted.	2022-11-14 14:08:13 -05:00
Douglas Jose	345ef0bbec	Fix wrong reference to `vault` (#15228 )	2022-11-14 10:49:09 +01:00
Kevin Wang	d66b2eba43	fix: website broken links (#14904 ) * fix: website broken links * fix up keyring-rotate link Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-10-17 11:32:10 -04:00
Damian Czaja	95f969c4bf	cli: add `nomad fmt` (#14779 )	2022-10-06 17:00:29 -04:00
Giovani Avelar	a625de2062	Allow specification of a custom job name/prefix for parameterized jobs (#14631 )	2022-10-06 16:21:40 -04:00
Derek Strickland	2c4df95e92	Merge pull request #14664 from hashicorp/docs-multiregion-dispatch multiregion: Added a section for multiregion parameterized job dispatch	2022-09-28 15:40:11 -04:00
Derek Strickland	c3d4496287	link from dispatch command	2022-09-28 08:30:22 -04:00
Tim Gross	62b1e2ef97	variables: document restrictions on path and size (#14687 )	2022-09-26 11:40:53 -04:00
Bryce Kalow	a84d2de9be	website: content updates for developer (#14473 ) Co-authored-by: Geoffrey Grosenbach <26+topfunky@users.noreply.github.com> Co-authored-by: Anthony <russo555@gmail.com> Co-authored-by: Ashlee Boyer <ashlee.boyer@hashicorp.com> Co-authored-by: Ashlee M Boyer <43934258+ashleemboyer@users.noreply.github.com> Co-authored-by: HashiBot <62622282+hashibot-web@users.noreply.github.com> Co-authored-by: Kevin Wang <kwangsan@gmail.com>	2022-09-16 10:38:39 -05:00
Tim Gross	357e7f4521	docs: include path in ACL requirements for variables (#14561 ) Also add links to the ACL policy reference and variables concepts docs near the top of the page.	2022-09-13 10:21:29 -04:00
Charlie Voiselle	8eb1689fca	Variables CLI documentation (#14249 )	2022-09-12 16:44:31 -04:00
Tim Gross	9259a373cd	remove root keyring install API (#14514 ) * keyring rotate API should require put/post method * remove keyring install API	2022-09-09 08:50:35 -04:00
Luiz Aoqui	94d7dddccd	cli: set -hcl2-strict to false if -hcl1 is defined (#14426 ) These options are mutually exclusive but, since `-hcl2-strict` defaults to `true` users had to explicitily set it to `false` when using `-hcl1`. Also return `255` when job plan fails validation as this is the expected code in this situation.	2022-09-01 10:42:08 -04:00
James Rasell	986355bcd9	docs: add documentation for ACL token expiration and ACL roles. (#14332 ) The ACL command docs are now found within a sub-dir like the operator command docs. Updates to the ACL token commands to accommodate token expiry have also been added. The ACL API docs are now found within a sub-dir like the operator API docs. The ACL docs now include the ACL roles endpoint as well as updated ACL token endpoints for token expiration. The configuration section is also updated to accommodate the new ACL and server parameters for the new ACL features.	2022-08-31 16:13:47 +02:00
Tim Gross	d7652fdd3a	docs: rename Secure Variables to Variables (#14352 )	2022-08-29 11:37:08 -04:00
Luiz Aoqui	e012d9411e	Task lifecycle restart (#14127 ) * allocrunner: handle lifecycle when all tasks die When all tasks die the Coordinator must transition to its terminal state, coordinatorStatePoststop, to unblock poststop tasks. Since this could happen at any time (for example, a prestart task dies), all states must be able to transition to this terminal state. * allocrunner: implement different alloc restarts Add a new alloc restart mode where all tasks are restarted, even if they have already exited. Also unifies the alloc restart logic to use the implementation that restarts tasks concurrently and ignores ErrTaskNotRunning errors since those are expected when restarting the allocation. * allocrunner: allow tasks to run again Prevent the task runner Run() method from exiting to allow a dead task to run again. When the task runner is signaled to restart, the function will jump back to the MAIN loop and run it again. The task runner determines if a task needs to run again based on two new task events that were added to differentiate between a request to restart a specific task, the tasks that are currently running, or all tasks that have already run. * api/cli: add support for all tasks alloc restart Implement the new -all-tasks alloc restart CLI flag and its API counterpar, AllTasks. The client endpoint calls the appropriate restart method from the allocrunner depending on the restart parameters used. * test: fix tasklifecycle Coordinator test * allocrunner: kill taskrunners if all tasks are dead When all non-poststop tasks are dead we need to kill the taskrunners so we don't leak their goroutines, which are blocked in the alloc restart loop. This also ensures the allocrunner exits on its own. * taskrunner: fix tests that waited on WaitCh Now that "dead" tasks may run again, the taskrunner Run() method will not return when the task finishes running, so tests must wait for the task state to be "dead" instead of using the WaitCh, since it won't be closed until the taskrunner is killed. * tests: add tests for all tasks alloc restart * changelog: add entry for #14127 * taskrunner: fix restore logic. The first implementation of the task runner restore process relied on server data (`tr.Alloc().TerminalStatus()`) which may not be available to the client at the time of restore. It also had the incorrect code path. When restoring a dead task the driver handle always needs to be clear cleanly using `clearDriverHandle` otherwise, after exiting the MAIN loop, the task may be killed by `tr.handleKill`. The fix is to store the state of the Run() loop in the task runner local client state: if the task runner ever exits this loop cleanly (not with a shutdown) it will never be able to run again. So if the Run() loops starts with this local state flag set, it must exit early. This local state flag is also being checked on task restart requests. If the task is "dead" and its Run() loop is not active it will never be able to run again. * address code review requests * apply more code review changes * taskrunner: add different Restart modes Using the task event to differentiate between the allocrunner restart methods proved to be confusing for developers to understand how it all worked. So instead of relying on the event type, this commit separated the logic of restarting an taskRunner into two methods: - `Restart` will retain the current behaviour and only will only restart the task if it's currently running. - `ForceRestart` is the new method where a `dead` task is allowed to restart if its `Run()` method is still active. Callers will need to restart the allocRunner taskCoordinator to make sure it will allow the task to run again. * minor fixes	2022-08-24 17:43:07 -04:00
Tim Gross	bf57d76ec7	allow ACL policies to be associated with workload identity (#14140 ) The original design for workload identities and ACLs allows for operators to extend the automatic capabilities of a workload by using a specially-named policy. This has shown to be potentially unsafe because of naming collisions, so instead we'll allow operators to explicitly attach a policy to a workload identity. This changeset adds workload identity fields to ACL policy objects and threads that all the way down to the command line. It also a new secondary index to the ACL policy table on namespace and job so that claim resolution can efficiently query for related policies.	2022-08-22 16:41:21 -04:00
Seth Hoenig	ba5c45ab93	cli: respect vault token in plan command This PR fixes a regression where the 'job plan' command would not respect a Vault token if set via --vault-token or $VAULT_TOKEN. Basically the same bug/fix as for the validate command in https://github.com/hashicorp/nomad/issues/13062 Fixes https://github.com/hashicorp/nomad/issues/13939	2022-08-11 08:54:08 -05:00
Charlie Voiselle	9a19279f59	Sweep of docs for repeated words; minor edits (#14032 )	2022-08-05 16:45:30 -04:00
Tim Gross	96aea74b4b	docs: keyring commands (#13690 ) Document the secure variables keyring commands, document the aliased gossip keyring commands, and note that the old gossip keyring commands are deprecated.	2022-07-20 14:14:10 -04:00
Tim Gross	fc4cd53cfb	docs: rename Internals to Concepts (#13696 )	2022-07-11 16:55:33 -04:00
Tim Gross	d49ff0175c	docs: move operator subcommands under their own trees (#13677 ) The sidebar navigation tree for the `operator` sub-sub commands is getting cluttered and we have a new set of commands coming to support secure variables keyring as well. Move these all under their own subtrees.	2022-07-11 14:00:24 -04:00
Luiz Aoqui	03433dd8af	cli: improve output of eval commands (#13581 ) Use the same output format when listing multiple evals in the `eval list` command and when `eval status <prefix>` matches more than one eval. Include the eval namespace in all output formats and always include the job ID in `eval status` since, even `node-update` evals are related to a job. Add Node ID to the evals table output to help differentiate `node-update` evals. Co-authored-by: James Rasell <jrasell@hashicorp.com>	2022-07-07 13:13:34 -04:00
James Rasell	0c0b028a59	core: allow deleting of evaluations (#13492 ) * core: add eval delete RPC and core functionality. * agent: add eval delete HTTP endpoint. * api: add eval delete API functionality. * cli: add eval delete command. * docs: add eval delete website documentation.	2022-07-06 16:30:11 +02:00
James Rasell	181b247384	core: allow pausing and un-pausing of leader broker routine (#13045 ) * core: allow pause/un-pause of eval broker on region leader. * agent: add ability to pause eval broker via scheduler config. * cli: add operator scheduler commands to interact with config. * api: add ability to pause eval broker via scheduler config * e2e: add operator scheduler test for eval broker pause. * docs: include new opertor scheduler CLI and pause eval API info.	2022-07-06 16:13:48 +02:00
Luiz Aoqui	6598567725	docs: create volume spec page (#13353 ) In addition to jobs, there are other objects in Nomad that have a specific format and can be provided to commands and API endpoints. This commit creates a new menu section to hold the specification for volumes and update the command pages to point to the new centralized definition. Redirecting the previous entries is not possible with `redirect.js` because they are done server-side and URL fragments are not accessible to detect a match. So we provide hidden anchors with a link to the new page to guide users towards the new documentation. Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-06-14 14:08:25 -04:00
Michael Schurter	f41ea0e5dc	docs: explain behavior of system gc command (#13342 )	2022-06-13 09:54:23 +02:00
Lance Haig	4bf27d743d	Allow Operator Generated bootstrap token (#12520 )	2022-06-03 07:37:24 -04:00
PinkLolicorn	83dd9e801e	docs: `mount_flags` takes a slice of strings (#13087 ) The description of `mount_flags` provides incorrect example of the accepted value format. This fixes the issue by changing the example from a string `ro,noatime` to a slice of strings `["ro", "noatime"]`.	2022-05-20 09:16:17 -04:00
Seth Hoenig	fc58f4972c	cli: correctly use and validate job with vault token set This PR fixes `job validate` to respect '-vault-token', '$VAULT_TOKEN', '-vault-namespace' if set.	2022-05-19 12:13:34 -05:00
Seth Hoenig	65f7abf2f4	cli: update default redis and use nomad service discovery Closes #12927 Closes #12958 This PR updates the version of redis used in our examples from 3.2 to 7. The old version is very not supported anymore, and we should be setting a good example by using a supported version. The long-form example job is now fixed so that the service stanza uses nomad as the service discovery provider, and so now the job runs without a requirement of having Consul running and configured.	2022-05-17 10:24:19 -05:00
Tim Gross	45b238ec82	CSI: node drain should end once only plugins remain (#12846 ) In #12324 we made it so that plugins wait until the node drain is complete, as we do for system jobs. But we neglected to mark the node drain as complete once only plugins (or system jobs) remaining, which means that the node drain is left in a draining state until the `deadline` time expires. This was incorrectly documented as expected behavior in #12324.	2022-05-03 10:20:22 -04:00
Matus Goljer	a741cc76b5	nomad can also install autocomplete for fish shell (#12834 )	2022-05-02 09:26:55 -04:00
Tim Gross	d06ad50538	docs: clarify `capacity_min/max` for volumes (#12825 ) The capacity fields for `create volume` set bounds on the resulting size of the volume, but the ultimate size of the volume will be determined by the storage provider (between the min and max). Clarify this in the documentation and provide a suggestion for how to set a exact size.	2022-04-29 13:38:30 -04:00
Michael Schurter	1256c8ef66	docs: update json jobs docs (#12766 ) * docs: update json jobs docs Did you know that Nomad has not 1 but 2 JSON formats for jobs? 2½ if you want to acknowledge that sometimes our JSON job representations have a Job top-level wrapper and sometimes do not. The 2½ formats are: ``` 1. HCL JSON 2. Input API JSON (top-level Job field) 2.5. Output API JSON (lacks top-level Job field) ``` `#2` is what our docs consider our API JSON. `#2.5` seems to be an accident of history we can't fix with breaking API compatibility. `#1` is an even more interesting accident of history: the `jobspec2` package automatically detects if the input to Parse is JSON and switches to a JSON parser. This behavior is undocumented, the format is unspecified, and there is no official HashiCorp tooling to produce this JSON from HCL. The plot thickens when you discover popular third party tools like hcl2json.com and https://github.com/tmccombs/hcl2json seem to produce JSON that `nomad run` accepts! Since we have no telemetry around whether or not anyone passes HCL JSON to `nomad run`, and people don't file bugs around features that Just Work, I'm choosing to leave that code path in place and acknowledged but not suggested in documentation. See https://github.com/hashicorp/hcl/issues/498 for a more comprehensive discussion of what officially supporting HCL JSON in Nomad would look like. (I also added some of the missing fields to the (Input API flavor) JSON Job documentation, but it still needs a lot of work to be comprehensive.) Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-04-22 15:57:27 -07:00
Luiz Aoqui	a8cc633156	vault: revert support for entity aliases (#12723 ) After a more detailed analysis of this feature, the approach taken in PR #12449 was found to be not ideal due to poor UX (users are responsible for setting the entity alias they would like to use) and issues around jobs potentially masquerading itself as another Vault entity.	2022-04-22 10:46:34 -04:00
James Rasell	046831466c	cli: add pagination flags to service info command. (#12730 )	2022-04-22 10:32:40 +02:00
Michael Schurter	5db3a671db	cli: add -json flag to support job commands (#12591 ) * cli: add -json flag to support job commands While the CLI has always supported running JSON jobs, its support has been via HCLv2's JSON parsing. I have no idea what format it expects the job to be in, but it's absolutely not in the same format as the API expects. So I ignored that and added a new -json flag to explicitly support API style JSON jobspecs. The jobspecs can even have the wrapping {"Job": {...}} envelope or not! * docs: fix example for `nomad job validate` We haven't been able to validate inside driver config stanzas ever since the move to task driver plugins. 😭	2022-04-21 13:20:36 -07:00
Tim Gross	f4287c870d	cli: detect directory when applying namespace spec file (#12738 ) The new `namespace apply` feature that allows for passing a namespace specification file detects the difference between an empty namespace and a namespace specification by checking if the file exists. For most cases, the file will have an extension like `.hcl` and so there's little danger that a user will apply a file spec when they intended to apply a file name. But because directory names typically don't include an extension, you're much more likely to collide when trying to `namespace apply` by name only, and then you get a confusing error message of the form: Failed to read file: read $namespace: is a directory Detect the case where the namespace name collides with a directory in the current working directory, and skip trying to load the directory.	2022-04-21 14:53:45 -04:00
James Rasell	716b8e658b	api: Add support for filtering and pagination to the node list endpoint (#12727 )	2022-04-21 17:04:33 +02:00

1 2 3

130 commits