open-nomad

Author	SHA1	Message	Date
Tim Gross	b9ca3bc9b1	build: remove 386 builds for Nomad 1.6.0 (#17239 ) The 32-bit Intel builds (aka "386") are not tested and likely have bugs involving platform-sized integers when operated at any non-trivial scale. Remove these builds from the upcoming Nomad 1.6.0 and provide recommendations in the upgrade notes for those users who might have hobbyist boards running 32-bit ARM (this will primarily be the RaspberryPi Zero or older spins of the RaspPi). DO NOT BACKPORT TO 1.5.x OR EARLIER!	2023-05-22 13:27:17 -04:00
Charlie Voiselle	fc313b7f8f	[api] Return a shapely error for unexpected response (#16743 ) * Add UnexpectedResultError to nomad/api This allows users to perform additional status-based behavior by rehydrating the error using `errors.As` inside of consumers.	2023-05-22 11:45:31 -04:00
Lance Haig	568da5918b	cli: tls certs not created with correct SANs (#16959 ) The `nomad tls cert` command did not create certificates with the correct SANs for them to work with non-default domain and region names. This changeset updates the code to support non-default domains and regions in the certificates.	2023-05-22 09:31:56 -04:00
Roberto Hidalgo	2f702a9f11	allow periodic jobs to use workload identity ACL policies (#17018 ) When resolving ACL policies, we were not using the parent ID for the policy lookup for dispatch/periodic jobs, even though the claims were signed for that parent ID. As a result, all calls to the Task API (and other WI-authenticated API calls) from a periodically-dispatched job failed with 403. Fix this by using the parent job ID whenever it's available.	2023-05-22 09:19:16 -04:00
Tim Gross	9838349c23	document which fields can be updated by `volume register` (#17249 ) The `volume register` command can update a small subset of the volume's fields in-place, with some restrictions depending on whether the volume is currently in use. Document these in the `volume register` command docs and the volume specification docs. Fixes: #17247	2023-05-22 09:15:25 -04:00
dependabot[bot]	31a38d750b	build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.6 in /api (#17178 ) * build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.5 in /api * deps: update shoenig/test to v0.6.5 * deps: update again to v0.6.6 --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-05-22 07:53:12 -05:00
Phil Renaud	48f5af77ce	Updates static JS/UI assets for upcoming 1.6 release (#17263 )	2023-05-19 19:03:37 -04:00
Phil Renaud	46239e1e80	[ui, deployments] Add status panel to child jobs (#17217 ) * Treated same-route as sub-route and didn't cancel watchers * Adds panel to child jobs and sub-sorts * removed the safety check in module-for-job tests * [ui] Adds status panel to Sysbatch jobs (#17243) * In working out periodic/param child jobs, realized the intersection with sysbatch is high enough that it ought to be worked on now * Further removal of jobclientstatussummary * Explicitly making mocked jobs in no-deployment mode * remove last remnants of job-client-status-summary component * Screwed up my sorting order a few commits ago; this corrects it * noActiveDeployment gonna be the death of me	2023-05-19 15:51:35 -04:00
Tim Gross	4881f2451a	docs: describe the default Workload Identity ACL policy (#17245 ) Workload Identities have an implicit default policy. This policy can't currently be described via HCL because it includes task interpolation for Variables and access to the Services API (which doesn't exist as its own ACL capability). Describe this in our WI documentation. Fixes: #16277	2023-05-19 11:38:05 -04:00
Tim Gross	15fa0277cf	build: pin semgrep action (#17248 ) The file path in the TSCCR repo for the `returntocorp/semgrep` action was incorrect, so the pinning tool was not able to find the correct entry and it was not pinned in #17238. The repository is fixed in https://github.com/hashicorp/security-tsccr/pull/431	2023-05-19 10:27:51 -04:00
Tim Gross	dbfcfbd623	build: move GitHub actions to versions allowed by prodsec (#17238 ) The `backspace/ember-asset-size` action we're using is unmaintained and has a bunch of vulns in it, so it won't pass security screening (this is a NodeJS action so it has piles of dependencies, 99% of which won't be in use but fails automated screening anyways). Move this to the upstream version. The `machine-learning-apps/pr-comment` action also presents a problem for the ProdSec security screening because it's archived and also runs an external Docker image. Move this to a likely-ok maintained action for now, until we can spare some time to remove this in favor of something more reasonable that isn't a GitHub action.	2023-05-19 09:07:02 -04:00
Phil Renaud	0b729e4bb9	Fixes to scheduling-filtering-in-ui (#17244 )	2023-05-18 17:38:34 -04:00
Yethal	4073987de3	cli: show leader status in json output of server members (#17138 )	2023-05-18 16:43:57 -04:00
Phil Renaud	7e56ca62d1	[ui] Adds a "Scheduling" filter to the job.allocations page (#17227 ) * Basic filter concept * Make sure NextAllocation gets sent up with allocation stub	2023-05-18 16:24:41 -04:00
Luiz Aoqui	beba92ab36	Post 1.5.5 release (#17241 ) * Generate files for 1.5.5 release * Prepare for next release --------- Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>	2023-05-18 14:06:56 -04:00
Jai	b0b0b22f0f	ui: add option to filter for jobs that are packs (#17226 ) * refact: update job model * refact: update view layer * refact: update test	2023-05-18 12:47:11 -04:00
Bram Vogelaar	3b40f778e5	agent: display node id on start up for servers (#17084 ) Signed-off-by: Bram Vogelaar <bram@attachmentgenie.com>	2023-05-18 11:23:12 -04:00
Tim Gross	fe29cf8b7b	logs: fix `logs.disabled` on Windows (#17199 ) On Windows the executor returns an error when trying to open the `NUL` device when we pass it `os.DevNull` for the stdout/stderr paths. Instead of opening the device, use the discard pipe so that we have platform-specific behavior from the executor itself. Fixes: #17148	2023-05-18 09:14:39 -04:00
James Rasell	96f7c84e4e	variable: fixup metadata copy comment and remove unrequired type. (#17234 )	2023-05-18 13:49:41 +01:00
Phil Renaud	552dc06b1d	[ui] Latest Deployment component removed (#17192 ) * Latest Deployment component removed * Integration test selector update	2023-05-17 16:59:42 -04:00
Mike Nomitch	6df2160e69	docs: add documentation on ephemeral disk and logs (#15829 )	2023-05-17 16:58:11 -04:00
Roman Zipp	edf83f432a	docs: remove unneeded brackets from job specification template docs (#17219 )	2023-05-17 16:45:00 -04:00
Phil Renaud	50a35143c9	[ui, deployments] Fix a bug where watchers on a parent (periodic) job would continue on a child route (#17214 ) * Treated same-route as sub-route and didn't cancel watchers * Changelog	2023-05-17 16:36:15 -04:00
Tim Gross	c64efa0776	build: upgrade deprecated actions syntax (#17222 ) Missed these in the previous pass.	2023-05-17 11:39:55 -04:00
hashicorp-tsccr[bot]	aec3b16085	build: trusted workflow pinning (#16992 ) Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-05-17 10:38:10 -04:00
Tim Gross	5fc63ace0b	scheduler: count implicit spread targets as a single target (#17195 ) When calculating the score in the `SpreadIterator`, the score boost is proportional to the difference between the current and desired count. But when there are implicit spread targets, the current count is the sum of the possible implicit targets, which results in incorrect scoring unless there's only one implicit target. This changeset updates the `propertySet` struct to accept a set of explicit target values so it can detect when a property value falls into the implicit set and should be combined with other implicit values. Fixes: #11823	2023-05-17 10:25:00 -04:00
Tim Gross	710afecf61	build: update deprecated GitHub Actions (#17218 ) Many of the GitHub Actions from the build pipeline are written in a truly ancient version of NodeJS. Upgrade to more recent versions. Remove RelEng from codeowners	2023-05-17 08:57:28 -04:00
Phil Renaud	817b2cab9b	[ui] Remove unnecessary subnav for parent jobs (#17190 ) * Nest job subnav in a parent-check * Move the evaluation tab test to within an else of the children condition	2023-05-16 16:07:12 -04:00
Tim Gross	2426aae832	scheduler: prevent -Inf in spread scoring (#17198 ) When spread targets have a percent value of zero it's possible for them to return -Inf scoring because of a float divide by zero. This is very hard for operators to debug because the string "-Inf" is returned in the API and that breaks the presentation of debugging data. Most scoring iterators are bracketed to -1/+1, but spread iterators are not, so that they can handle greatly unbalanced scoring. That means we can't simply return a -1 score without generating a score that might be greater than the negative scores set by other spread targets. Instead, track the lowest-seen spread boost and use that as the spread boost for any cases where we'd divide by zero. Fixes: #8863	2023-05-16 16:01:32 -04:00
dependabot[bot]	7a92c7b5ac	build(deps-dev): bump prettier from 2.2.1 to 2.8.8 in /website (#16965 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-16 12:12:53 -05:00
Seth Hoenig	e04ff0d935	client: ignore restart issued to terminal allocations (#17175 ) * client: ignore restart issued to terminal allocations This PR fixes a bug where issuing a restart to a terminal allocation would cause the allocation to run its hooks anyway. This was particularly apparent with group_service_hook, which would then register services but never deregister them, as the allocation would be effectively in a "zombie" state where it is prepped to run tasks but never will. * e2e: add e2e test for alloc restart zombies * cl: tweak text Co-authored-by: Tim Gross <tgross@hashicorp.com> --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-05-16 10:19:41 -05:00
Tim Gross	6814e8e6d9	drivers: make internal `DisableLogCollection` capability public (#17196 ) The `DisableLogCollection` capability was introduced as an experimental interface for the Docker driver in 0.10.4. The interface has been stable and allowing third-party task drivers the same capability would be useful for those drivers that don't need the additional overhead of logmon. This PR only makes the capability public. It doesn't yet add it to the configuration options for the other internal drivers. Fixes: #14636 #15686	2023-05-16 09:16:03 -04:00
Piotr Kazmierczak	fe272c3686	refactor acl.UpsertTokens to avoid unnecessary RPC calls. (#17194 ) New RPC endpoints introduced during OIDC and JWT auth perform unnecessarily many RPC calls when they upsert generated ACL tokens, as pointed out by @tgross. This PR moves the common logic from the acl.UpsertTokens method into a helper method that sidesteps authentication, metrics, etc.	2023-05-16 09:31:51 +02:00
Michael Schurter	ab11a96181	remove unused helper/fields package (#17197 ) Has been unused since we switched task drivers to plugins and used hclschema for config in #4936 (v0.9.0-beta1)	2023-05-15 12:10:11 -07:00
dependabot[bot]	d980e0a815	build(deps-dev): bump @hashicorp/platform-content-conformance (#17030 ) Bumps @hashicorp/platform-content-conformance from 0.0.10 to 0.0.11. --- updated-dependencies: - dependency-name: "@hashicorp/platform-content-conformance" dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-15 11:28:03 -04:00
Luiz Aoqui	389212bfda	node pool: initial base work (#17163 ) Implementation of the base work for the new node pools feature. It includes a new `NodePool` struct and its corresponding state store table. Upon start the state store is populated with two built-in node pools that cannot be modified nor deleted: * `all` is a node pool that always includes all nodes in the cluster. * `default` is the node pool where nodes that don't specify a node pool in their configuration are placed.	2023-05-15 10:49:08 -04:00
dependabot[bot]	f49eb3278b	build(deps-dev): bump next from 12.3.1 to 13.4.2 in /website (#17177 ) Bumps [next](https://github.com/vercel/next.js) from 12.3.1 to 13.4.2. - [Release notes](https://github.com/vercel/next.js/releases) - [Changelog](https://github.com/vercel/next.js/blob/canary/release.js) - [Commits](https://github.com/vercel/next.js/compare/v12.3.1...v13.4.2) --- updated-dependencies: - dependency-name: next dependency-type: direct:development update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-15 10:46:40 -04:00
Mark Lewis	1729c955d2	Update delete.mdx (#17184 ) Fix typo	2023-05-15 13:31:52 +01:00
Phil Renaud	a36af08ea6	[ui, deployments] Deployment history on steady state (#17167 ) * Deployment history on steady state * de-hds after chat	2023-05-12 16:49:40 -04:00
Tim Gross	d018fcbff7	allocrunner: provide factory function so we can build mock ARs (#17161 ) Tools like `nomad-nodesim` are unable to implement a minimal implementation of an allocrunner so that we can test the client communication without having to lug around the entire allocrunner/taskrunner code base. The allocrunner was implemented with an interface specifically for this purpose, but there were circular imports that made it challenging to use in practice. Move the AllocRunner interface into an inner package and provide a factory function type. Provide a minimal test that exercises the new function so that consumers have some idea of what the minimum implementation required is.	2023-05-12 13:29:44 -04:00
Phil Renaud	9a5d67d475	[ui] Keyboard shortcuts to switch regions (#17169 ) * Regions keynav * Don't show if you only have a single region (global by default)	2023-05-12 11:46:00 -04:00
Jai	e55edf58ff	chore: add percy tests (#17157 )	2023-05-12 09:57:22 -04:00
Jai	27f0d104e5	16664/upgrade (#17158 ) * chore: upgrade Upgrade @babel/helper-string-parserprop-types * chore: add resolution * chore: update component API for breaking changes * chore: update arguments * api: forgive user for pass wrong args * chore: update tests * chore: update yarn lock * chore: upgrade to Glimmer component * styling: add properties to component invocation * chore: add inset styles	2023-05-12 09:54:13 -04:00
Jai	ce29e55b7a	chore: write js doc (#17156 ) * chore: write jsdoc comments * chore: update comments	2023-05-11 15:30:54 -04:00
Seth Hoenig	81e36b3650	core: eliminate second index on job_submissions table (#17146 ) * core: eliminate second index on job_submissions table This PR refactors the job_submissions state store code to eliminate the use of a second index formerly used for purging all versions of a given job. In practice we ended up with duplicate entries on the table. Instead, use index prefix scanning on the primary index and tidy up any potential for creating (or removing) duplicates. * core: pr comments followup	2023-05-11 09:51:08 -05:00
Phil Renaud	a910e4be1c	[ui, deployments] Denominator based on completed allocations for batch jobs (#17147 ) * Denominator based on completed allocations for batch jobs * Test for denominatored batch job change	2023-05-11 10:23:23 -04:00
Tim Gross	9ed75e1f72	client: de-duplicate alloc updates and gate during restore (#17074 ) When client nodes are restarted, all allocations that have been scheduled on the node have their modify index updated, including terminal allocations. There are several contributing factors: * The `allocSync` method that updates the servers isn't gated on first contact with the servers. This means that if a server updates the desired state while the client is down, the `allocSync` races with the `Node.ClientGetAlloc` RPC. This will typically result in the client updating the server with "running" and then immediately thereafter "complete". * The `allocSync` method unconditionally sends the `Node.UpdateAlloc` RPC even if it's possible to assert that the server has definitely seen the client state. The allocrunner may queue-up updates even if we gate sending them. So then we end up with a race between the allocrunner updating its internal state to overwrite the previous update and `allocSync` sending the bogus or duplicate update. This changeset adds tracking of server-acknowledged state to the allocrunner. This state gets checked in the `allocSync` before adding the update to the batch, and updated when `Node.UpdateAlloc` returns successfully. To implement this we need to be able to equality-check the updates against the last acknowledged state. We also need to add the last acknowledged state to the client state DB, otherwise we'd drop unacknowledged updates across restarts. The client restart test has been expanded to cover a variety of allocation states, including allocs stopped before shutdown, allocs stopped by the server while the client is down, and allocs that have been completely GC'd on the server while the client is down. I've also bench tested scenarios where the task workload is killed while the client is down, resulting in a failed restore. Fixes #16381	2023-05-11 09:05:24 -04:00
Seth Hoenig	4abb3e03ca	cli: upload var file(s) content on job submission (#17128 ) This PR makes it so that the content of any -var-file files is uploaded to Nomad on job run.	2023-05-11 08:04:33 -05:00
Jai	24afd86cc5	ui: add sign-in link on err page (#17140 )	2023-05-11 08:24:58 -04:00
Luiz Aoqui	d800dc3367	deps: update go-metrics to prevent panic (#17133 ) nomad#15861 describes intermittent panics caused by the go-metrics Prometheus client. We have not been able to further debug this problem due to the lack of information when the panic happens. go-metrics#146 prevents the panic from happening and also logs additional information that can help us understand the root cause of the problem. This commit pins the go-metrics dependency to this branch until we can better debug the issue.	2023-05-10 21:33:15 -04:00

24789 commits