open-nomad

Author	SHA1	Message	Date
hashicorp-copywrite[bot]	d797da4a3c	[COMPLIANCE] Add Copyright and License Headers (#17596 ) Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>	2023-06-19 12:23:28 -04:00
Phil Renaud	8e41380f72	[ui, deployments] Promote Canary and Unhealthy Allocations in the deployment status panel (#17547 ) * A wild health status appears * autoPromote notification conditions * Legend fixes etc * Acceptance tests for new canary alerts	2023-06-19 12:06:18 -04:00
Luiz Aoqui	cfb3bb517f	np: scheduler configuration updates (#17575 ) * jobspec: rename node pool scheduler_configuration In HCL specifications we usually call configuration blocks `config` instead of `configuration`. * np: add memory oversubscription config * np: make scheduler config ENT	2023-06-19 11:41:46 -04:00
Dao Thanh Tung	b666857076	terraform: fix syntax in Azure example due to deprecated tf resource arguments (#17497 )	2023-06-19 11:26:14 +02:00
dependabot[bot]	0c63019c92	build(deps): bump github.com/stretchr/testify from 1.8.2 to 1.8.4 (#17584 )	2023-06-19 08:21:45 +01:00
Bruce Lok	72e92bc17f	fix typo peers.json (#17538 )	2023-06-19 07:56:51 +01:00
Michael Lange	00e04a4b54	Merge pull request #17573 from hashicorp/f/legacy-openssl UI Dev Tools: Use the legacy openssl provider for backcompat	2023-06-17 10:24:53 -07:00
Michael Lange	3ba7f4dae3	Use the legacy openssl provider for backcompat Node v18 uses a newer version of openssl than webpack 4 is compatible with. This is the quickest fix. The ideal fix would be to upgrade webpack to v5 but the state of Ember, Storybook, and generally just JS dep management makes this not an option.	2023-06-16 17:58:40 -07:00
Luiz Aoqui	d07f9ae2fe	cli: prevent panic if job node pool is nil (#17571 ) If the `nomad` CLI is used to access a cluster running a version that does not include node pools the command will `nil` panic when trying to resolve the job's node pool.	2023-06-16 17:08:36 -04:00
Luiz Aoqui	d5aa72190f	node pools: namespace integration (#17562 ) Add structs and fields to support the Nomad Pools Governance Enterprise feature of controlling node pool access via namespaces. Nomad Enterprise allows users to specify a default node pool to be used by jobs that don't specify one. In order to accomplish this, it's necessary to distinguish between a job that explicitly uses the `default` node pool and one that did not specify any. If the `default` node pool is set during job canonicalization it's impossible to do this, so this commit allows a job to have an empty node pool value during registration but sets to `default` at the admission controller mutator. In order to guarantee state consistency the state store validates that the job node pool is set and exists before inserting it.	2023-06-16 16:30:22 -04:00
Tim Gross	3da948d0c8	node pools: support `node.pool` constraint in scheduler (#17548 ) Although most of the time jobs will be assigned to a single node pool, users may want to set the node pool to "all" and then constrain to a subset of node pools. Add support for setting a constraint like `${node.pool}`.	2023-06-16 13:31:46 -04:00
Seth Hoenig	320bac0ac4	e2e: modernize podman test suite (#17564 ) Use the new style of e2e test for the podman suite ... which is all of one test case that was skipped out. Turn the case back on, and we will add more tests in the near future.	2023-06-16 10:36:17 -05:00
Tim Gross	f411f0c0fb	docs: node pool specification (#17553 )	2023-06-16 10:37:47 -04:00
Seth Hoenig	cafaf2e2ee	e2e: cleanup podman installation in jammy image (#17558 ) * e2e: cleanup podman installation in jammy image The original steps were copied over from the bionic image and do a lot of hoop jumping we do not need anymore. For the moment just hard-code installing the v0.4.2 version of the driver, but I may follow up and modify hc-install to support installing @latest like go itself. * use releases for hc-install	2023-06-15 18:17:31 -05:00
Michael Lange	cac5160aa8	Merge pull request #17516 from hashicorp/f/fix-storybook Fix Storybook	2023-06-15 14:31:40 -07:00
Seth Hoenig	c7b44a57a2	e2e: purge bionic packer image scripts (#17559 ) Bionic is dead, long live the Jammy!	2023-06-15 15:15:01 -05:00
Tim Gross	df366df1cd	docs: fix broken link in variables spec page (#17554 )	2023-06-15 15:57:00 -04:00
Michael Lange	65270115bf	Error free Storybook build	2023-06-15 12:43:15 -07:00
Michael Lange	9635bec8bb	Free your mind of the babel and the packed web ember-cli-storybook and storybook itself have progressed to the point where the DIY configs aren't necessary. It's all swept under the `framework: '@storybook/ember'` config in main.js. Yay!	2023-06-15 12:40:03 -07:00
Michael Lange	2b5c4d982c	Tidy up Storybook related packages It's unfortunate having to point to a hash for ember-cli-storybook, but there hasn't been a release since the environment PR merged. At least this is better than pointing at a fork?	2023-06-15 12:40:03 -07:00
Michael Lange	915ff5b19a	Minimum viable fix for Storybook Stories that used named blocks wouldn't render the named blocks. Evidently this was due to using a customized template renderer that became incompatible when Ember was upgraded.	2023-06-15 12:40:03 -07:00
Phil Renaud	7cfe2d09e0	Deployment history timeline styling (#17524 )	2023-06-15 14:42:32 -04:00
Patric Stout	4767d44b94	Fix DevicesSets being removed when cpusets are reloaded with cgroup v2 (#17535 ) * Fix DevicesSets being removed when cpusets are reloaded with cgroup v2 This meant that if any allocation was created or removed, all active DevicesSets were removed from all cgroups of all tasks. This was most noticeable with "exec" and "raw_exec", as it meant they no longer had access to /dev files. * e2e: add test for verifying cgroups do not interfere with access to devices --------- Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-06-15 09:39:36 -05:00
dependabot[bot]	2856967dda	build(deps-dev): bump webpack from 5.69.1 to 5.86.0 in /ui (#17488 ) Bumps [webpack](https://github.com/webpack/webpack) from 5.69.1 to 5.86.0. - [Release notes](https://github.com/webpack/webpack/releases) - [Commits](https://github.com/webpack/webpack/compare/v5.69.1...v5.86.0) --- updated-dependencies: - dependency-name: webpack dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-06-15 10:39:06 -04:00
Seth Hoenig	3e7007b2a3	tests: set timeout on test-ui (#17549 ) This seems to finish in about 20 minutes, or run for 6+ hours until hitting a default timeout. Set a timeout to 30 minutes so we aren't wasting time and runners.	2023-06-15 09:38:50 -05:00
Tim Gross	524183e2b1	docs: add missing `client.allocs` metrics (#17540 ) The docs were missing counter metrics emitted by the task runner around task state changes.	2023-06-15 09:18:11 -04:00
Luiz Aoqui	bdc7f3305f	rpc: fix log message in Node.UpdateStatus (#17537 )	2023-06-14 16:51:46 -04:00
Tim Gross	5b9322c70a	docs: clarify node pool apply/delete behavior (#17529 )	2023-06-14 15:58:53 -04:00
Tim Gross	dc9fae34ca	node pools: add pool as label on client metrics (#17528 ) This changeset adds the node pool as a label anywhere we're already emitting labels with additional information such as node class or ID about the client.	2023-06-14 15:58:38 -04:00
Tim Gross	5f509b8ce0	cli: fix missing `-quiet` flag for `var init` (#17526 ) The `var init` command was intended to have support for a `-quiet` flag but it was not documented and never parsed.	2023-06-14 14:52:46 -04:00
Tim Gross	736ad3ed32	docs: note namespace apply/delete behaviors, fix metric (#17527 ) This changeset includes some fixes to documentation discovered while working on node pools, but we didn't want to include in the node pool PRs so they can get backported easily: * namespace apply/delete commands are forwarded to the authoritative region * deleting a namespace requires there are no non-terminal jobs in any of the federated regions * fixed a typo in the name of the `nomad.client.allocated.disk` metric	2023-06-14 14:52:06 -04:00
Phil Renaud	7400c37b89	[ui] Job status panel: tooltips on individual allocs (#17514 ) * Tooltip on individual allocs in the panel * Isolate allocation cells to their own component * Tipsy trigger * Aria label for failed-or-lost tooltips * Buildfix * Try adding percy exec back to exam run	2023-06-14 12:45:36 -04:00
Luiz Aoqui	ec80d051d8	client: fix panic on alloc stop in non-Linux environments (#17515 ) Provide a no-op implementation of the drivers.DriverNetworkManager interface to be used by systems that don't support network isolation and prevent panics where a network manager is expected.	2023-06-14 10:22:38 -04:00
James Rasell	70fc0fd701	build: add agent bindata file to copywrite ignore list. (#17507 )	2023-06-14 11:13:59 +01:00
Tim Gross	c1a01697c8	node pools: implement `node pool init` command (#17479 ) Implement a `nomad node pool init` command that generates an example spec file in either HCL or JSON format.	2023-06-13 14:51:29 -04:00
Luiz Aoqui	bc17cffaef	node pool: node pool upsert on multiregion node register (#17503 ) When registering a node with a new node pool in a non-authoritative region we can't create the node pool because this new pool will not be replicated to other regions. This commit modifies the node registration logic to only allow automatic node pool creation in the authoritative region. In non-authoritative regions, the client is registered, but the node pool is not created. The client is kept in the `initializing` status until its node pool is created in the authoritative region and replicated to the client's region.	2023-06-13 11:28:28 -04:00
Tim Gross	952eb2713e	node pools: protect against deleting occupied pools (#17457 ) We don't want to delete node pools that have nodes or non-terminal jobs. Add a check in the `DeleteNodePools` RPC to check locally and in federated regions, similar to how we check that it's safe to delete namespaces.	2023-06-13 09:57:42 -04:00
stswidwinski	9a58474400	conf: Add preemption_config to the server extra HCL keys which should be removed (#17481 ) Add preemption_config to the set of keys which should be pruned from the server config as described in #17480.	2023-06-13 10:48:19 +02:00
Daniel Bennett	fa8b102092	ci: remove circleci (#17502 ) all of our workflows are in GitHub Actions now 🎉	2023-06-12 16:28:19 -05:00
Tim Gross	e8a361310f	node pools: replicate from authoritative region (#17456 ) Upserts and deletes of node pools are forwarded to the authoritative region, just like we do for namespaces, quotas, ACL policies, etc. Replicate node pools from the authoritative region.	2023-06-12 13:24:24 -04:00
dependabot[bot]	d45bb4bab9	build(deps): bump github.com/hashicorp/go-plugin from 1.4.9 to 1.4.10 (#17486 )	2023-06-12 14:22:33 +01:00
Tim Gross	bb7f0edd6a	node pools: prevent panic on upsert during upgrades (#17474 ) Whenever we write a Raft log entry for node pools, we need to first make sure that all servers can safely apply the log without panicking. Gate upsert and delete RPCs on all servers being upgraded to the minimum version.	2023-06-12 09:01:30 -04:00
Tim Gross	e3a37c0b97	replication: fix potential panic during upgrades (#17476 ) If the authoritative region has been upgraded to a version of Nomad that has new replicated objects (such as ACL Auth Methods, ACL Binding Rules, etc.), the non-authoritative regions will start replicating those objects as soon as their leader is upgraded. If a server in the non-authoritative region is upgraded and then becomes the leader before all the other servers in the region have been upgraded, then it will attempt to write a Raft log entry that the followers don't understand. The followers will then panic. Add the same minimum version checks that we do for RPC writes to the leader's replication loop.	2023-06-12 08:53:56 -04:00
dependabot[bot]	8bd3bdab42	build(deps): bump github.com/shoenig/go-m1cpu from 0.1.5 to 0.1.6 (#17487 )	2023-06-12 12:08:16 +01:00
dependabot[bot]	c1f5ffb3bc	build(deps): bump github.com/fatih/color from 1.13.0 to 1.15.0 (#17485 )	2023-06-12 10:44:18 +01:00
Phil Renaud	6a9df6e3ab	[ui] Don't show a service as healthy when its parent alloc is not running (#17465 ) * Fix: dont show a service as healthy when its parent alloc is not running * Test for Health Unknown	2023-06-09 15:43:11 -04:00
Piotr Kazmierczak	57dad0ca07	docs: corrections and additional information for OIDC-related concepts (#17470 )	2023-06-09 16:50:22 +02:00
Piotr Kazmierczak	0a4052ece5	docs: add missing login API endpoint documentation (#17467 )	2023-06-09 15:59:01 +02:00
Seth Hoenig	557a6b4a5e	docker: stop network pause container of lost alloc after node restart (#17455 ) This PR fixes a bug where the docker network pause container would not be stopped and removed in the case where a node is restarted, the alloc is moved to another node, and the node comes back up. See the issue below for full repro conditions. Basically in the DestroyNetwork PostRun hook we would depend on the NetworkIsolationSpec field not being nil - which is only the case if the Client stays alive all the way from network creation to network teardown. If the node is rebooted we lose that state and previously would not be able to find the pause container to remove. Now, we manually find the pause container by scanning them and looking for the associated allocID. Fixes #17299	2023-06-09 08:46:29 -05:00
Phil Renaud	944f30674d	[ui] Parallelize ember tests (#17442 ) * Exam to parallelize tests * Logging to try to solve test flakiness * Logging in another failure * Hardening for one test and snapshot for another * Explicitly set the first one as the servicedAlloc instead of randomly picking * A wild CircleCI test failure appears * de-log	2023-06-07 17:01:35 -04:00
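
The node pool handling described in the `node pools: namespace integration (#17562 )` entry above can be illustrated with a minimal Go sketch. It is not Nomad's code: the `Job` struct, `mutateNodePool`, `validateForInsert`, and the `pools` map are hypothetical names chosen for illustration. It only shows the ordering the commit message describes: the job may carry an empty node pool at registration, a mutator fills in `default`, and the state store refuses the insert unless the pool is set and exists.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultNodePool is the pool name the commit message refers to as `default`.
const defaultNodePool = "default"

// Job is a hypothetical, stripped-down stand-in for a job; only the
// fields needed for this sketch are present.
type Job struct {
	ID       string
	NodePool string
}

// mutateNodePool stands in for the admission controller mutator: an
// empty node pool means "not specified", so it is set to the default.
// A job that explicitly asked for "default" is left untouched.
func mutateNodePool(job *Job) {
	if job.NodePool == "" {
		job.NodePool = defaultNodePool
	}
}

var errNodePoolNotSet = errors.New("job node pool is not set")

// validateForInsert stands in for the state store check: the node pool
// must be set and must exist before the job is inserted.
func validateForInsert(job *Job, pools map[string]bool) error {
	if job.NodePool == "" {
		return errNodePoolNotSet
	}
	if !pools[job.NodePool] {
		return fmt.Errorf("node pool %q does not exist", job.NodePool)
	}
	return nil
}

func main() {
	pools := map[string]bool{defaultNodePool: true, "gpu": true}
	job := &Job{ID: "example"}

	// Before the mutator runs, the state store check rejects the job.
	fmt.Println("before mutation:", validateForInsert(job, pools))

	// After the mutator runs, the job uses the default pool and passes.
	mutateNodePool(job)
	fmt.Println("after mutation:", job.NodePool, validateForInsert(job, pools))
}
```

Keeping the empty value until the mutator runs is what lets a caller tell "no pool specified" apart from "explicitly `default`", which is the distinction the commit message says is lost if the default is applied during canonicalization.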


24847 commits