open-nomad

Author	SHA1	Message	Date
Tim Gross	bcd26f8815	docker_logger: reorder imports to save memory (#14875 ) Nomad runs one logmon process and also one docker_logger process for each running allocation. A naive look at memory usage shows 10-30 MB of RSS, but a closer look shows that most of this memory (ex. all but ~2MB for logmon) is shared (`Shared_Clean` in Linux pmap). But a heap dump of docker_logger shows that it currently has an extra ~2500 KiB of heap (anonymously-mapped unshared memory) used for init blocks coming from the agent code (ex. mostly regexes from go-version, structs, and the Consul SDK). The packages for running logmon, docker_logger, and executor have an init block that parses `os.Args` to drop into their own logic, which prevents them from loading all the rest of the agent code and saves on memory, so this was unexpected. It looks like we accidentally reordered the imports in main to undo some of the work originally done in 404d2d4c98f1df930be1ae9852fe6e6ae8c1517e. This changeset restores the ordering. A follow-up heap dump shows this saves ~2MB of unshared RSS per docker_logger process.	2022-10-11 13:23:03 -04:00
Michael Schurter	45ce8c13cf	client: remove unused LogOutput and LogLevel (#14867 ) * client: remove unused LogOutput * client: remove unused config.LogLevel	2022-10-11 09:24:40 -07:00
Seth Hoenig	ba1e337f8b	helpers: lockfree lookup of nobody user on unix systems (#14866 ) * helpers: lockfree lookup of nobody user on linux and darwin This PR continues the nobody user lookup saga, by making the nobody user lookup lock-free on linux and darwin. By doing the lookup in an init block this originally broke on Windows, where we must avoid doing the lookup at all. We can get around that breakage by only doing the lookup on linux/darwin where the nobody user is going to exist. Also return the nobody user by value so that a copy is created that cannot be modified by callers of Nobody(). * helper: move nobody code into unix file	2022-10-11 08:38:05 -05:00
Seth Hoenig	1593963cd1	servicedisco: implicit constraint for nomad v1.4 when using nsd checks (#14868 ) This PR adds a jobspec mutator to constrain jobs making use of checks in the nomad service provider to nomad clients of at least v1.4.0. Before, in a mixed client version cluster it was possible to submit an NSD job making use of checks and for that job to land on an older, incompatible client node. Closes #14862	2022-10-11 08:21:42 -05:00
Seth Hoenig	69ced2a2bd	services: remove assertion on 'task' field being set (#14864 ) This PR removes the assertion around when the 'task' field of a check may be set. Starting in Nomad 1.4 we automatically set the task field on all checks in support of the NSD checks feature. This is causing validation problems elsewhere, e.g. when a group service using the Consul provider sets 'task' it will fail validation that worked previously. The assertion of leaving 'task' unset was only about making sure job submitters weren't expecting some behavior, but in practice is causing bugs now that we need the task field for more than it was originally added for. We can simply update the docs, noting when the task field set by job submitters actually has value.	2022-10-10 13:02:33 -05:00
Seth Hoenig	5e38a0e82c	cleanup: rename Equals to Equal for consistency (#14759 )	2022-10-10 09:28:46 -05:00
Seth Hoenig	0e702aec00	build: move imports into the transitive require block (#14863 )	2022-10-10 09:27:55 -05:00
Phil Renaud	e771b94164	[ui] Makes service tags wrap and look like tag items (#14834 ) * Makes service tags wrap and look like tag items * Add a little vertical spacing and changelog * Put client before tags * Force tags list to new line	2022-10-07 09:23:52 -04:00
Tim Gross	91d4ccd905	Dependency updates from dependabot (#14844 ) * build(deps): bump github.com/opencontainers/runc from 1.1.3 to 1.1.4 Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.1.3 to 1.1.4. - [Release notes](https://github.com/opencontainers/runc/releases) - [Changelog](https://github.com/opencontainers/runc/blob/v1.1.4/CHANGELOG.md) - [Commits](https://github.com/opencontainers/runc/compare/v1.1.3...v1.1.4) --- updated-dependencies: - dependency-name: github.com/opencontainers/runc dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * build(deps): bump github.com/prometheus/client_golang Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.12.0 to 1.13.0. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md) - [Commits](https://github.com/prometheus/client_golang/compare/v1.12.0...v1.13.0) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * build(deps): bump github.com/mattn/go-colorable from 0.1.12 to 0.1.13 Bumps [github.com/mattn/go-colorable](https://github.com/mattn/go-colorable) from 0.1.12 to 0.1.13. - [Release notes](https://github.com/mattn/go-colorable/releases) - [Commits](https://github.com/mattn/go-colorable/compare/v0.1.12...v0.1.13) --- updated-dependencies: - dependency-name: github.com/mattn/go-colorable dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * build(deps): bump github.com/google/go-cmp from 0.5.8 to 0.5.9 Bumps [github.com/google/go-cmp](https://github.com/google/go-cmp) from 0.5.8 to 0.5.9. - [Release notes](https://github.com/google/go-cmp/releases) - [Commits](https://github.com/google/go-cmp/compare/v0.5.8...v0.5.9) --- updated-dependencies: - dependency-name: github.com/google/go-cmp dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * build(deps): bump github.com/miekg/dns from 1.1.41 to 1.1.50 Bumps [github.com/miekg/dns](https://github.com/miekg/dns) from 1.1.41 to 1.1.50. - [Release notes](https://github.com/miekg/dns/releases) - [Changelog](https://github.com/miekg/dns/blob/master/Makefile.release) - [Commits](https://github.com/miekg/dns/compare/v1.1.41...v1.1.50) --- updated-dependencies: - dependency-name: github.com/miekg/dns dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-10-07 09:09:28 -04:00
Damian Czaja	95f969c4bf	cli: add `nomad fmt` (#14779 )	2022-10-06 17:00:29 -04:00
Phil Renaud	4b93a30225	[ui] Line charts: explicitly update X-axis whenever xScale changes (#14814 ) * Explicitly update X-axis whenever xScale changes * Changelog	2022-10-06 16:59:16 -04:00
Hemanth Krishna	e516fc266f	enhancement: UpdateTask when Task is waiting for ShutdownDelay (#14775 ) Signed-off-by: Hemanth Krishna <hkpdev008@gmail.com>	2022-10-06 16:33:28 -04:00
Will Jordan	8ae13208c9	Allow jobs not requiring any network resources (#14300 ) Jobs not requiring any network resources should be allowed even when the network fingerprinter is disabled.	2022-10-06 16:25:41 -04:00
Gabriel Villalonga Simon	b974c32ba6	Check that JobPlanResponse Diff Type is None before checking for changes on getExitCode (#14492 )	2022-10-06 16:23:22 -04:00
Pablo Ruiz García	40416be7b1	Invoke FingerprintManager's Reload() func during agent's SIGHUP (#14615 ) Fixes #14614	2022-10-06 16:22:59 -04:00
Giovani Avelar	a625de2062	Allow specification of a custom job name/prefix for parameterized jobs (#14631 )	2022-10-06 16:21:40 -04:00
Tim Gross	6263c8b323	lock closed issues and PRs after 120 days (#14824 ) When community members comment on long-closed issues, there's a number of failure modes that make for a bad experience for them: * Their comments are often missed entirely because notification settings make it impractical for most developers to read comments on inactive issues. * In our experience, the problem is only rarely a regression; because failures are complex, totally different code paths can result in symptoms that initially appear to be the same but turn out to be completely different under close examination. This is particularly the case for issues fixed in very old versions (sometimes 2 or more years old). The Terraform core team uses a bot that locks issues after only 30 days. But because we typically close issues automatically on PR merge but don't have rolling releases, it'd frequently happen that unrelease fixes will have locked comments, which isn't a good experience either. I've looked through the pace of releases since Nomad 0.9.0 and the longest window between releases was 3 months. Set the window for the lock bot to 120 days to give us plenty of breathing room so it doesn't feel like we're shutting down discussion prematurely.	2022-10-06 16:18:00 -04:00
Michael Schurter	7bbbef9951	docs: clarify nomad vars vs vault (#14831 ) * docs: clarify nomad vars vs vault I think we should make the difference in root key management between Nomad and Vault clear in the concept docs. I didn't see anywhere else in the docs we compared it. I also s/secrets/variables everywhere except the first sentence since the feature is intended to be more generic than secrets. Right now it's more of a compliment to Consul's kv than Vault due to root key handling and featureset. * Update website/content/docs/concepts/variables.mdx Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-10-06 13:17:26 -07:00
HashiBot	eab6bb5e35	website: upgrade next version (#14830 ) Co-authored-by: Bryce Kalow <bkalow@hashicorp.com>	2022-10-06 13:48:11 -05:00
Tim Gross	80ec5e1346	fix panic from keyring raft entries being written during upgrade (#14821 ) During an upgrade to Nomad 1.4.0, if a server running 1.4.0 becomes the leader before one of the 1.3.x servers, the old server will crash because the keyring is initialized and writes a raft entry. Wait until all members are on a version that supports the keyring before initializing it.	2022-10-06 12:47:02 -04:00
Derek Strickland	36c644aaf2	Merge pull request #14828 from hashicorp/post-1.4.0-release Post 1.4.0 release	2022-10-06 09:30:56 -07:00
Derek Strickland	88ddf260da	Merge release 1.4.0 files	2022-10-06 09:24:54 -07:00
hc-github-team-nomad-core	bfd7159f42	Prepare for next release	2022-10-06 09:16:01 -07:00
hc-github-team-nomad-core	4fdcd197c0	Generate files for 1.4.0 release	2022-10-06 09:16:00 -07:00
Luiz Aoqui	f9aeb11183	prepare release 1.4.0	2022-10-06 09:16:00 -07:00
Tim Gross	0cc64da404	docs: 1.4.0 upgrade warning for keyring initialization (#14825 )	2022-10-06 11:32:35 -04:00
Tim Gross	0e1f8cd803	semgrep: add MeasureSinceWithLabels to FSM time rule (#14812 ) Metrics state is local to the server and needs to use time, which is normally forbidden in the FSM code. We have a bypass for this rule for `metrics.MeasureSince` but needed one for `metrics.MeasureSinceWithLabels` as well.	2022-10-06 10:59:53 -04:00
James Rasell	0187240e7c	e2e: fixes the ordering on greater than checks within spread test. (#14818 )	2022-10-06 15:27:36 +02:00
James Rasell	67e8f85360	e2e: fix incorrect must function usage in namespace suite. (#14805 )	2022-10-05 15:50:56 +02:00
Phil Renaud	7313ac2905	Switch to the 'running' green for health checks (#14799 )	2022-10-04 16:59:50 -04:00
Tim Gross	322ecc858f	client: defer `nobody` user lookup so Windows doesn't panic (#14790 ) In #14742 we introduced a cached lookup of the `nobody` user, which is only ever called on Unixish machines. But the initial caching was being done in an `init` block, which meant it was being run on Windows as well. This prevents the Nomad agent from starting on Windows. An alternative fix here would be to have a separate `init` block for Windows and Unix, but this potentially masks incorrect behavior if we accidentally added a call to the `Nobody()` method on Windows later. This way we're forced to handle the error in the caller.	2022-10-04 11:52:12 -04:00
Tim Gross	341dc84a77	variables: use correct URL in ref to docs (#14792 )	2022-10-04 11:30:49 -04:00
Tim Gross	2f3d4f51e6	deps: remove gophers.dev dependency (#14789 )	2022-10-04 09:49:50 -04:00
Tim Gross	a3ff23608c	deps: use install from current HEAD for `hc-install` (#14786 ) The `hc-install` tool we're using needed a patch for a specific bug, but that's since been merged. We definitely want to switch to using a standard release from that project once one is shipped with the CLI, but pinning to HEAD should keep us for now.	2022-10-04 08:22:30 -04:00
Michael Schurter	ed3218c3dd	Fixing flaky TestOverlap test (#14780 ) * test: ensure feasible node selected in overlap test * test: warn when getting close to retry limit	2022-10-03 14:35:02 -07:00
Elijah Voigt	0a80a58394	Docs(job-specification/periodic): Add enabled toggle (#14767 ) This is probably undocumented for a reason, but the `enabled` toggle in the `periodic` stanza is very useful so I figured I try adding it to the docs. The feature has been secretly avaliable since #9142 and was called out in that PR as being a dubious addition, only added to avoid regressions. The use case for disabling a periodic job in this way is to prevent it from running without modifying the schedule. Ideally Nomad would make it more clear that this was the case, and allow you to force a run of the job, but even with those rough edges I think users would benefit from knowing about this toggle.	2022-10-03 15:08:24 -04:00
Tim Gross	2a6e8be6ba	internals documentation with diagrams (#14750 ) This changeset adds new architecture internals documents to the contributing guide. These are intentionally here and not on the public-facing website because the material is not required for operators and includes a lot of diagrams that we can cheaply maintain with mermaid syntax but would involve art assets to have up on the main site that would become quickly out of date as code changes happen and be extremely expensive to maintain. However, these should be suitable to use as points of conversation with expert end users. Included: * A description of Evaluation triggers and expected counts, with examples. * A description of Evaluation states and implicit states. This is taken from an internal document in our team wiki. * A description of how writing the State Store works. This is taken from a diagram I put together a few months ago for internal education purposes. * A description of Evaluation lifecycle, from registration to running Allocations. This is mostly lifted from @lgfa29's amazing mega-diagram, but broken into digestible chunks and without multi-region deployments, which I'd like to cover in a future doc. Also includes adding Deployments to our public-facing glossary. Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2022-10-03 14:06:41 -04:00
Jai	bd8d023ee5	refact: only reload job if job has no taskGroups (#14760 )	2022-09-30 16:14:40 -04:00
dependabot[bot]	9ce74c83e6	build(deps-dev): bump @hashicorp/platform-cli in /website (#14541 ) Bumps [@hashicorp/platform-cli](https://github.com/hashicorp/web-platform-packages/tree/HEAD/packages/cli) from 2.1.0 to 2.3.0. - [Release notes](https://github.com/hashicorp/web-platform-packages/releases) - [Changelog](https://github.com/hashicorp/web-platform-packages/blob/main/packages/cli/CHANGELOG.md) - [Commits](https://github.com/hashicorp/web-platform-packages/commits/@hashicorp/platform-cli@2.3.0/packages/cli) --- updated-dependencies: - dependency-name: "@hashicorp/platform-cli" dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-09-30 14:59:55 -04:00
Luiz Aoqui	b924802958	template: apply splay value on change_mode script (#14749 ) Previously, the splay timeout was only applied if a template re-render caused a restart or a signal action. The `change_mode = "script"` was running after the `if restart \|\| len(signals) != 0` check, so it was invoked at all times. This change refactors the logic so it's easier to notice that new `change_mode` options should start only after `splay` is applied.	2022-09-30 12:04:22 -04:00
Tim Gross	e13ac471fc	Revert removing deprecated client options docs (#14753 ) This reverts PR #12416 and commit 6668ce022ac561f75ad113cc838b1fb786f11f79. While the driver options are well and truly deprecated, this documentation also covers features like `fingerprint.denylist` that are not available any other way. Let's revert this until #12420 is ready.	2022-09-30 08:38:03 -04:00
Phil Renaud	8ac604841a	[ui] Bugfix: reinstate the "this variable will be accessible by $job/$group/$task" notification (#14741 ) * When we isolated the variable form path to within its component for isolation reasons, we lost the model-level checks for related entites at type-time * Be a little more functionally pure * Use Ember.set to appease mirage	2022-09-29 10:40:00 -04:00
Phil Renaud	a200c2f2f2	Fix a bug where we only checked the first task within a given alloc for services (#14740 )	2022-09-29 10:39:42 -04:00
Seth Hoenig	c68ed3b4c8	client: protect user lookups with global lock (#14742 ) * client: protect user lookups with global lock This PR updates Nomad client to always do user lookups while holding a global process lock. This is to prevent concurrency unsafe implementations of NSS, but still enabling NSS lookups of users (i.e. cannot not use osusergo). * cl: add cl	2022-09-29 09:30:13 -05:00
Michael Schurter	0e95fb03c0	test: skip chown test if nonroot (#14738 ) CI always runs this as root, so it worked there and always scared me when I ran it locally.	2022-09-28 14:45:38 -07:00
Derek Strickland	2c4df95e92	Merge pull request #14664 from hashicorp/docs-multiregion-dispatch multiregion: Added a section for multiregion parameterized job dispatch	2022-09-28 15:40:11 -04:00
Derek Strickland	c3d4496287	link from dispatch command	2022-09-28 08:30:22 -04:00
Derek Strickland	8b37e558fb	Apply suggestions from code review	2022-09-28 08:18:56 -04:00
Derek Strickland	fe7d1e08ac	Update website/content/docs/job-specification/multiregion.mdx Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-09-28 07:20:11 -04:00
Derek Strickland	e1dba23ccf	Update website/content/docs/job-specification/multiregion.mdx Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-09-28 07:19:54 -04:00

1 2 3 4 5 ...

23847 commits