The system scheduler should leave allocs on draining nodes as-is, but
stop allocs on nodes that are no longer part of the job's datacenters.
Previously, the scheduler did not make this distinction and left system
job allocs intact if they were already running.
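For illustration, the intended decision looks roughly like this (a minimal sketch with hypothetical names, not the scheduler's actual types or API):
```go
package scheduler

// keepSystemAlloc sketches the rule this change implements: allocs on
// draining nodes are left as-is, while allocs on nodes that are no longer
// in the job's datacenters are stopped. (Illustrative helper only.)
func keepSystemAlloc(nodeDraining bool, nodeDC string, jobDCs []string) bool {
	if nodeDraining {
		return true // leave allocs on draining nodes as-is
	}
	for _, dc := range jobDCs {
		if dc == nodeDC {
			return true // node still belongs to the job's datacenters
		}
	}
	return false // stop: the node is no longer in job.Datacenters
}
```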
I've added a failing test first, which you can see in https://app.circleci.com/jobs/github/hashicorp/nomad/179661 .
Fixes https://github.com/hashicorp/nomad/issues/11373
Pick up https://github.com/golang/snappy/pull/56 to fix panics on arm64 architectures. TL;DR: Go 1.16 changed the `memmove` implementation for arm64 to use additional CPU registers that snappy wasn't preserving in its assembly implementation.
Other projects have experienced this issue as well; searching for `encode_arm64.s:666` on your favorite search engine will reveal some. Vault updated the dependency earlier this August: https://github.com/hashicorp/vault/pull/12371 .
I believe this issue affects Nomad 1.2.x and 1.1.x. Nomad 1.0.x uses Go 1.15 and isn't affected; however, backporting the change to 1.0.x should be harmless.
Fixes https://github.com/hashicorp/nomad/issues/11385 .
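The bump itself is a routine module update, along these lines (the version shown is an assumption; pin whichever snappy release actually contains the fix):
```
go get github.com/golang/snappy@v0.0.4
go mod tidy
```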
Fix a bug where the scheduler may panic when preemption is enabled. The conditions are a bit complicated:
a higher-priority job schedules multiple allocations that preempt multiple other allocations on the same node, due to port/network/device assignments.
The cause of the bug is accidental mutation of internally cached data. `RankedNode` computes and caches the proposed allocations in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L42-L53 . The scheduler then mutates that list to remove preemptable allocs in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L293-L294, and `RemoveAllocs` mutates the cached slice in place, setting its tail to `nil`s and triggering a nil-pointer dereference.
I fixed the issue by avoiding the mutation in `RemoveAllocs`; the micro-optimization there doesn't seem necessary.
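To make the failure mode concrete, here's a minimal, self-contained reproduction of the pattern (the types and function names are illustrative, not Nomad's actual API):
```go
package main

import "fmt"

// Alloc stands in for Nomad's allocation struct.
type Alloc struct{ ID string }

// removeInPlace mimics the old RemoveAllocs behavior: it filters the slice
// in place, reusing the backing array, and nils out the tail. Any other
// slice header that still references the original length now sees nils.
func removeInPlace(allocs []*Alloc, remove map[string]bool) []*Alloc {
	n := 0
	for _, a := range allocs {
		if !remove[a.ID] {
			allocs[n] = a
			n++
		}
	}
	for i := n; i < len(allocs); i++ {
		allocs[i] = nil // the tail of the shared backing array becomes nil
	}
	return allocs[:n]
}

// removeCopy is the shape of the fix: build a fresh slice and leave the
// caller's cached slice untouched.
func removeCopy(allocs []*Alloc, remove map[string]bool) []*Alloc {
	out := make([]*Alloc, 0, len(allocs))
	for _, a := range allocs {
		if !remove[a.ID] {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	cached := []*Alloc{{"a"}, {"b"}, {"c"}} // e.g. the cached proposed allocs
	_ = removeInPlace(cached, map[string]bool{"a": true})

	// cached still has length 3, but its tail has been nil'd out:
	fmt.Println(cached[2] == nil) // true; cached[2].ID would now panic
}
```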
Fixes https://github.com/hashicorp/nomad/issues/11342
Fixes #2522
Skip embedding client.alloc_dir when building chroots. If a user
configures a Nomad client agent so that the chroot_env embeds the
client.alloc_dir, Nomad will happily recurse forever while building
the chroot until something horrible happens. The best-case scenario is
that the filesystem's path length limit is hit; the worst case is that
disk space is exhausted.
A bad agent configuration will look something like this:
```hcl
data_dir = "/tmp/nomad-badagent"

client {
  enabled = true

  chroot_env {
    # Note that the source matches the data_dir
    "/tmp/nomad-badagent" = "/ohno"
    # ...
  }
}
```
Note that `/ohno/client` (the state_dir) will still be created but not
`/ohno/alloc` (the alloc_dir).
While I cannot think of a good reason why someone would want to embed
Nomad's client (and possibly server) directories in chroots, there
should be no cause for harm. Chroots are only built when Nomad runs as
root, and Nomad disables running exec jobs as root by default. Therefore,
even if client state is copied into chroots, it will be inaccessible to
tasks.
Skipping the `data_dir` and `{client,server}.state_dir` is possible, but
this PR attempts to implement the minimum viable solution to reduce the
risk of unintended side effects or bugs.
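The shape of the guard is roughly the following (a sketch under assumed names, not the exact implementation in the allocdir package):
```go
package allocdir

import (
	"io/fs"
	"path/filepath"
)

// copyChrootSrc sketches the guard: while walking a chroot_env source,
// skip the client's alloc_dir subtree so the copy can never descend into
// the directories it is itself creating. (copyEntry is illustrative.)
func copyChrootSrc(src, allocDir string, copyEntry func(string, fs.DirEntry) error) error {
	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() && filepath.Clean(path) == filepath.Clean(allocDir) {
			// Don't recurse into client.alloc_dir.
			return filepath.SkipDir
		}
		return copyEntry(path, d)
	})
}
```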
When running tests as root in a VM without the fix, the following error
occurs:
```
=== RUN TestAllocDir_SkipAllocDir
alloc_dir_test.go:520:
Error Trace: alloc_dir_test.go:520
Error: Received unexpected error:
Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long
Test: TestAllocDir_SkipAllocDir
--- FAIL: TestAllocDir_SkipAllocDir (22.76s)
```
Also removed the unused `Copy` methods on the `AllocDir` and `TaskDir` structs.
Thanks to @eveld for not letting me forget about this!
Meant for development purposes only, so one can compile the binary on a
macOS host and then easily start a Docker container or scp the binary to
a Linux host.
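Concretely, something along these lines (an illustrative invocation; the exact flags and output path are assumptions):
```
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o ./bin/nomad .
```
Disabling cgo is what yields the statically linked, pure-Go build discussed below.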
The resulting binary is statically linked and has some very subtle
differences. For example, static binaries use Go's native network stack,
which honors /etc/hosts and /etc/resolv.conf differently from the glibc
implementation. In a development environment, I don't expect these
differences to materially change our experience.