open-nomad

Author	SHA1	Message	Date
Tim Gross	52e805a6a6	csi: ensure Read/WriteAllocs aren't released early (#7841) We should only remove the `ReadAllocs`/`WriteAllocs` values for a volume after the claim has entered the "ready to free" state. The volume will eventually be released as expected. But querying the volume API will show the volume is released before the controller unpublish has finished and this can cause a race with starting new jobs. Test updates are to cover cases where we're dropping claims but not running through the whole reaping process.	2020-04-30 17:11:31 -04:00
Jasmine Dahilig	a9004faa11	UI: Add representations for task lifecycles (#7659) This adds details about task lifecycles to allocations, task groups, and tasks. It includes a live-updating timeline-like chart on allocations.	2020-04-30 08:15:19 -05:00
Tim Gross	a7a64443e1	csi: move volume claim release into volumewatcher (#7794) This changeset adds a subsystem to run on the leader, similar to the deployment watcher or node drainer. The `Watcher` performs a blocking query on updates to the `CSIVolumes` table and triggers reaping of volume claims. This will avoid tying up scheduling workers by immediately sending volume claim workloads into their own loop, rather than blocking the scheduling workers in the core GC job doing things like talking to CSI controllers. The volume watcher is enabled on leader step-up and disabled on leader step-down. The volume claim GC mechanism now makes an empty claim RPC for the volume to trigger an index bump. That in turn unblocks the blocking query in the volume watcher so it can assess which claims can be released for a volume.	2020-04-30 09:13:00 -04:00
Michael Lange	c3085f04b6	Merge pull request #7820 from hashicorp/b-ui/ui-log-races UI: Log streaming bug fix medley	2020-04-29 18:06:47 -07:00
Michael Lange	21ef3633be	Make the no connection error on the logs page dismissable	2020-04-29 17:36:17 -07:00
Michael Lange	e74cd16252	Fix race condition where stdout and stderr requests can cause a no connection error This would happen because a no connection error happens after the second request fails, but that's because it's assumed the second request is to a server node. However, if a user clicks stderr fast enough, the first and second requests are both to the client node. This changes the logic to check if the request is to the server before deeming log streaming a total failure.	2020-04-29 17:36:17 -07:00
Michael Lange	aafbeaba75	Clicking stdout/stderr when already on that tab is now a noop	2020-04-29 17:36:16 -07:00
Michael Lange	7452a9a57d	Abort log fetch request when failing over from client to server Typically a failover means that the client can't be reached. However, if the client does eventually return after the timeout period, the log will stream indefinitely. This fixes that using an API that wasn't broadly available at the time this was first written.	2020-04-29 17:34:49 -07:00
Michael Lange	9ba563c48e	Always pass credential in fetch requests, but also treat options reasonably Now options can be provided without also having to remember to pass credentials. This is convenient for abort controller signals.	2020-04-29 17:34:49 -07:00
Seth Hoenig	dee7f3ea11	Merge pull request #7828 from hashicorp/b-ec2-speeds env_aws: use best-effort lookup table for CPU performance in EC2	2020-04-29 11:25:54 -06:00
Seth Hoenig	880c4e23d3	env_aws: combine 3 log lines into 1	2020-04-29 10:47:36 -06:00
Seth Hoenig	67303b666c	env_aws: downgrade log line Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>	2020-04-29 10:34:26 -06:00
Seth Hoenig	5ddc607701	env_aws: fixup log line Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>	2020-04-29 10:33:53 -06:00
Tim Gross	e34f099d20	csi: read-repair CSI volume claims (#7824) The `CSIVolumeClaim` fields were added after 0.11.1, so claims made before that may be missing the value. Repair this when we read the volume out of the state store. The `NodeID` field was added after 0.11.0, so we need to ensure it's been populated during upgrades from 0.11.0.	2020-04-29 11:57:19 -04:00
Buck Doyle	d4708860f0	UI: Fix exec popup link for job id ≠ name (#7815) This closes #7814. It makes URL-generation more central and changes the exec URL to include job id instead of name.	2020-04-29 07:54:04 -05:00
Mahmood Ali	0ab0463d20	Merge pull request #7829 from ccn/vendor-go-dockerclient-v1.6.5 Vendor: update fsouza/go-dockerclient to v1.6.5	2020-04-29 08:48:40 -04:00
ccn	889816d65c	Remove unused internal subpackages	2020-04-29 20:21:44 +08:00
ccn	a4c36add17	Vendor: update fsouza/go-dockerclient to v1.6.5	2020-04-29 18:54:55 +08:00
Seth Hoenig	f8596a3602	env_aws: use best-effort lookup table for CPU performance in EC2 Fixes #7681 The current behavior of the CPU fingerprinter in AWS is that it reads the current speed from `/proc/cpuinfo` (`CPU MHz` field). This is because the max CPU frequency is not available by reading anything on the EC2 instance itself. Normally on Linux one would look at e.g. `sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_max_freq` or perhaps parse the values from the `CPU max MHz` field in `/proc/cpuinfo`, but those values are not available. Furthermore, no metadata about the CPU is made available in the EC2 metadata service. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-categories.html Since `go-psutil` cannot determine the max CPU speed it defaults to the current CPU speed, which could be basically any number between 0 and the true max. This is particularly bad on large, powerful reserved instances which often idle at ~800 MHz while Nomad does its fingerprinting (typically IO bound), which Nomad then uses as the max, which results in severe loss of available resources. Since the CPU specification is unavailable programmatically (at least not without sudo) use a best-effort lookup table. This table was generated by going through every instance type in AWS documentation and copy-pasting the numbers. https://aws.amazon.com/ec2/instance-types/ This approach obviously is not ideal as future instance types will need to be added as they are introduced to AWS. However, using the table should only be an improvement over the status quo since right now Nomad miscalculates available CPU resources on all instance types.	2020-04-28 19:01:33 -06:00
Mahmood Ali	18ac17b189	Merge pull request #7827 from hashicorp/deps-go-msgpack-v1.1.5 Harmonize go-msgpack/codec/codecgen	2020-04-28 18:13:09 -04:00
Mahmood Ali	18dba6fdad	Harmonize go-msgpack/codec/codecgen Use v1.1.5 of go-msgpack/codec/codecgen, so go-msgpack codecgen matches the library version. We branched off earlier to pick up `f51b518921`, but apparently that's not needed as we could customize the package via `-c` argument.	2020-04-28 17:12:31 -04:00
Tim Gross	4935b304a0	e2e: add helper to Makefile for local file deployments (#7822)	2020-04-28 16:15:58 -04:00
Lang Martin	e32b5b12dd	command: deployment status without a prefix lists deployments (#7821)	2020-04-28 15:11:32 -04:00
Mahmood Ali	18f16cfb12	Merge pull request #7818 from greut/codegen structs: give codecgen import	2020-04-28 12:16:41 -04:00
Buck Doyle	438aec636a	UI: update exec styles to match conventions (#7811)	2020-04-28 08:33:07 -05:00
Chris Baker	315bcf1060	Merge pull request #7816 from hashicorp/b-7789-job-scaling-status-issues fix issues in Job.ScaleStatus	2020-04-28 06:33:42 -05:00
Yoan Blanc	5ca31f23e5	structs: give codecgen import Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-04-28 08:23:20 +02:00
Nick Ethier	4b810b697a	nomad: build dynamic port for exposed checks if not specified (#7800)	2020-04-28 00:07:41 -04:00
Chris Baker	6e48d73be8	updated changelog	2020-04-27 21:46:56 +00:00
Chris Baker	73f1390316	modified Job.ScaleStatus to ignore deployments and look directly at the allocations, ignoring canaries	2020-04-27 21:45:39 +00:00
Charlie Voiselle	10ed58cee6	Adding API homepage to sidebar.	2020-04-27 13:41:11 -04:00
Charlie Voiselle	59470f4e90	Merge pull request #7801 from hashicorp/d-fix-docker-credhelper-example [docs] Update credential helper example in docker.mdx	2020-04-27 11:44:54 -04:00
Mahmood Ali	57008ce95a	Merge pull request #7809 from greut/typos api: fix some documentation typos	2020-04-27 08:50:25 -04:00
Mahmood Ali	f68bfa9e55	Merge pull request #7805 from hashicorp/vendor-go-metrics-v0.3.3 Vendor: update armon/go-metrics to v0.3.3	2020-04-27 08:49:50 -04:00
Yoan Blanc	417c2995c9	api: fix some documentation typos Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-04-27 10:25:29 +02:00
Mahmood Ali	e4f28e24a3	Vendor: update armon/go-metrics to v0.3.3 To pick up a lock contention fix in prometheus sink: https://github.com/armon/go-metrics/pull/107.	2020-04-26 08:54:50 -04:00
Charlie Voiselle	f1ababc31b	Update docker.mdx	2020-04-24 23:20:02 -04:00
Charlie Voiselle	e5ebce0e6b	Merge pull request #7792 from angrycub/f-disable_dangling_container_gc Disable dangling container GC for demo	2020-04-24 23:12:16 -04:00
Seth Hoenig	4fa0c395df	Merge pull request #7784 from hashicorp/demo-grpc-checks demo: create a demo service for grpc healthchecks	2020-04-24 11:35:58 -06:00
Seth Hoenig	f2ef576510	demo: create a demo service for grpc healthchecks Examples for HTTP based task-group service healthchecks are covered by the `countdash` demo, but gRPC checks currently have no runnable examples. This PR adds a trivial gRPC enabled application that provides a Service implementing the standard gRPC healthcheck interface.	2020-04-24 10:59:50 -06:00
Tim Gross	bad9a82df8	ci: add a linting check for HCL files (#7791) Running `make dev` runs `hclfmt`, but this isn't checked as part of CI. That makes it possible to merge un-formatted HCL and Nomad jobspecs that later will make for dirty git staging areas when developers pull master. This changeset adds HCL linting to the `make check` target.	2020-04-23 14:32:44 -04:00
Charlie Voiselle	14b5a00932	Disable dangling container GC for demo	2020-04-23 11:51:03 -04:00
Tim Gross	083b35d651	csi: checkpoint volume claim garbage collection (#7782) Adds a `CSIVolumeClaim` type to be tracked as current and past claims on a volume. Allows for a client RPC failure during node or controller detachment without having to keep the allocation around after the first garbage collection eval. This changeset lays groundwork for moving the actual detachment RPCs into a volume watching loop outside the GC eval.	2020-04-23 11:06:23 -04:00
Tim Gross	e7e9c83aa3	website: fix path for spellchecking and correct errors (#7790)	2020-04-23 10:38:08 -04:00
Chris Baker	2f7372d29d	Merge pull request #7788 from hashicorp/b-7716-scaling-policy-parsing parsing should error if scaling block includes multiple policy blocks	2020-04-23 08:57:31 -05:00
Chris Baker	beeccc26e4	changelog entries for 7772 and 7788	2020-04-23 12:45:52 +00:00
Chris Baker	8ea4a7e84b	return parsing error if scaling policy includes more than one policy block also, check that parsing a minimal scaling block doesn't throw any errors	2020-04-23 12:37:45 +00:00
Michael Lange	0dac605902	Merge pull request #7689 from hashicorp/ui/plumb-proxy-config-to-proxy UI Plumb proxy config to proxy	2020-04-22 19:31:27 -07:00
Mahmood Ali	018e39b456	Merge pull request #7785 from hashicorp/b-http-fail-log-level http: adjust log level for request failure	2020-04-22 17:03:11 -04:00
Mahmood Ali	b8fb32f5d2	http: adjust log level for request failure Failed requests due to API client errors are to be marked as DEBUG. The Error log level should be reserved for problems with the cluster that are actionable for Nomad system operators. Logs due to misbehaving API clients don't represent a system level problem and seem spurious to Nomad maintainers at best. These log messages can also be attack vectors for denial-of-service attacks by filling servers' disk space with spurious log messages.	2020-04-22 16:19:59 -04:00
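
The `env_aws` commit above (f8596a3602) describes replacing the measured CPU speed with a best-effort lookup table keyed by EC2 instance type. A minimal sketch of that idea, not the code from the commit, might look like the following. The `cpuSpec` type, the `totalMHz` helper, the `example.*` instance names, and all numeric values are hypothetical placeholders chosen for illustration.

```go
package main

import "fmt"

// cpuSpec is a hypothetical record for one instance type.
// The table in the actual commit may be shaped differently.
type cpuSpec struct {
	Cores int
	MHz   float64
}

// ec2CPU is a deliberately tiny, illustrative lookup table.
// The instance names and speeds are placeholders, not AWS figures.
var ec2CPU = map[string]cpuSpec{
	"example.large":  {Cores: 2, MHz: 2500},
	"example.xlarge": {Cores: 4, MHz: 3000},
}

// totalMHz returns the total compute for an instance type.
// If the type is in the table, the table value is used and the second
// return value is true. Otherwise it falls back to the measured
// (possibly idle, therefore too low) speed and returns false.
func totalMHz(instanceType string, measuredMHz float64, measuredCores int) (float64, bool) {
	if spec, ok := ec2CPU[instanceType]; ok {
		return spec.MHz * float64(spec.Cores), true
	}
	return measuredMHz * float64(measuredCores), false
}

func main() {
	// A known type ignores the idle reading of 800 MHz.
	total, fromTable := totalMHz("example.large", 800, 2)
	fmt.Println(total, fromTable)

	// An unknown type keeps the status-quo behavior.
	total, fromTable = totalMHz("unknown.type", 800, 2)
	fmt.Println(total, fromTable)
}
```

The fallback branch reflects the trade-off stated in the commit message: instance types missing from the table behave as before, so the table can only improve on the measured value.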


17990 commits