open-nomad

Author	SHA1	Message	Date
Drew Bailey	1b8af920f3	address pr feedback	2020-01-09 15:15:09 -05:00
Drew Bailey	4ced73875b	leave acl checking to rpc endpoints fix test expectation test wrapNonJSON	2020-01-09 15:15:08 -05:00
Drew Bailey	279512c7f8	provide helpful error, cleanup logic	2020-01-09 15:15:08 -05:00
Drew Bailey	7bbba613a5	prevent doubly wrapping with rpc error	2020-01-09 15:15:07 -05:00
Drew Bailey	fd42020ad6	RPC server EnableDebug option Passes in agent enable_debug config to nomad server and client configs. This allows for rpc endpoints to have more granular control if they should be enabled or not in combination with ACLs. enable debug on client test	2020-01-09 15:15:07 -05:00
Drew Bailey	9a80938fb1	region forwarding; prevent recursive forwards for impossible requests prevent region forwarding loop, backfill tests fix failing test	2020-01-09 15:15:06 -05:00
Drew Bailey	46121fe3fd	move shared structs out of client and into nomad	2020-01-09 15:15:05 -05:00
Drew Bailey	3672414888	test pprof headers and profile methods tidy up, add comments clean up seconds param assignment	2020-01-09 15:15:04 -05:00
Drew Bailey	fc37448683	warn when enabled debug is on when registering m -> a receiver name return codederrors, fix query	2020-01-09 15:15:04 -05:00
Drew Bailey	62eb2d76a6	acl and debug test table rename implementation method	2020-01-09 15:15:03 -05:00
Drew Bailey	50288461c9	Server request forwarding for Agent.Profile Return rpc errors for profile requests, set up remote forwarding to target leader or server id for profile requests. server forwarding, endpoint tests	2020-01-09 15:15:03 -05:00
Drew Bailey	901f362858	test for known pprof endpoints	2020-01-09 15:15:02 -05:00
Drew Bailey	49ad5fbc85	agent pprof endpoints wip, agent endpoint and client endpoint for pprof profiles agent endpoint test	2020-01-09 15:15:02 -05:00
Mahmood Ali	a2e181dd45	CLI: protect against AllocatedResources being nil	2020-01-08 17:22:05 -05:00
Charlie Voiselle	5298fee5d6	Typo fix Synopsis needs to start with uppercase to match other commands	2020-01-08 10:44:00 -05:00
James Rasell	f2d1e45135	cli: include namespace in output when querying job stauts. (#6912 )	2020-01-08 08:24:03 -05:00
Michael Schurter	571ed261c8	Merge pull request #6898 from hashicorp/hicks/fix-typo Fix typo, Ethier -> Either	2020-01-02 14:52:18 -08:00
Kris Hicks	7fef7508cb	Fix typo, Ethier -> Either	2020-01-02 14:42:27 -08:00
Charlie Voiselle	fd3bf5f971	cli: Allow user to specify dest filename for nomad init (#6520 ) * Allow user to specify dest filename for nomad init * Create changelog entry for GH-6520	2019-12-19 14:59:12 -05:00
Drew Bailey	8e59e91991	Merge pull request #6746 from hashicorp/f-shutdown-delay-tg Group shutdown_delay	2019-12-18 16:01:30 -05:00
Lang Martin	06f441f562	test: quota: relax multierror message matching to Contains	2019-12-17 13:20:14 -05:00
Lang Martin	fb6c27b828	test: build quota_apply_test, remove the tests that require ent	2019-12-17 13:20:14 -05:00
Drew Bailey	d9e41d2880	docs for shutdown delay update docs, address pr comments ensure pointer is not nil use pointer for diff tests, set vs unset	2019-12-16 11:38:35 -05:00
Drew Bailey	24929776a2	shutdown delay for task groups copy struct values ensure groupserviceHook implements RunnerPreKillhook run deregister first test that shutdown times are delayed move magic number into variable	2019-12-16 11:38:16 -05:00
Mahmood Ali	76be9b4afb	cli: sequence cli.Ui operations Fixes a bug where if a command flag parsing errors, the resulting error and help usage messages get interleaved in unexpected and non-user friendly way. The reason is that we have flag parsing library effectively writes to ui.Error in a goroutine. This is problematic: first, we lose the sequencing between help usage and error message; second, cli.Ui methods are not concurrent safe. Here, we introduce a custom error writer that buffers result and calls ui.Error() in the write method and in the same goroutine. For context, we need to wrap ui.Error because it's line-oriented, while flags library expects a io.Writer which is bytes oriented.	2019-12-16 10:08:17 -05:00
Danielle	246a4e898b	Merge pull request #6828 from hashicorp/b/nomad-monitor-panic command: error when no node is found for `monitor`	2019-12-10 14:29:32 +01:00
Danielle Lancashire	cd764ab0e9	command: error when no node is found for `monitor` Currently `nomad monitor -node-id` will panic when a node-id does not match any nodes, as there is no empty result bounds checking. Here we return an error to the user when no nodes are found.	2019-12-10 13:10:47 +01:00
Seth Hoenig	f0c3dca49c	tests: swap lib/freeport for tweaked helper/freeport Copy the updated version of freeport (sdk/freeport), and tweak it for use in Nomad tests. This means staying below port 10000 to avoid conflicts with the lib/freeport that is still transitively used by the old version of consul that we vendor. Also provide implementations to find ephemeral ports of macOS and Windows environments. Ports acquired through freeport are supposed to be returned to freeport, which this change now also introduces. Many tests are modified to include calls to a cleanup function for Server objects. This should help quite a bit with some flakey tests, but not all of them. Our port problems will not go away completely until we upgrade our vendor version of consul. With Go modules, we'll probably do a 'replace' to swap out other copies of freeport with the one now in 'nomad/helper/freeport'.	2019-12-09 08:37:32 -06:00
Michael Schurter	3008473f9b	Merge branch 'master' into release-0102	2019-12-04 14:13:34 -08:00
Mahmood Ali	7b8cfee162	tests: deflake TestHTTP_FreshClientAllocMetrics The test asserts that alloc counts get reported accurately in metrics by inspecting the metrics endpoint directly. Sadly, the metrics as collected by `armon/go-metrics` seem to be stateful and may contain info from other tests. This means that the test can fail depending on the order of returned metrics. Inspecting the metrics output of one failing run, you can see the duplicate guage entries but for different node_ids: ``` { "Name": "service-name.default-0a3ba4b6-2109-485e-be74-6864228aed3d.client.allocations.terminal", "Value": 10, "Labels": { "datacenter": "dc1", "node_class": "none", "node_id": "67402bf4-00f3-bd8d-9fa8-f4d1924a892a" } }, { "Name": "service-name.default-0a3ba4b6-2109-485e-be74-6864228aed3d.client.allocations.terminal", "Value": 0, "Labels": { "datacenter": "dc1", "node_class": "none", "node_id": "a2945b48-7e66-68e2-c922-49b20dd4e20c" } }, ```	2019-11-22 18:41:21 -05:00
Nomad Release bot	db6420367d	Generate files for 0.10.2-rc1 release	2019-11-22 18:42:49 +00:00
Drew Bailey	b45ce9e997	add server-id to -h output	2019-11-21 16:04:28 -05:00
Drew Bailey	b3765b06ea	add server-id to -h output	2019-11-21 16:01:09 -05:00
Drew Bailey	6d5156bbba	Allows a node uuid prefix to be passed in	2019-11-21 15:15:41 -05:00
Drew Bailey	7ca6dbe61e	Allows a node uuid prefix to be passed in	2019-11-21 14:51:48 -05:00
Lang Martin	069e9a624b	command: quota init writes files with a network limit	2019-11-20 17:59:55 -06:00
Lang Martin	d2fc279af4	command: quota status reports network usage	2019-11-20 17:59:34 -06:00
Lang Martin	f45bebdb66	command: quota init writes files with a network limit	2019-11-20 18:44:06 -05:00
Lang Martin	2e2c662977	command: quota status reports network usage	2019-11-20 18:44:06 -05:00
Michael Schurter	48239d7f2e	Merge pull request #6017 from hashicorp/f-policy-json api: Add parsed rules to policy response	2019-11-20 15:31:03 -08:00
Mahmood Ali	be6c60455e	Merge pull request #6669 from hashicorp/b-cors-allow-credentials Allow UI to query client directly for task logs/state	2019-11-20 15:14:01 -05:00
Buck Doyle	db77a24ed3	Merge branch 'master' into f-policy-json	2019-11-20 11:20:07 -06:00
Michael Schurter	ecf970b5a5	Merge pull request #6370 from pmcatominey/tls-server-name command: add -tls-server-name flag	2019-11-20 08:44:54 -08:00
Preetha	42c1c85285	Merge pull request #6421 from hashicorp/b-acl-bootstrap-codes api: acl bootstrap errors aren't 500	2019-11-20 10:36:08 -06:00
Preetha	be4a51d5b8	Merge pull request #6349 from hashicorp/b-host-stats client: Return empty values when host stats fail	2019-11-20 10:13:02 -06:00
Buck Doyle	7e3188a4ea	CLI: Remove duplicated error output (#6738 )	2019-11-19 16:05:53 -06:00
Mahmood Ali	97974c4b13	Merge pull request #6684 from hashicorp/b-nomad-exec-stdout-tty nomad exec: check stdout for tty as well	2019-11-19 15:55:21 -05:00
Mahmood Ali	6f8bb5e90b	api: acl bootstrap errors aren't 500 Noticed that ACL endpoints return 500 status code for user errors. This is confusing and can lead to false monitoring alerts. Here, I introduce a concept of RPCCoded errors to be returned by RPC that signal a code in addition to error message. Codes for now match HTTP codes to ease reasoning. ``` $ nomad acl bootstrap Error bootstrapping: Unexpected response code: 500 (ACL bootstrap already done (reset index: 9)) $ nomad acl bootstrap Error bootstrapping: Unexpected response code: 400 (ACL bootstrap already done (reset index: 9)) ```	2019-11-19 15:51:57 -05:00
Tim Gross	1210261fe2	hclfmt nomad jobspecs (#6724 )	2019-11-19 10:36:41 -05:00
Nick Ethier	bd454a4c6f	client: improve group service stanza interpolation and check_re… (#6586 ) * client: improve group service stanza interpolation and check_restart support Interpolation can now be done on group service stanzas. Note that some task runtime specific information that was previously available when the service was registered poststart of a task is no longer available. The check_restart stanza for checks defined on group services will now properly restart the allocation upon check failures if configured.	2019-11-18 13:04:01 -05:00
Drew Bailey	9b63828658	serverID to target remote leader or server handle the case where we request a server-id which is this current server update docs, error on node and server id params more accurate names for tests use shared no leader err, formatting rm bad comment remove redundant variable	2019-11-14 10:07:35 -05:00
Drew Bailey	b644e1f47d	add server-id to monitor specific server	2019-11-14 09:53:41 -05:00
Drew Bailey	acd97d0731	Merge pull request #6670 from hashicorp/api/fallthrough-test test rootfallthrough handler	2019-11-13 10:51:31 -05:00
Lars Lehtonen	1dbf44bc40	command/agent: Prune Dead Code (#6682 ) * remove unused MockPeriodicJob() from tests * remove unused getIndex() from tests * remove unused checkIndex() from tests * remove unused assertIndex() from tests * remove unused Agent.findLoopbackDevice()	2019-11-13 08:20:01 -05:00
Lars Lehtonen	e85509c466	command: error handling before file close (#6681 )	2019-11-13 08:18:20 -05:00
Drew Bailey	f5310ff63f	fix so assertions are test case driven	2019-11-12 14:28:21 -05:00
Mahmood Ali	591cb75ee4	nomad exec: check stdout for tty as well When inferring whether to use TTY, check both stdin and stdout are terminals. Otherwise, we get failures like the following: ``` $ nomad alloc exec --job example echo hi hi $ echo \| nomad alloc exec --job example echo hi hi $ nomad alloc exec --job example echo hi \| head -n1 failed to exec into task: not a terminal ```	2019-11-12 11:39:06 -05:00
Lars Lehtonen	98d3e47b32	command: fix TestHelpers_LineLimitReader_TimeLimit() goroutine (#6678 )	2019-11-12 08:35:11 -05:00
Charlie Voiselle	835831a3d8	Added service wrapper code (#6220 ) This is the basic code to add the Windows Service Manager hooks to Nomad. Includes vendoring golang.org/x/sys/windows/svc and added Docs: * guide for installing as a windows service. * configuration for logging to file from PR #6429	2019-11-11 15:16:07 -05:00
Drew Bailey	f989f38594	test /ui/ path	2019-11-11 12:12:42 -05:00
Drew Bailey	a0548824f3	test rootfallthrough handler	2019-11-11 12:08:44 -05:00
Mahmood Ali	b2145f2d02	Allow UI to query client directly Nomad web UI currently fails when querying client nodes for allocation state end endpoints, due to CORS policy. The issue is that CORS requests that are marked `withCredentials` need the http server to include a `Access-Control-Allow-Credentials` [1]. But Nomad Task Logs and filesystem requests include authenticating information and thus marked with `credentials=true`[2][3]. It's worth noting that the browser currently sends credentials and authentication token to servers anyway; it's just that the response is not made available to caller nomad ui javascript. For task logs specifically, nomad ui retries again by querying the web ui address (typically pointing to a nomad server) which will forward the request to the nomad client agent appropriately. [1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Credentials [2] `101d0373ee/ui/app/components/task-log.js (L50)` [3] `101d0373ee/ui/app/services/token.js (L25-L39)`	2019-11-11 15:13:30 +00:00
Lars Lehtonen	08d5342812	command/agent: TestAgent_ServerConfig() fix dropped errors (#6659 )	2019-11-11 09:46:46 -05:00
Drew Bailey	04439a5a78	better func name, swap conditional	2019-11-11 08:35:56 -05:00
Drew Bailey	c85df2dac7	returns a 404 if not found instead of redirect to ui	2019-11-08 15:34:35 -05:00
Tim Gross	4909adb32c	fix broken test expectation from message change (#6635 )	2019-11-06 16:33:13 -05:00
Drew Bailey	7b2ad28ef6	unlock before returning, no need for label comment, trigger build return length written	2019-11-05 11:44:29 -05:00
Drew Bailey	d3b48a3e45	simplify logch goroutine	2019-11-05 11:44:28 -05:00
Drew Bailey	df57f70a68	wireup plain=true\|false query param	2019-11-05 11:44:28 -05:00
Drew Bailey	f4a7e3dc75	coordinate closing of doneCh, use interface to simplify callers comments	2019-11-05 11:44:26 -05:00
Drew Bailey	fe542680dc	log-json -> json fix typo command/agent/monitor/monitor.go Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com> Update command/agent/monitor/monitor.go Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com> address feedback, lock to prevent send on closed channel fix lock/unlock for dropped messages	2019-11-05 09:51:59 -05:00
Drew Bailey	84c8e79f90	simplify assert message	2019-11-05 09:51:56 -05:00
Drew Bailey	8726b685de	address feedback	2019-11-05 09:51:56 -05:00
Drew Bailey	e4b3e1d7d4	allow more time for streaming message remove unused struct	2019-11-05 09:51:55 -05:00
Drew Bailey	318b6c91bf	monitor command takes no args rm extra new line fix lint errors return after close fix, simplify test	2019-11-05 09:51:55 -05:00
Drew Bailey	0e759c401c	moving endpoints over to frames	2019-11-05 09:51:54 -05:00
Drew Bailey	c7b633b6c1	lock in sub select rm redundant lock wip to use framing wip switch to stream frames	2019-11-05 09:51:54 -05:00
Drew Bailey	fb23c1325d	fix deadlock issue, switch to frames envelope	2019-11-05 09:51:54 -05:00
Drew Bailey	32f62edbb0	return 400 if invalid log_json param is given Addresses feedback around monitor implementation subselect on stopCh to prevent blocking forever. Set up a separate goroutine to check every 3 seconds for dropped messages. rename returned ch to avoid confusion	2019-11-05 09:51:53 -05:00
Drew Bailey	17d876d5ef	rename function, initialize log level better underscores instead of dashes for query params	2019-11-05 09:51:53 -05:00
Drew Bailey	8e3915c7fc	use channel instead of empty string to determine close	2019-11-05 09:51:52 -05:00
Drew Bailey	da6229d704	update go-hclog dep remove duplicate lock	2019-11-05 09:51:52 -05:00
Drew Bailey	db65b1f4a5	agent:read acl policy for monitor	2019-11-05 09:51:52 -05:00
Drew Bailey	f46fd5b3e1	only look up rpchandler for node if we have nodeid fix some comments and nomad monitor -h output	2019-11-05 09:51:51 -05:00
Drew Bailey	3b9c33a5f0	new hclog with standardlogger intercept	2019-11-05 09:51:49 -05:00
Drew Bailey	a45ae1cd58	enable json formatting, use queryoptions	2019-11-05 09:51:49 -05:00
Drew Bailey	786989dbe3	New monitor pkg for shared monitor functionality Adds new package that can be used by client and server RPC endpoints to facilitate monitoring based off of a logger clean up old code small comment about write rm old comment about minsize rename to Monitor Removes connection logic from monitor command Keep connection logic in endpoints, use a channel to send results from monitoring use new multisink logger and interfaces small test for dropped messages update go-hclogger and update sink/intercept logger interfaces	2019-11-05 09:51:49 -05:00
Drew Bailey	e076204820	get local rpc endpoint working	2019-11-05 09:51:48 -05:00
Drew Bailey	976c43157c	remove log_writer prefix output with proper spacing update gzip handler, adjust first byte flow to allow gzip handler bypass wip, first stab at wiring up rpc endpoint	2019-11-05 09:51:48 -05:00
Drew Bailey	0de94466b2	Display error when remote side ended monitor multisink logger remove usage of logwriter	2019-11-05 09:51:48 -05:00
Drew Bailey	f60e44afc7	Adds nomad monitor command Adds nomad monitor command. Like consul monitor, this command allows you to stream logs from a nomad agent in real time with a a specified log level add endpoint tests Upgrade go-hclog to latest version The current version of go-hclog pads log prefixes to equal lengths so info becomes [INFO ] and debug becomes [DEBUG]. This breaks hashicorp/logutils/level.go Check function. Upgrading to the latest version removes this padding and fixes log filtering that uses logutils Check	2019-11-05 09:51:47 -05:00
Drew Bailey	b386119d15	Add Agent Monitor to receive streaming logs Queries /v1/agent/monitor and receives streaming logs from client	2019-11-05 09:51:47 -05:00
Drew Bailey	b0184e2032	Adds AgentMonitor Endpoint AgentMonitor is an endpoint to stream logs for a given agent. It allows callers to pass in a supplied log level, which may be different than the agents config allowing for temporary debugging with lower log levels. Pass in logWriter when setting up Agent	2019-11-05 09:51:46 -05:00
Drew Bailey	3a11f1f23a	Merge pull request #6609 from hashicorp/b-alloc-status-consistency Prevent nomad alloc status output inconsistency	2019-11-04 10:12:04 -05:00
Drew Bailey	a7adc54235	Prevent nomad alloc status output inconsistency Prevent random map ordering and sort alphabetically better variable name	2019-11-01 14:01:32 -04:00
Michael Schurter	9fed8d1bed	client: fix panic from 0.8 -> 0.10 upgrade makeAllocTaskServices did not do a nil check on AllocatedResources which causes a panic when upgrading directly from 0.8 to 0.10. While skipping 0.9 is not supported we intend to fix serious crashers caused by such upgrades to prevent cluster outages. I did a quick audit of the client package and everywhere else that accesses AllocatedResources appears to be properly guarded by a nil check.	2019-11-01 07:47:03 -07:00
Mahmood Ali	3f6e50617a	Merge pull request #6047 from hashicorp/b-ignore-server-if-disabled Only warn against BootstrapExpect set in CLI flag	2019-10-29 10:55:44 -04:00
Lang Martin	aa77ea4032	quota: parse network stanza in quotas (#6511 )	2019-10-24 10:41:54 -04:00
Michael Schurter	39437a5c5b	Merge branch 'master' into release-0100	2019-10-22 08:17:57 -07:00
Nomad Release bot	3e6c9dd40e	Generate files for 0.10.0 release	2019-10-22 12:34:56 +00:00
Seth Hoenig	8b03477f46	Merge pull request #6448 from hashicorp/f-set-connect-sidecar-tags connect: enable setting tags on consul connect sidecar service in job…	2019-10-17 15:14:09 -05:00
Seth Hoenig	039fbd3f3b	connect: enable setting tags on consul connect sidecar service in jobspec (#6415 )	2019-10-17 19:25:20 +00:00
Mahmood Ali	61e66cb077	Merge pull request #6427 from hashicorp/b-fs-endpoint-errors agent: report fs log errors as http errors	2019-10-15 20:12:59 -04:00
Mahmood Ali	88f8127820	tests: avoid using unnecessary pipe	2019-10-15 17:22:03 -04:00
Mahmood Ali	e6d5635e1a	Merge pull request #6425 from hashicorp/f-cli-show-full-ids cli: show full id for single node or alloc status	2019-10-15 10:54:25 -04:00
Danielle	fee482ae6c	Merge pull request #6331 from hashicorp/dani/f-volume-mount-propagation volumes: Add support for mount propagation	2019-10-14 14:29:40 +02:00
Danielle Lancashire	4fbcc668d0	volumes: Add support for mount propagation This commit introduces support for configuring mount propagation when mounting volumes with the `volume_mount` stanza on Linux targets. Similar to Kubernetes, we expose 3 options for configuring mount propagation: - private, which is equivalent to `rprivate` on Linux, which does not allow the container to see any new nested mounts after the chroot was created. - host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts that have been created _outside of the container_ to be visible inside the container after the chroot is created. - bidirectional, which is equivalent to `rshared` on Linux, which allows both the container to see new mounts created on the host, but importantly _allows the container to create mounts that are visible in other containers an don the host_ private and host-to-task are safe, but bidirectional mounts can be dangerous, as if the code inside a container creates a mount, and does not clean it up before tearing down the container, it can cause bad things to happen inside the kernel. To add a layer of safety here, we require that the user has ReadWrite permissions on the volume before allowing bidirectional mounts, as a defense in depth / validation case, although creating mounts should also require a priviliged execution environment inside the container.	2019-10-14 14:09:58 +02:00
Danielle	2640155ae5	Merge pull request #6429 from hashicorp/f-log-to-file Add support for logging to a file	2019-10-11 13:35:39 +02:00
Nomad Release bot	3007f1662e	Generate files for 0.10.0-rc1 release	2019-10-10 19:08:23 +00:00
Danielle Lancashire	5cedf6d024	logging: Correctly track number of written bytes Currently this assumes that a short write will never happen. While these are improbable in a case where rotation being off a few bytes would matter, this now correctly tracks the number of written bytes.	2019-10-10 14:02:14 +02:00
Danielle Lancashire	b67215d4f8	logging: Sort files when pruning old logs Currently this logging implementation is dependent on the order of files as returned by filepath.Glob, which although internal methods are documented to be lexographical, does not publicly document this. Here we defensively resort.	2019-10-10 13:51:16 +02:00
Mahmood Ali	4b2ba62e35	acl: check ACL against object namespace Fix a bug where a millicious user can access or manipulate an alloc in a namespace they don't have access to. The allocation endpoints perform ACL checks against the request namespace, not the allocation namespace, and performs the allocation lookup independently from namespaces. Here, we check that the requested can access the alloc namespace regardless of the declared request namespace. Ideally, we'd enforce that the declared request namespace matches the actual allocation namespace. Unfortunately, we haven't documented alloc endpoints as namespaced functions; we suspect starting to enforce this will be very disruptive and inappropriate for a nomad point release. As such, we maintain current behavior that doesn't require passing the proper namespace in request. A future major release may start enforcing checking declared namespace.	2019-10-08 12:59:22 -04:00
Mahmood Ali	3c0d8c7611	Merge pull request #6441 from hashicorp/b-agent-token Redact replication tokens in /agent/self	2019-10-08 12:55:45 -04:00
Danielle Lancashire	9eaac48f25	agent: Refactor log setup to support log-to-file	2019-10-07 14:42:32 +02:00
Danielle Lancashire	442f4888b3	agent: Introduce File Logger This commit introduces a rotating file logger for Nomad Agent Logs. The logger implementation itself is a lift and shift from Consul, with tests updated to fit with the Nomad pattern of using require, and not having a testutil for creating tempdirs cleanly.	2019-10-07 14:37:31 +02:00
Danielle Lancashire	d3614ea0a8	config: Add required configuration for logging to a file	2019-10-07 14:16:59 +02:00
Mahmood Ali	d09355efe4	cli: show full id for single node or alloc status Show full ID on individual alloc or node status views. Shortening the ID isn't very helpful in these cases, and makes looking up the full id slightly more complicated when user needs to interact with API. List views are unmodified and show short id unless `-vebose` flag is passed. Before ``` $ nomad node status -self \| head -n2 ID = 21fc51f9 Name = mars-2.local $ nomad alloc status 15ae54cd \| head -n3 ID = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3 Eval ID = a6b15f86 Name = example.cache[0] ``` After: ``` $ nomad node status -self \| head -n2 ID = 21fc51f9-fd39-0fa0-fb41-f34c7aa36101 Name = mars-2.local $ nomad alloc status 15ae54cd \| head -n3 ID = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3 Eval ID = a6b15f86-ca8e-e536-b544-4bfb43137ff3 Name = example.cache[0] ```	2019-10-04 16:36:18 -04:00
Mahmood Ali	317e0f9e44	agent: report fs log errors as http errors This fixes two bugs: First, FS Logs API endpoint only propagated error back to user if it was encoded with code, which isn't common. Other errors get suppressed and callers get an empty response with 200 error code. Now, these endpoints return a 500 status code along with the error message. Before ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout"; echo * Trying 127.0.0.1... * TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 200 OK < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:47:21 GMT < Content-Length: 0 < * Connection #0 to host 127.0.0.1 left intact ``` After ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout"; echo * Trying 127.0.0.1... * TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 500 Internal Server Error < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:48:12 GMT < Content-Length: 60 < Content-Type: text/plain; charset=utf-8 < * Connection #0 to host 127.0.0.1 left intact alloc lookup failed: index error: UUID must be 36 characters ``` Second, we return 400 status code for request validation errors. Before ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo * Trying 127.0.0.1... * TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 500 Internal Server Error < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:47:29 GMT < Content-Length: 22 < Content-Type: text/plain; charset=utf-8 < * Connection #0 to host 127.0.0.1 left intact must provide task name ``` After ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo * Trying 127.0.0.1... * TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 400 Bad Request < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:49:18 GMT < Content-Length: 22 < Content-Type: text/plain; charset=utf-8 < * Connection #0 to host 127.0.0.1 left intact must provide task name ```	2019-10-04 16:33:58 -04:00
Lang Martin	fb41dd86ba	default raft protocol v2	2019-09-24 14:37:55 -04:00
Peter McAtominey	de133d883f	command: add -tls-server-name flag	2019-09-24 09:20:41 -07:00
Tim Gross	cd9c23617f	client/connect: ConsulProxy LocalServicePort/Address (#6358 ) Without a `LocalServicePort`, Connect services will try to use the mapped port even when delivering traffic locally. A user can override this behavior by pinning the port value in the `service` stanza but this prevents us from using the Consul service name to reach the service. This commits configures the Consul proxy with its `LocalServicePort` and `LocalServiceAddress` fields.	2019-09-23 14:30:48 -04:00
Danielle Lancashire	39fe07f66b	api: Redact tokens in /agent/self	2019-09-23 19:07:27 +02:00
Danielle Lancashire	8b44369073	api: Redact ACL Replication Token Currently when hitting the /v1/agent/self API with ACL Replication enabled results in the token being returned in the API. This commit redacts that information, as it should be treated as a shared secret.	2019-09-22 14:35:53 +02:00
Chris Baker	6f38cca15a	fixed incorrect CLI documentation in `job deployments` listed `-all-allocs` instead of `-all`	2019-09-20 12:24:53 -05:00
Danielle Lancashire	e81d113e3f	command: Improve metrics fail logging	2019-09-19 04:17:42 +02:00
Mahmood Ali	b4a7585e5e	Merge pull request #6328 from hashicorp/b-gh-6269 cli: emit job version number proper	2019-09-17 19:06:44 -04:00
Tim Gross	e3e30c15a9	remove resolved TODO from UpdateTTL docstring (#6336 )	2019-09-16 16:26:06 -04:00
Mahmood Ali	df8a168d06	cli: emit job version number proper We must emit alloc job number rather than its the field address.	2019-09-13 19:04:32 -04:00
Danielle Lancashire	78b61de45f	config: Hoist volume.config.source into volume Currently, using a Volume in a job uses the following configuration: ``` volume "alias-name" { type = "volume-type" read_only = true config { source = "host_volume_name" } } ``` This commit migrates to the following: ``` volume "alias-name" { type = "volume-type" source = "host_volume_name" read_only = true } ``` The original design was based due to being uncertain about the future of storage plugins, and to allow maxium flexibility. However, this causes a few issues, namely: - We frequently need to parse this configuration during submission, scheduling, and mounting - It complicates the configuration from and end users perspective - It complicates the ability to do validation As we understand the problem space of CSI a little more, it has become clear that we won't need the `source` to be in config, as it will be used in the majority of cases: - Host Volumes: Always need a source - Preallocated CSI Volumes: Always needs a source from a volume or claim name - Dynamic Persistent CSI Volumes: Always needs a source to attach the volumes to for managing upgrades and to avoid dangling. - Dynamic Ephemeral CSI Volumes: Less thought out, but `source` will probably point to the plugin name, and a `config` block will allow you to pass meta to the plugin. Or will point to a pre-configured ephemeral config. *If implemented The new design simplifies this by merging the source into the volume stanza to solve the above issues with usability, performance, and error handling.	2019-09-13 04:37:59 +02:00
Mahmood Ali	877260afd8	fix 'nomad namespace apply' help Named arguments need to preceed positional arguments.	2019-09-09 10:04:41 -07:00
Nomad Release bot	dc7d728a82	Generate files for 0.10.0-beta1 release	2019-09-06 18:47:09 +00:00
Michael Schurter	31eb8375e5	Merge pull request #6282 from hashicorp/f-connect-dev-path connect: check if consul is on PATH	2019-09-05 12:25:23 -07:00
Michael Schurter	457684e34e	connect: check if consul is on PATH Only in -dev-connect mode for now since its valid to install Consul after Nomad has started in production.	2019-09-05 12:05:42 -07:00
Jasmine Dahilig	e1c73cdab5	add validation for job_gc_interval (#6277 )	2019-09-05 11:20:46 -07:00
Mahmood Ali	6d73ca0cfb	Merge pull request #6250 from hashicorp/f-raft-protocol-v3 Update default raft protocol to version 3	2019-09-04 09:34:41 -04:00
Tim Gross	0f29dcc935	support script checks for task group services (#6197 ) In Nomad prior to Consul Connect, all Consul checks work the same except for Script checks. Because the Task being checked is running in its own container namespaces, the check is executed by Nomad in the Task's context. If the Script check passes, Nomad uses the TTL check feature of Consul to update the check status. This means in order to run a Script check, we need to know what Task to execute it in. To support Consul Connect, we need Group Services, and these need to be registered in Consul along with their checks. We could push the Service down into the Task, but this doesn't work if someone wants to associate a service with a task's ports, but do script checks in another task in the allocation. Because Nomad is handling the Script check and not Consul anyways, this moves the script check handling into the task runner so that the task runner can own the script check's configuration and lifecycle. This will allow us to pass the group service check configuration down into a task without associating the service itself with the task. When tasks are checked for script checks, we walk back through their task group to see if there are script checks associated with the task. If so, we'll spin off script check tasklets for them. The group-level service and any restart behaviors it needs are entirely encapsulated within the group service hook.	2019-09-03 15:09:04 -04:00
Buck Doyle	21ec6a237c	Merge branch 'master' into f-policy-json # Conflicts: # CHANGELOG.md	2019-09-03 09:56:25 -05:00
Jasmine Dahilig	4edebe389a	add default update stanza and max_parallel=0 disables deployments (#6191 )	2019-09-02 10:30:09 -07:00
Evan Ercolano	fcf66918d0	Remove unused canary param from MakeTaskServiceID	2019-08-31 16:53:23 -04:00
Michael Schurter	4bd53deba9	Merge pull request #6236 from hashicorp/b-ignore-connect-services consul: ignore connect services when syncing	2019-08-30 13:11:09 -07:00
Michael Schurter	67b7bc1e90	consul: ignore connect services when syncing Consul registers Connect services automatically, however Nomad thinks it owns them due to the _nomad prefix. Since the services are managed by Consul, Nomad needs to explicitly ignore them or otherwies they will be removed.	2019-08-30 11:53:41 -07:00
Tim Gross	b79021adfd	cli: split -dev and -dev-connect flags	2019-08-30 09:33:30 -04:00
Buck Doyle	ab96785fc9	Change test to use valid HCL for rules	2019-08-29 16:09:02 -05:00
Mahmood Ali	6eabf53b91	Default raft protocol to version 3	2019-08-28 15:56:59 -04:00
Nick Ethier	9e96971a75	cli: display group ports and address in alloc status command output (#6189 ) * cli: display group ports and address in alloc status command output * add assertions for port.To = -1 case and convert assertions to testify	2019-08-27 23:59:36 -04:00
Jasmine Dahilig	ffceab0879	remove network stanza from job init --short example jobspec (#6179 )	2019-08-27 07:36:32 -07:00
Tim Gross	11030f7aa0	init: add generated assets into bindata	2019-08-26 14:24:15 -04:00
Tim Gross	4d4461d1f5	agent: -dev=connect mode bind to 0.0.0.0 The dev mode flag for connect was binding to the default interface's IP, but this makes for a bad user experience for the CLI which will default to 127.0.0.1. If we bind to 0.0.0.0 instead the CLI will work without further configuration by the user.	2019-08-23 13:51:16 -04:00
Jerome Gravel-Niquet	cbdc1978bf	Consul service meta (#6193 ) * adds meta object to service in job spec, sends it to consul * adds tests for service meta * fix tests * adds docs * better hashing for service meta, use helper for copying meta when registering service * tried to be DRY, but looks like it would be more work to use the helper function	2019-08-23 12:49:02 -04:00
Michael Schurter	95b8048553	Merge pull request #6121 from hashicorp/f-connect-bootstrap connect: task hook for bootstrapping envoy sidecar	2019-08-22 10:58:31 -07:00
Michael Schurter	59e0b67c7f	connect: task hook for bootstrapping envoy sidecar Fixes #6041 Unlike all other Consul operations, boostrapping requires Consul be available. This PR tries Consul 3 times with a backoff to account for the group services being asynchronously registered with Consul.	2019-08-22 08:15:32 -07:00
Danielle Lancashire	2e5f28029f	remove hidden field from host volumes We're not shipping support for "hidden" volumes in 0.10 any more, I'll convert this to an issue+mini RFC for future enhancement.	2019-08-22 08:48:05 +02:00
Danielle	c280e97619	Merge pull request #6184 from hashicorp/dani/fix-api api: Fix definition of HostVolumeInfo	2019-08-22 00:13:28 +02:00
Danielle Lancashire	112b986736	api: Fix definition of HostVolumeInfo	2019-08-21 22:34:41 +02:00
Danielle Lancashire	9df7e0eb72	clientconfig: Fix parsing multiple host volumes	2019-08-21 22:19:58 +02:00
Michael Schurter	050cc32fde	Merge pull request #6157 from hashicorp/f-connect-register Register connect enabled group services with Consul	2019-08-20 14:45:38 -07:00
Michael Schurter	b008fd1724	connect: register group services with Consul Fixes #6042 Add new task group service hook for registering group services like Connect-enabled services. Does not yet support checks.	2019-08-20 12:25:10 -07:00
Tim Gross	c404491f1f	test: require root for linux devmode test	2019-08-20 13:31:49 -04:00
Tim Gross	a0e923f46c	add optional task field to group service checks	2019-08-20 09:35:31 -04:00
Nick Ethier	24f5a4c276	sidecar_task override in connect admission controller (#6140 ) * structs: use seperate SidecarTask struct for sidecar_task stanza and add merge * nomad: merge SidecarTask into proxy task during connect Mutate hook	2019-08-20 01:22:46 -04:00
Tim Gross	2ab004d971	command: add `-connect` flag to job init Adds an example job for Consul Connect integration as well as an annotated example job.	2019-08-19 14:43:04 -04:00
Tim Gross	2a592a2e0c	agent: add optional param to -dev flag for connect (#6126 ) Consul Connect must route traffic between network namespaces through a public interface (i.e. not localhost). In order to support testing in dev mode, users needed to manually set the interface which doesn't make for a smooth experience. This commit adds a facility for adding optional parameters to the `nomad agent -dev` flag and uses it to add a `-dev=connect` flag that binds to a public interface on the host.	2019-08-14 15:29:37 -04:00
Tim Gross	13376cff9c	move `nomad init` outputs to go-bindata assets	2019-08-14 14:10:23 -04:00
Preetha	8c6312d973	Merge pull request #6097 from hashicorp/f-kind-validate Add validation for kind field if it is a consul connect proxy	2019-08-13 11:05:30 -05:00
Preetha Appan	72e45dd01e	More code review feedback	2019-08-12 17:41:40 -05:00
Tim Gross	03433f35d4	client/template: configuration for function blacklist and sandboxing When rendering a task template, the `plugin` function is no longer permitted by default and will raise an error. An operator can opt-in to permitting this function with the new `template.function_blacklist` field in the client configuration. When rendering a task template, path parameters for the `file` function will be treated as relative to the task directory by default. Relative paths or symlinks that point outside the task directory will raise an error. An operator can opt-out of this protection with the new `template.disable_file_sandbox` field in the client configuration.	2019-08-12 16:34:48 -04:00
Preetha Appan	35506c516d	Improve validation logic and add table driven tests	2019-08-12 14:39:50 -05:00
Danielle Lancashire	7208a7ab88	command: Cleanup node-status	2019-08-12 15:39:09 +02:00
Danielle Lancashire	333fdd723b	cli: Display host volume info in nomad node status	2019-08-12 15:39:09 +02:00
Danielle Lancashire	861caa9564	HostVolumeConfig: Source -> Path	2019-08-12 15:39:08 +02:00
Danielle Lancashire	e132a30899	structs: Unify Volume and VolumeRequest	2019-08-12 15:39:08 +02:00
Danielle Lancashire	01f3fe13fb	api: Allow submission of jobs with volumes	2019-08-12 15:39:08 +02:00
Danielle Lancashire	063e4240c1	client: Add parsing and registration of HostVolume configuration	2019-08-12 15:39:08 +02:00
Nick Ethier	1871c1edbc	Add sidecar_task stanza parsing (#6104 ) * jobspec: breakup parse.go into smaller files * add sidecar_task parsing to jobspec and api * jobspec: combine service parsing logic for task and group service stanzas * api: use slice of ConsulUpstream values instead of pointers	2019-08-09 15:18:53 -04:00
Preetha	1d543290af	Merge pull request #6090 from hashicorp/f-task-kind Add field "kind" to task for use in connect tasks	2019-08-08 14:40:12 -05:00
Nick Ethier	7806f4c597	Revert "client: add autofetch for CNI plugins" This reverts commit 0bd157cc3b04fb090dd0d54affcae71496102ce8.	2019-08-08 15:10:19 -04:00
Preetha Appan	a393ea79e8	Add field "kind" to task for use in connect tasks	2019-08-07 18:43:36 -05:00
Jasmine Dahilig	8d980edd2e	add create and modify timestamps to evaluations (#5881 )	2019-08-07 09:50:35 -07:00
Michael Schurter	d2862b33e6	Merge pull request #6045 from hashicorp/f-connect-groupservice consul: add Connect structs	2019-08-06 15:43:38 -07:00
Michael Schurter	17fd82d6ad	consul: add Connect structs Refactor all Consul structs into {api,structs}/services.go because api/tasks.go didn't make sense anymore and structs/structs.go is gigantic.	2019-08-06 08:15:07 -07:00
Jasmine Dahilig	ac488bc9dc	job region defaults to client node region if 'global' or none provided (#6064 )	2019-08-05 14:28:02 -07:00
Tim Gross	443ce3a831	api: add follow param to file stream endpoint (#6049 ) The `/v1/client/fs/stream endpoint` supports tailing a file by writing chunks out as they come in. But not all browsers support streams (ex IE11) so we need to be able to tail a file without streaming. The fs stream and logs endpoint use the same implementation for filesystem streaming under the hood, but the fs stream always passes the `follow` parameter set to true. This adds the same toggle to the fs stream endpoint that we have for logs. It defaults to true for backwards compatibility.	2019-08-01 08:32:43 -04:00
Mahmood Ali	31ad8161ab	Only warn against BootstrapExpect set in CLI flag If server.enabled is false, we ought to ignore all other values in the server stanza. However, I opted to preserve current error when `--bootstrap-expect` is passed to the CLI when server is not enabled, to maintain current behavior.	2019-07-31 03:19:15 -05:00
Nick Ethier	7de0bec8ab	client/cni: updated comments and simplified logic to auto download plugins	2019-07-31 01:04:10 -04:00
Nick Ethier	b16640c50d	Apply suggestions from code review Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>	2019-07-31 01:04:10 -04:00
Nick Ethier	af6b191963	client: add autofetch for CNI plugins	2019-07-31 01:04:09 -04:00
Nick Ethier	ef83f0831b	ar: plumb client config for networking into the network hook	2019-07-31 01:04:06 -04:00
Michael Schurter	fb487358fb	connect: add group.service stanza support	2019-07-31 01:04:05 -04:00
Nick Ethier	6537279686	agent: simplify if block	2019-07-31 01:03:17 -04:00
Nick Ethier	8650429e38	Add network stanza to group Adds a network stanza and additional options to the task group level in prep for allowing shared networking between tasks of an alloc.	2019-07-31 01:03:12 -04:00
Michael Schurter	d31488e262	Merge pull request #5978 from pete-woods/configurable-job-gc-interval command/agent: allow the job GC interval to be configured	2019-07-30 15:54:29 -07:00
Nomad Release bot	e39fb11531	Generate files for 0.9.4 release	2019-07-30 19:05:18 +00:00
Pete Woods	b47c5ca467	Allow the job GC interval to be configured from default of 5 minutes	2019-07-26 10:11:25 +01:00
Danielle	45f3f928f5	Merge pull request #5996 from hashicorp/f-reload-log-level Support for hot reloading log levels	2019-07-24 13:54:04 +02:00
Danielle Lancashire	0422f1b0c2	Support for hot reloading log levels	2019-07-24 13:37:08 +02:00
Nomad Release bot	04187c8b86	Generate files for 0.9.4-rc1 release	2019-07-22 21:42:36 +00:00
Danielle Lancashire	d454dab39b	chore: Format hcl configurations	2019-07-20 16:55:07 +02:00
Michael Schurter	db4de5fae9	Merge pull request #5975 from hashicorp/b-check-watcher-deadlock consul: fix deadlock in check-based restarts	2019-07-18 13:13:40 -07:00
Michael Schurter	6d095b3b36	consul: add test for check watcher deadlock	2019-07-18 08:24:09 -07:00
Michael Schurter	826d2503e6	Update command/agent/consul/check_watcher.go Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>	2019-07-18 07:08:27 -07:00
Michael Schurter	5407584bc3	consul: fix deadlock in check-based restarts Fixes #5395 Alternative to #5957 Make task restarting asynchronous when handling check-based restarts. This matches the pre-0.9 behavior where TaskRunner.Restart was an asynchronous signal. The check-based restarting code was not designed to handle blocking in TaskRunner.Restart. 0.9 made it reentrant and could easily overwhelm the buffered update chan and deadlock. Many thanks to @byronwolfman for his excellent debugging, PR, and reproducer! I created this alternative as changing the functionality of TaskRunner.Restart has a much larger impact. This approach reverts to old known-good behavior and minimizes the number of places changes are made.	2019-07-17 15:22:21 -07:00
Chris Baker	8a75afcb39	Merge pull request #5870 from hashicorp/b-nmd-1529-alloc-stop-missing-header api: return X-Nomad-Index header on allocation stop	2019-07-17 13:25:17 -04:00
Mahmood Ali	5d09b04f69	Merge pull request #5837 from hashicorp/b-consul-restore-sync-2 Avoid de-registering slowly restored services	2019-07-17 12:02:24 +08:00
Mahmood Ali	ec7e258d71	address review feedback	2019-07-17 10:43:13 +07:00
Eli Shvartsman	692fd19884	take NodeID from url in api for node eligibility	2019-07-15 18:34:53 +03:00
Preetha	5b83cd4ce0	Merge pull request #5894 from hashicorp/f-remove-deprecated-code Remove deprecated code	2019-07-02 09:29:24 -05:00
Preetha Appan	aa2b4b4e00	Undo removal of node drain compat changes Decided to remove that in 0.10	2019-07-01 15:12:01 -05:00
Preetha Appan	3345ce3ba4	Infer content type in alloc fs stat endpoint	2019-06-28 20:31:28 -05:00
Preetha Appan	f6fc5d40d1	one more drain test	2019-06-26 17:33:51 -05:00
Preetha Appan	67bf66efc6	remove now unneeded test	2019-06-26 16:59:23 -05:00
Preetha Appan	10e7d6df6d	Remove compat code associated with many previous versions of nomad This removes compat code for namespaces (0.7), Drain(0.8) and other older features from releases older than Nomad 0.7	2019-06-25 19:05:25 -05:00
Chris Baker	3429cf39ed	api: return X-Nomad-Index header on allocation stop	2019-06-21 16:20:06 +00:00
Chris Baker	59fac48d92	alloc lifecycle: 404 when attempting to stop non-existent allocation	2019-06-20 21:27:22 +00:00
Mahmood Ali	b209584dce	Merge pull request #5726 from hashicorp/b-plugins-via-init Use init() to handle plugin invocation	2019-06-18 21:09:03 -04:00
Mahmood Ali	e07413c420	Avoid de-registering slowly restored services When a nomad client restarts/upgraded, nomad restores state from running task and starts the sync loop. If sync loop runs early, it may deregister services from Consul prematurely even when Consul has the running service as healthy. This is not ideal, as re-registering the service means potentially waiting a whole service health check interval before declaring the service healthy. We attempt to mitigate this by introducing an initialization probation period. During this time, we only deregister services and checks that were explicitly deregistered, and leave unrecognized ones alone. This serves as a grace period for restoring to complete, or for operators to restore should they recognize they restored with the wrong nomad data directory.	2019-06-14 11:15:21 -04:00
Mahmood Ali	962921f86c	Use init to handle plugin invocation Currently, nomad "plugin" processes (e.g. executor, logmon, docker_logger) are started as CLI commands to be handled by command CLI framework. Plugin launchers use `discover.NomadBinary()` to identify the binary and start it. This has few downsides: The trivial one is that when running tests, one must re-compile the nomad binary as the tests need to invoke the nomad executable to start plugin. This is frequently overlooked, resulting in puzzlement. The more significant issue with `executor` in particular is in relation to external driver: * Plugin must identify the path of invoking nomad binary, which is not trivial; `discvoer.NomadBinary()` now returns the path to the plugin rather than to nomad, preventing external drivers from launching executors. * The external driver may get a different version of executor than it expects (specially if we make a binary incompatible change in future). This commit addresses both downside by having the plugin invocation handling through an `init()` call, similar to how libcontainer init handler is done in [1] and recommened by libcontainer [2]. `init()` will be invoked and handled properly in tests and external drivers. For external drivers, this change will cause external drivers to launch the executor that's compiled against. There a are a couple of downsides to this approach: * These specific packages (i.e executor, logmon, and dockerlog) need to be careful in use of `init()`, package initializers. Must avoid having command execution rely on any other init in the package. I prefixed files with `z_` (golang processes files in lexical order), but ensured we don't depend on order. * The command handling is spread in multiple packages making it a bit less obvious how plugin starts are handled. [1] drivers/shared/executor/libcontainer_nsenter_linux.go [2] `eb4aeed24f/libcontainer (using-libcontainer)`	2019-06-13 16:48:01 -04:00
Jasmine Dahilig	ed9740db10	Merge pull request #5664 from hashicorp/f-http-hcl-region backfill region from hcl for jobUpdate and jobPlan	2019-06-13 12:25:01 -07:00
Jasmine Dahilig	51e141be7a	backfill region from job hcl in jobUpdate and jobPlan endpoints - updated region in job metadata that gets persisted to nomad datastore - fixed many unrelated unit tests that used an invalid region value (they previously passed because hcl wasn't getting picked up and the job would default to global region)	2019-06-13 08:03:16 -07:00
Danielle	b7fc81031b	Merge pull request #5829 from hashicorp/dani/b-5819 consul: Include port-label in service registration	2019-06-13 16:20:45 +02:00
Danielle Lancashire	8112177503	consul: Include port-label in service registration It is possible to provide multiple identically named services with different port assignments in a Nomad configuration. We introduced a regression when migrating to stable service identifiers where multiple services with the same name would conflict, and the last definition would take precedence. This commit includes the port label in the stable service identifier to allow the previous behaviour where this was supported, for example providing: ```hcl service { name = "redis-cache" tags = ["global", "cache"] port = "db" check { name = "alive" type = "tcp" interval = "10s" timeout = "2s" } } service { name = "redis-cache" tags = ["global", "foo"] port = "foo" check { name = "alive" type = "tcp" port = "db" interval = "10s" timeout = "2s" } } service { name = "redis-cache" tags = ["global", "bar"] port = "bar" check { name = "alive" type = "tcp" port = "db" interval = "10s" timeout = "2s" } } ``` in a nomad task definition is now completely valid. Each service definition with the same name must still have a unique port label however.	2019-06-13 15:24:54 +02:00
Nick Ethier	1b7fa4fe29	Optional Consul service tags for nomad server and agent services (#5706 ) Optional Consul service tags for nomad server and agent services	2019-06-13 09:00:35 -04:00
Preetha	8a98817fe4	Merge pull request #5820 from hashicorp/r-assorted-changes-20190612_1 Assorted minor changes	2019-06-12 10:33:16 -05:00
Danielle Lancashire	ae8bb7365a	alloc-lifecycle: Fix restart with empty body Currently when you submit a manual request to the alloc lifecycle API with a version of Curl that will submit empty bodies, the alloc restart api will fail with an EOF error. This behaviour is undesired, as it is reasonable to not submit a body at all when restarting an entire allocation rather than an individual task. This fixes it by ignoring EOF (not unexpected EOF) errors and treating them as entire task restarts.	2019-06-12 15:35:00 +02:00
Mahmood Ali	b00d1f1e10	tests: parsing dir should be equivalent to parsing individual files	2019-06-12 08:19:09 -04:00
Mahmood Ali	3d8f2622e9	tests: avoid manipulating package variables	2019-06-12 08:16:32 -04:00
Lang Martin	3837c9b021	command add comments re: defaults to LoadConfig	2019-06-11 22:35:43 -04:00
Lang Martin	02aae678be	config_parse_test update comment for accuracy	2019-06-11 22:30:20 -04:00
Lang Martin	7aa95ebd6f	config_parse get rid of ParseConfigDefault	2019-06-11 22:00:23 -04:00
Lang Martin	9b0411af6a	Revert "config explicitly merge defaults once when using a config directory" This reverts commit 006a9a1d454739eee21b7d8abb8b7aef1353b648.	2019-06-11 22:00:23 -04:00
Lang Martin	1e2f87a11e	agent/testdata add a configuration directory for testing	2019-06-11 16:34:04 -04:00
Lang Martin	fe8a4781d8	config merge maintains *HCL string fields used for duration conversion	2019-06-11 16:34:04 -04:00
Lang Martin	3bd153690b	config_parse_test, handle defaults	2019-06-11 16:34:04 -04:00
Lang Martin	c97dd512f4	config explicitly merge defaults once when using a config directory	2019-06-11 15:42:27 -04:00
Lang Martin	ad56434472	config_parse split out defaults from ParseConfig	2019-06-11 15:42:27 -04:00
Lang Martin	28cf8eddfe	config parse_test check for string coercion in client.meta	2019-06-10 13:12:38 -04:00
Michael Schurter	073893f529	nomad: disable service+batch preemption by default Enterprise only. Disable preemption for service and batch jobs by default. Maintain backward compatibility in a x.y.Z release. Consider switching the default for new clusters in the future.	2019-06-04 15:54:50 -07:00
Mahmood Ali	a9f81f2daa	client config flag to disable remote exec This exposes a client flag to disable nomad remote exec support in environments where access to tasks ought to be restricted. I used `disable_remote_exec` client flag that defaults to allowing remote exec. Opted for a client config that can be used to disable remote exec globally, or to a subset of the cluster if necessary.	2019-06-03 15:31:39 -04:00
Nomad Release bot	6d6bc59732	Generate files for 0.9.2-rc1 release	2019-05-22 19:29:30 +00:00
Lang Martin	16cd0beb9b	api use job.update as the default for taskgroup.update	2019-05-22 12:34:57 -04:00
Lang Martin	b5fd735960	add update AutoPromote bool	2019-05-22 12:32:08 -04:00
Mahmood Ali	f5a4fcac3f	Restore tty start before emitting errors Otherwise, the error message appears indented unexpectedly.	2019-05-17 11:58:31 -04:00
Mahmood Ali	1293a8511c	Fix typos and comments Co-Authored-By: Michael Schurter <michael.schurter@gmail.com>	2019-05-16 17:06:03 -04:00
Mahmood Ali	689453bd3a	Implement escaping chrarcter for alloc exec	2019-05-16 16:22:52 -04:00
Preetha	2dcd4291f8	Merge pull request #5702 from hashicorp/f-filter-by-create-index Filter deployments by create index	2019-05-15 21:50:41 -05:00
Preetha Appan	2c5c16111e	Add -all to help text and flags	2019-05-15 21:16:57 -05:00
Mahmood Ali	bfd229918a	fix typo	2019-05-15 13:01:05 -04:00
Mahmood Ali	c057c6dc44	Merge pull request #5633 from hashicorp/f-nomad-exec-parts-02-cli nomad exec part 2: CLI	2019-05-15 12:50:42 -04:00
Mahmood Ali	778c7a1982	Handle Terminal Output state in Windows	2019-05-15 10:37:37 -04:00
Mahmood Ali	1104827671	Add clarifying comments for negating `-i` or `-t`	2019-05-15 10:35:12 -04:00
Preetha Appan	4f9c8ea068	Fix one more test set up	2019-05-14 16:13:41 -05:00
Nick Ethier	ade97bc91f	fixup #5172 and rebase against master	2019-05-14 14:37:34 -04:00
Nick Ethier	cab6a95668	Merge branch 'master' into pr/5172 * master: (912 commits) Update redirects.txt Added redirect for Spark guide link client: log when server list changes docs: mention regression in task config validation fix update to changelog update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665 typo: "atleast" -> "at least" implement nomad exec for rkt docs: fixed typo use pty/tty terminology similar to github.com/kr/pty vendor github.com/kr/pty drivers: implement streaming exec for executor based drivers executors: implement streaming exec executor: scaffolding for executor grpc handling client: expose allocated memory per task client improve a comment in updateNetworks stalebot: Add 'thinking' as an exempt label (#5684) Added Sparrow link update links to use new canonical location Add redirects for restructing done in GH-5667 ...	2019-05-14 14:10:33 -04:00
Preetha Appan	4d3f74e161	Fix test setup to have correct jobcreateindex for deployments	2019-05-13 18:53:47 -05:00
Preetha Appan	07690d6f9e	Add flag similar to --all for allocs to be able to filter deployments by latest	2019-05-13 18:33:41 -05:00
Mahmood Ali	2ddc39973d	Merge pull request #5668 from hashicorp/flaky-test-20190430 fix flaky test by allowing for call invocation overhead	2019-05-13 12:33:44 -04:00
Mahmood Ali	dd8762e348	typo: "atleast" -> "at least"	2019-05-13 10:01:19 -04:00
Mahmood Ali	513303347c	add CLI commands for nomad exec	2019-05-12 22:04:50 -04:00
Mahmood Ali	919827f2df	Merge pull request #5632 from hashicorp/f-nomad-exec-parts-01-base nomad exec part 1: plumbing and docker driver	2019-05-09 18:09:27 -04:00
Mahmood Ali	66982a1660	agent: add websocket handler for nomad exec This adds a websocket endpoint for handling `nomad exec`. The endpoint is a websocket interface, as we require a bi-directional streaming (to handle both input and output), which is not very appropriate for plain HTTP 1.0. Using websocket makes implementing the web ui a bit simpler. I considered using golang http hijack capability to treat http request as a plain connection, but the web interface would be too complicated potentially. Furthermore, the API endpoint operates against the raw core nomad exec streaming datastructures, defined in protobuf, with json serializer. Our APIs use json interfaces in general, and protobuf generates json friendly golang structs. Reusing the structs here simplify interface and reduce conversion overhead.	2019-05-09 16:49:08 -04:00
Danielle	4a22fa0ee2	Merge pull request #5536 from hashicorp/dani/consul Consul Catalog Integration Fixes	2019-05-09 13:22:54 +02:00
Danielle Lancashire	0da2924b2a	consul: Document example check id	2019-05-09 13:22:22 +02:00
Mahmood Ali	d405fcb093	fix flaky test by allowing for call invocation overhead	2019-05-08 18:04:37 -04:00
Preetha	1538913a2a	Merge pull request #5628 from hashicorp/f-preemption-config Add config to disable preemption for batch/service jobs	2019-05-06 15:40:35 -05:00
Lang Martin	9f3f11df97	Merge pull request #5601 from hashicorp/b-config-parse-direct-hcl config parse direct hcl	2019-05-06 12:05:19 -04:00
Preetha Appan	ad3c263d3f	Rename to match system scheduler config. Also added docs	2019-05-03 14:06:12 -05:00
Danielle Lancashire	d824e00d1a	consul: Do not deregister external checks This commit causes sync to skip deregistering checks that are not managed by nomad, such as service maintenance mode checks. This is handled in the same way as service registrations - by doing a Nomad specific prefix match.	2019-05-02 16:54:18 +02:00
Danielle Lancashire	0b8e85118e	consul: Use a stable identifier for services The current implementation of Service Registration uses a hash of the nomad-internal state of a service to register it with Consul, this means that any update to the service invalidates this name and we then deregister, and recreate the service in Consul. While this behaviour slightly simplifies reasoning about service registration, this becomes problematic when we add consul health checks to a service. When the service is re-registered, so are the checks, which default to failing for at least one check period. This commit migrates us to using a stable identifier based on the allocation, task, and service identifiers, and uses the difference between the remote and local state to decide when to push updates. It uses the existing hashing mechanic to decide when UpdateTask should regenerate service registrations for providing to Sync, but this should be removable as part of a future refactor. It additionally introduces the _nomad-check- prefix for check definitions, to allow for future allowing of consul features like maintenance mode.	2019-05-02 16:54:18 +02:00
Chris Baker	a40477a7b8	test case for 5540 (#5590 ) * client/metrics: modified metrics to use (updated) client copy of allocation instead of (unupdated) server copy * updated armon/go-metrics to address race condition in DisplayMetrics	2019-04-30 10:31:35 -04:00
Lang Martin	2e643d26a2	config_parse leave the *HCL strings in place after converting times	2019-04-30 10:30:53 -04:00
Lang Martin	3ba6095fe3	config_parse_test additional config confirmation w/ sample json	2019-04-30 10:30:53 -04:00
Lang Martin	fe9b31dcf9	config comment for future changes	2019-04-30 10:30:53 -04:00
Lang Martin	598112a1cc	tag HCL bookkeeping keys with json:"-" to keep them out of the api	2019-04-30 10:29:14 -04:00
Lang Martin	43407cffe3	config_parse_test remove redundant parse direct test	2019-04-30 10:29:14 -04:00
Lang Martin	b8e9c35cd0	config_parse remove unused multi-stage parsing via mapstructure	2019-04-30 10:29:14 -04:00
Lang Martin	1f86770456	config_parse_test test direct hcl parsing	2019-04-30 10:29:14 -04:00
Lang Martin	5ebae65d1a	agent/config, config/* mapstructure tags -> hcl tags	2019-04-30 10:29:14 -04:00
Lang Martin	92fd988c9f	config_parse add new ParseConfigFileDirectHCL - parse by using hcl.Decode directly - handle time.Duration strings in a second pass - report unexpected keys in a third pass	2019-04-30 10:29:14 -04:00
Preetha Appan	6615d5c868	Add config to disable preemption for batch/service jobs	2019-04-29 18:48:07 -05:00
Danielle Lancashire	a8880f9643	alloc_signal: Add autcompletion and cmd tests	2019-04-26 12:47:53 +02:00
Danielle Lancashire	3409e0be89	allocs: Add nomad alloc signal command This command will be used to send a signal to either a single task within an allocation, or all of the tasks if <task-name> is omitted. If the sent signal terminates the allocation, it will be treated as if the allocation has crashed, rather than as if it was operator-terminated. Signal validation is currently handled by the driver itself and nomad does not attempt to restrict or validate them.	2019-04-25 12:43:32 +02:00
Mahmood Ali	60ee243149	fix crash when executor parent nomad process dies Fixes https://github.com/hashicorp/nomad/issues/5593 Executor seems to die unexpectedly after nomad agent dies or is restarted. The crash seems to occur at the first log message after the nomad agent dies. To ease debugging we forward executor log messages to executor.log as well as to Stderr. `go-plugin` sets up plugins with Stderr pointing to a pipe being read by plugin client, the nomad agent in our case[1]. When the nomad agent dies, the pipe is closed, and any subsequent executor logs fail with ErrClosedPipe and SIGPIPE signal. SIGPIPE results into executor process dying. I considered adding a handler to ignore SIGPIPE, but hc-log library currently panics when logging write operation fails[2] This we opt to revert to v0.8 behavior of exclusively writing logs to executor.log, while we investigate alternative options. [1] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-plugin/client.go#L528-L535 [2] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-hclog/int.go#L320-L323	2019-04-23 09:52:46 -04:00
Danielle	198a838b61	Merge pull request #5512 from hashicorp/dani/f-alloc-stop alloc-lifecycle: nomad alloc stop	2019-04-23 13:05:08 +02:00
Danielle Lancashire	832f607433	allocs: Add nomad alloc stop This adds a `nomad alloc stop` command that can be used to stop and force migrate an allocation to a different node. This is built on top of the AllocUpdateDesiredTransitionRequest and explicitly limits the scope of access to that transition to expose it under the alloc-lifecycle ACL. The API returns the follow up eval that can be used as part of monitoring in the CLI or parsed and used in an external tool.	2019-04-23 12:50:23 +02:00
Michael Schurter	373748a327	Merge pull request #5486 from hashicorp/b-validate-migrate api: fix migrate stanza initialization	2019-04-15 09:44:59 -07:00
Chris Baker	3b9237de4a	gofmt/goimport and test formatting	2019-04-12 20:55:55 +00:00
Chris Baker	eca8a3d537	changes to appease gofmt	2019-04-12 19:12:42 +00:00
Chris Baker	b52d1c9274	cli: add support for periodic force evaluation resolves #3251	2019-04-12 18:56:35 +00:00
Chris Baker	5a43f10aaf	cli: add `acl token list` command, documentation docs: fix some incorrect acl policy docs (typos, copy-paste errors)	2019-04-12 15:48:36 +00:00
Michael Schurter	5e8e59eefb	api: fix migrate stanza initialization Fixes Migrate to be initialized like RescheduleStrategy. Fixes #5477	2019-04-11 15:29:19 -07:00
Danielle Lancashire	e135876493	allocs: Add nomad alloc restart This adds a `nomad alloc restart` command and api that allows a job operator with the alloc-lifecycle acl to perform an in-place restart of a Nomad allocation, or a given subtask.	2019-04-11 14:25:49 +02:00
Danielle	35f66d901f	Merge pull request #5516 from hashicorp/dani/f-verbose-status Allow passing -verbose to meta status	2019-04-11 13:31:48 +02:00
Danielle Lancashire	b4547a34b0	status: Allow passing -verbose to meta status A common issue when using nomad is needing to add in the object verb to a command to include the `-verbose` flag. This commit allows users to pass `-verbose` via the `nomad status` alias by adding a placeholder boolean in the metacommand which allows subcommands to parse the flag.	2019-04-11 13:15:44 +02:00
Chris Baker	ce0c330c7c	agent config: cleaner VAULT_ env lookup	2019-04-10 10:34:10 -05:00
Chris Baker	a26d4fe1e5	docs: -vault-namespace, VAULT_NAMESPACE, and config agent: added VAULT_NAMESPACE env-based configuration	2019-04-10 10:34:10 -05:00
Chris Baker	d3041cdb17	wip: added config parsing support, CLI flag, still need more testing, VAULT_ var, documentation	2019-04-10 10:34:10 -05:00
Chris Baker	6a2454f56d	"job revert" command: alphabetized flags	2019-04-10 10:34:10 -05:00
Chris Baker	2f4d8d0a2f	cli: plumbed vault token from job revert command through API call	2019-04-10 10:34:10 -05:00
Arshneet Singh	2b50b5499d	Remove redundant assertion and replace regex matches with require	2019-04-10 10:34:10 -05:00
Arshneet Singh	1272fcb9e1	Don't display node name if output isn't verbose. Add tests.	2019-04-10 10:34:10 -05:00
James Rasell	9470507cf4	Add NodeName to the alloc/job status outputs. Currently when operators need to log onto a machine where an alloc is running they will need to perform both an alloc/job status call and then a call to discover the node name from the node list. This updates both the job status and alloc status output to include the node name within the information to make operator use easier. Closes #2359 Cloess #1180	2019-04-10 10:34:10 -05:00

... 4 5 6 7 8 ...

2699 commits