open-nomad

Author	SHA1	Message	Date
Nick Ethier	b16640c50d	Apply suggestions from code review Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>	2019-07-31 01:04:10 -04:00
Nick Ethier	af6b191963	client: add autofetch for CNI plugins	2019-07-31 01:04:09 -04:00
Nick Ethier	ef83f0831b	ar: plumb client config for networking into the network hook	2019-07-31 01:04:06 -04:00
Michael Schurter	fb487358fb	connect: add group.service stanza support	2019-07-31 01:04:05 -04:00
Nick Ethier	6537279686	agent: simplify if block	2019-07-31 01:03:17 -04:00
Nick Ethier	8650429e38	Add network stanza to group Adds a network stanza and additional options to the task group level in prep for allowing shared networking between tasks of an alloc.	2019-07-31 01:03:12 -04:00
Michael Schurter	d31488e262	Merge pull request #5978 from pete-woods/configurable-job-gc-interval command/agent: allow the job GC interval to be configured	2019-07-30 15:54:29 -07:00
Nomad Release bot	e39fb11531	Generate files for 0.9.4 release	2019-07-30 19:05:18 +00:00
Pete Woods	b47c5ca467	Allow the job GC interval to be configured from default of 5 minutes	2019-07-26 10:11:25 +01:00
Danielle	45f3f928f5	Merge pull request #5996 from hashicorp/f-reload-log-level Support for hot reloading log levels	2019-07-24 13:54:04 +02:00
Danielle Lancashire	0422f1b0c2	Support for hot reloading log levels	2019-07-24 13:37:08 +02:00
Nomad Release bot	04187c8b86	Generate files for 0.9.4-rc1 release	2019-07-22 21:42:36 +00:00
Danielle Lancashire	d454dab39b	chore: Format hcl configurations	2019-07-20 16:55:07 +02:00
Michael Schurter	db4de5fae9	Merge pull request #5975 from hashicorp/b-check-watcher-deadlock consul: fix deadlock in check-based restarts	2019-07-18 13:13:40 -07:00
Michael Schurter	6d095b3b36	consul: add test for check watcher deadlock	2019-07-18 08:24:09 -07:00
Michael Schurter	826d2503e6	Update command/agent/consul/check_watcher.go Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>	2019-07-18 07:08:27 -07:00
Michael Schurter	5407584bc3	consul: fix deadlock in check-based restarts Fixes #5395 Alternative to #5957 Make task restarting asynchronous when handling check-based restarts. This matches the pre-0.9 behavior where TaskRunner.Restart was an asynchronous signal. The check-based restarting code was not designed to handle blocking in TaskRunner.Restart. 0.9 made it reentrant and could easily overwhelm the buffered update chan and deadlock. Many thanks to @byronwolfman for his excellent debugging, PR, and reproducer! I created this alternative as changing the functionality of TaskRunner.Restart has a much larger impact. This approach reverts to old known-good behavior and minimizes the number of places changes are made.	2019-07-17 15:22:21 -07:00
Chris Baker	8a75afcb39	Merge pull request #5870 from hashicorp/b-nmd-1529-alloc-stop-missing-header api: return X-Nomad-Index header on allocation stop	2019-07-17 13:25:17 -04:00
Mahmood Ali	5d09b04f69	Merge pull request #5837 from hashicorp/b-consul-restore-sync-2 Avoid de-registering slowly restored services	2019-07-17 12:02:24 +08:00
Mahmood Ali	ec7e258d71	address review feedback	2019-07-17 10:43:13 +07:00
Eli Shvartsman	692fd19884	take NodeID from url in api for node eligibility	2019-07-15 18:34:53 +03:00
Preetha	5b83cd4ce0	Merge pull request #5894 from hashicorp/f-remove-deprecated-code Remove deprecated code	2019-07-02 09:29:24 -05:00
Preetha Appan	aa2b4b4e00	Undo removal of node drain compat changes Decided to remove that in 0.10	2019-07-01 15:12:01 -05:00
Preetha Appan	3345ce3ba4	Infer content type in alloc fs stat endpoint	2019-06-28 20:31:28 -05:00
Preetha Appan	f6fc5d40d1	one more drain test	2019-06-26 17:33:51 -05:00
Preetha Appan	67bf66efc6	remove now unneeded test	2019-06-26 16:59:23 -05:00
Preetha Appan	10e7d6df6d	Remove compat code associated with many previous versions of nomad This removes compat code for namespaces (0.7), Drain(0.8) and other older features from releases older than Nomad 0.7	2019-06-25 19:05:25 -05:00
Chris Baker	3429cf39ed	api: return X-Nomad-Index header on allocation stop	2019-06-21 16:20:06 +00:00
Chris Baker	59fac48d92	alloc lifecycle: 404 when attempting to stop non-existent allocation	2019-06-20 21:27:22 +00:00
Mahmood Ali	b209584dce	Merge pull request #5726 from hashicorp/b-plugins-via-init Use init() to handle plugin invocation	2019-06-18 21:09:03 -04:00
Mahmood Ali	e07413c420	Avoid de-registering slowly restored services When a nomad client restarts/upgraded, nomad restores state from running task and starts the sync loop. If sync loop runs early, it may deregister services from Consul prematurely even when Consul has the running service as healthy. This is not ideal, as re-registering the service means potentially waiting a whole service health check interval before declaring the service healthy. We attempt to mitigate this by introducing an initialization probation period. During this time, we only deregister services and checks that were explicitly deregistered, and leave unrecognized ones alone. This serves as a grace period for restoring to complete, or for operators to restore should they recognize they restored with the wrong nomad data directory.	2019-06-14 11:15:21 -04:00
Mahmood Ali	962921f86c	Use init to handle plugin invocation Currently, nomad "plugin" processes (e.g. executor, logmon, docker_logger) are started as CLI commands to be handled by command CLI framework. Plugin launchers use `discover.NomadBinary()` to identify the binary and start it. This has a few downsides: The trivial one is that when running tests, one must re-compile the nomad binary as the tests need to invoke the nomad executable to start plugin. This is frequently overlooked, resulting in puzzlement. The more significant issue with `executor` in particular is in relation to external driver: * Plugin must identify the path of invoking nomad binary, which is not trivial; `discover.NomadBinary()` now returns the path to the plugin rather than to nomad, preventing external drivers from launching executors. * The external driver may get a different version of executor than it expects (especially if we make a binary incompatible change in future). This commit addresses both downsides by having the plugin invocation handling through an `init()` call, similar to how libcontainer init handler is done in [1] and recommended by libcontainer [2]. `init()` will be invoked and handled properly in tests and external drivers. For external drivers, this change will cause external drivers to launch the executor that's compiled against. There are a couple of downsides to this approach: * These specific packages (i.e. executor, logmon, and dockerlog) need to be careful in use of `init()`, package initializers. Must avoid having command execution rely on any other init in the package. I prefixed files with `z_` (golang processes files in lexical order), but ensured we don't depend on order. * The command handling is spread in multiple packages making it a bit less obvious how plugin starts are handled. [1] drivers/shared/executor/libcontainer_nsenter_linux.go [2] `eb4aeed24f/libcontainer (using-libcontainer)`	2019-06-13 16:48:01 -04:00
Jasmine Dahilig	ed9740db10	Merge pull request #5664 from hashicorp/f-http-hcl-region backfill region from hcl for jobUpdate and jobPlan	2019-06-13 12:25:01 -07:00
Jasmine Dahilig	51e141be7a	backfill region from job hcl in jobUpdate and jobPlan endpoints - updated region in job metadata that gets persisted to nomad datastore - fixed many unrelated unit tests that used an invalid region value (they previously passed because hcl wasn't getting picked up and the job would default to global region)	2019-06-13 08:03:16 -07:00
Danielle	b7fc81031b	Merge pull request #5829 from hashicorp/dani/b-5819 consul: Include port-label in service registration	2019-06-13 16:20:45 +02:00
Danielle Lancashire	8112177503	consul: Include port-label in service registration It is possible to provide multiple identically named services with different port assignments in a Nomad configuration. We introduced a regression when migrating to stable service identifiers where multiple services with the same name would conflict, and the last definition would take precedence. This commit includes the port label in the stable service identifier to allow the previous behaviour where this was supported, for example providing: ```hcl service { name = "redis-cache" tags = ["global", "cache"] port = "db" check { name = "alive" type = "tcp" interval = "10s" timeout = "2s" } } service { name = "redis-cache" tags = ["global", "foo"] port = "foo" check { name = "alive" type = "tcp" port = "db" interval = "10s" timeout = "2s" } } service { name = "redis-cache" tags = ["global", "bar"] port = "bar" check { name = "alive" type = "tcp" port = "db" interval = "10s" timeout = "2s" } } ``` in a nomad task definition is now completely valid. Each service definition with the same name must still have a unique port label however.	2019-06-13 15:24:54 +02:00
Nick Ethier	1b7fa4fe29	Optional Consul service tags for nomad server and agent services (#5706 ) Optional Consul service tags for nomad server and agent services	2019-06-13 09:00:35 -04:00
Preetha	8a98817fe4	Merge pull request #5820 from hashicorp/r-assorted-changes-20190612_1 Assorted minor changes	2019-06-12 10:33:16 -05:00
Danielle Lancashire	ae8bb7365a	alloc-lifecycle: Fix restart with empty body Currently when you submit a manual request to the alloc lifecycle API with a version of Curl that will submit empty bodies, the alloc restart api will fail with an EOF error. This behaviour is undesired, as it is reasonable to not submit a body at all when restarting an entire allocation rather than an individual task. This fixes it by ignoring EOF (not unexpected EOF) errors and treating them as entire task restarts.	2019-06-12 15:35:00 +02:00
Mahmood Ali	b00d1f1e10	tests: parsing dir should be equivalent to parsing individual files	2019-06-12 08:19:09 -04:00
Mahmood Ali	3d8f2622e9	tests: avoid manipulating package variables	2019-06-12 08:16:32 -04:00
Lang Martin	3837c9b021	command add comments re: defaults to LoadConfig	2019-06-11 22:35:43 -04:00
Lang Martin	02aae678be	config_parse_test update comment for accuracy	2019-06-11 22:30:20 -04:00
Lang Martin	7aa95ebd6f	config_parse get rid of ParseConfigDefault	2019-06-11 22:00:23 -04:00
Lang Martin	9b0411af6a	Revert "config explicitly merge defaults once when using a config directory" This reverts commit 006a9a1d454739eee21b7d8abb8b7aef1353b648.	2019-06-11 22:00:23 -04:00
Lang Martin	1e2f87a11e	agent/testdata add a configuration directory for testing	2019-06-11 16:34:04 -04:00
Lang Martin	fe8a4781d8	config merge maintains *HCL string fields used for duration conversion	2019-06-11 16:34:04 -04:00
Lang Martin	3bd153690b	config_parse_test, handle defaults	2019-06-11 16:34:04 -04:00
Lang Martin	c97dd512f4	config explicitly merge defaults once when using a config directory	2019-06-11 15:42:27 -04:00
Lang Martin	ad56434472	config_parse split out defaults from ParseConfig	2019-06-11 15:42:27 -04:00
Lang Martin	28cf8eddfe	config parse_test check for string coercion in client.meta	2019-06-10 13:12:38 -04:00
Michael Schurter	073893f529	nomad: disable service+batch preemption by default Enterprise only. Disable preemption for service and batch jobs by default. Maintain backward compatibility in a x.y.Z release. Consider switching the default for new clusters in the future.	2019-06-04 15:54:50 -07:00
Mahmood Ali	a9f81f2daa	client config flag to disable remote exec This exposes a client flag to disable nomad remote exec support in environments where access to tasks ought to be restricted. I used `disable_remote_exec` client flag that defaults to allowing remote exec. Opted for a client config that can be used to disable remote exec globally, or to a subset of the cluster if necessary.	2019-06-03 15:31:39 -04:00
Nomad Release bot	6d6bc59732	Generate files for 0.9.2-rc1 release	2019-05-22 19:29:30 +00:00
Lang Martin	16cd0beb9b	api use job.update as the default for taskgroup.update	2019-05-22 12:34:57 -04:00
Lang Martin	b5fd735960	add update AutoPromote bool	2019-05-22 12:32:08 -04:00
Mahmood Ali	f5a4fcac3f	Restore tty start before emitting errors Otherwise, the error message appears indented unexpectedly.	2019-05-17 11:58:31 -04:00
Mahmood Ali	1293a8511c	Fix typos and comments Co-Authored-By: Michael Schurter <michael.schurter@gmail.com>	2019-05-16 17:06:03 -04:00
Mahmood Ali	689453bd3a	Implement escaping character for alloc exec	2019-05-16 16:22:52 -04:00
Preetha	2dcd4291f8	Merge pull request #5702 from hashicorp/f-filter-by-create-index Filter deployments by create index	2019-05-15 21:50:41 -05:00
Preetha Appan	2c5c16111e	Add -all to help text and flags	2019-05-15 21:16:57 -05:00
Mahmood Ali	bfd229918a	fix typo	2019-05-15 13:01:05 -04:00
Mahmood Ali	c057c6dc44	Merge pull request #5633 from hashicorp/f-nomad-exec-parts-02-cli nomad exec part 2: CLI	2019-05-15 12:50:42 -04:00
Mahmood Ali	778c7a1982	Handle Terminal Output state in Windows	2019-05-15 10:37:37 -04:00
Mahmood Ali	1104827671	Add clarifying comments for negating `-i` or `-t`	2019-05-15 10:35:12 -04:00
Preetha Appan	4f9c8ea068	Fix one more test set up	2019-05-14 16:13:41 -05:00
Nick Ethier	ade97bc91f	fixup #5172 and rebase against master	2019-05-14 14:37:34 -04:00
Nick Ethier	cab6a95668	Merge branch 'master' into pr/5172 * master: (912 commits) Update redirects.txt Added redirect for Spark guide link client: log when server list changes docs: mention regression in task config validation fix update to changelog update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665 typo: "atleast" -> "at least" implement nomad exec for rkt docs: fixed typo use pty/tty terminology similar to github.com/kr/pty vendor github.com/kr/pty drivers: implement streaming exec for executor based drivers executors: implement streaming exec executor: scaffolding for executor grpc handling client: expose allocated memory per task client improve a comment in updateNetworks stalebot: Add 'thinking' as an exempt label (#5684) Added Sparrow link update links to use new canonical location Add redirects for restructing done in GH-5667 ...	2019-05-14 14:10:33 -04:00
Preetha Appan	4d3f74e161	Fix test setup to have correct jobcreateindex for deployments	2019-05-13 18:53:47 -05:00
Preetha Appan	07690d6f9e	Add flag similar to --all for allocs to be able to filter deployments by latest	2019-05-13 18:33:41 -05:00
Mahmood Ali	2ddc39973d	Merge pull request #5668 from hashicorp/flaky-test-20190430 fix flaky test by allowing for call invocation overhead	2019-05-13 12:33:44 -04:00
Mahmood Ali	dd8762e348	typo: "atleast" -> "at least"	2019-05-13 10:01:19 -04:00
Mahmood Ali	513303347c	add CLI commands for nomad exec	2019-05-12 22:04:50 -04:00
Mahmood Ali	919827f2df	Merge pull request #5632 from hashicorp/f-nomad-exec-parts-01-base nomad exec part 1: plumbing and docker driver	2019-05-09 18:09:27 -04:00
Mahmood Ali	66982a1660	agent: add websocket handler for nomad exec This adds a websocket endpoint for handling `nomad exec`. The endpoint is a websocket interface, as we require a bi-directional streaming (to handle both input and output), which is not very appropriate for plain HTTP 1.0. Using websocket makes implementing the web ui a bit simpler. I considered using golang http hijack capability to treat http request as a plain connection, but the web interface would be too complicated potentially. Furthermore, the API endpoint operates against the raw core nomad exec streaming datastructures, defined in protobuf, with json serializer. Our APIs use json interfaces in general, and protobuf generates json friendly golang structs. Reusing the structs here simplify interface and reduce conversion overhead.	2019-05-09 16:49:08 -04:00
Danielle	4a22fa0ee2	Merge pull request #5536 from hashicorp/dani/consul Consul Catalog Integration Fixes	2019-05-09 13:22:54 +02:00
Danielle Lancashire	0da2924b2a	consul: Document example check id	2019-05-09 13:22:22 +02:00
Mahmood Ali	d405fcb093	fix flaky test by allowing for call invocation overhead	2019-05-08 18:04:37 -04:00
Preetha	1538913a2a	Merge pull request #5628 from hashicorp/f-preemption-config Add config to disable preemption for batch/service jobs	2019-05-06 15:40:35 -05:00
Lang Martin	9f3f11df97	Merge pull request #5601 from hashicorp/b-config-parse-direct-hcl config parse direct hcl	2019-05-06 12:05:19 -04:00
Preetha Appan	ad3c263d3f	Rename to match system scheduler config. Also added docs	2019-05-03 14:06:12 -05:00
Danielle Lancashire	d824e00d1a	consul: Do not deregister external checks This commit causes sync to skip deregistering checks that are not managed by nomad, such as service maintenance mode checks. This is handled in the same way as service registrations - by doing a Nomad specific prefix match.	2019-05-02 16:54:18 +02:00
Danielle Lancashire	0b8e85118e	consul: Use a stable identifier for services The current implementation of Service Registration uses a hash of the nomad-internal state of a service to register it with Consul, this means that any update to the service invalidates this name and we then deregister, and recreate the service in Consul. While this behaviour slightly simplifies reasoning about service registration, this becomes problematic when we add consul health checks to a service. When the service is re-registered, so are the checks, which default to failing for at least one check period. This commit migrates us to using a stable identifier based on the allocation, task, and service identifiers, and uses the difference between the remote and local state to decide when to push updates. It uses the existing hashing mechanic to decide when UpdateTask should regenerate service registrations for providing to Sync, but this should be removable as part of a future refactor. It additionally introduces the _nomad-check- prefix for check definitions, to allow for future allowing of consul features like maintenance mode.	2019-05-02 16:54:18 +02:00
Chris Baker	a40477a7b8	test case for 5540 (#5590 ) * client/metrics: modified metrics to use (updated) client copy of allocation instead of (unupdated) server copy * updated armon/go-metrics to address race condition in DisplayMetrics	2019-04-30 10:31:35 -04:00
Lang Martin	2e643d26a2	config_parse leave the *HCL strings in place after converting times	2019-04-30 10:30:53 -04:00
Lang Martin	3ba6095fe3	config_parse_test additional config confirmation w/ sample json	2019-04-30 10:30:53 -04:00
Lang Martin	fe9b31dcf9	config comment for future changes	2019-04-30 10:30:53 -04:00
Lang Martin	598112a1cc	tag HCL bookkeeping keys with json:"-" to keep them out of the api	2019-04-30 10:29:14 -04:00
Lang Martin	43407cffe3	config_parse_test remove redundant parse direct test	2019-04-30 10:29:14 -04:00
Lang Martin	b8e9c35cd0	config_parse remove unused multi-stage parsing via mapstructure	2019-04-30 10:29:14 -04:00
Lang Martin	1f86770456	config_parse_test test direct hcl parsing	2019-04-30 10:29:14 -04:00
Lang Martin	5ebae65d1a	agent/config, config/* mapstructure tags -> hcl tags	2019-04-30 10:29:14 -04:00
Lang Martin	92fd988c9f	config_parse add new ParseConfigFileDirectHCL - parse by using hcl.Decode directly - handle time.Duration strings in a second pass - report unexpected keys in a third pass	2019-04-30 10:29:14 -04:00
Preetha Appan	6615d5c868	Add config to disable preemption for batch/service jobs	2019-04-29 18:48:07 -05:00
Danielle Lancashire	a8880f9643	alloc_signal: Add autocompletion and cmd tests	2019-04-26 12:47:53 +02:00
Danielle Lancashire	3409e0be89	allocs: Add nomad alloc signal command This command will be used to send a signal to either a single task within an allocation, or all of the tasks if <task-name> is omitted. If the sent signal terminates the allocation, it will be treated as if the allocation has crashed, rather than as if it was operator-terminated. Signal validation is currently handled by the driver itself and nomad does not attempt to restrict or validate them.	2019-04-25 12:43:32 +02:00
Mahmood Ali	60ee243149	fix crash when executor parent nomad process dies Fixes https://github.com/hashicorp/nomad/issues/5593 Executor seems to die unexpectedly after nomad agent dies or is restarted. The crash seems to occur at the first log message after the nomad agent dies. To ease debugging we forward executor log messages to executor.log as well as to Stderr. `go-plugin` sets up plugins with Stderr pointing to a pipe being read by plugin client, the nomad agent in our case[1]. When the nomad agent dies, the pipe is closed, and any subsequent executor logs fail with ErrClosedPipe and SIGPIPE signal. SIGPIPE results in the executor process dying. I considered adding a handler to ignore SIGPIPE, but hc-log library currently panics when logging write operation fails[2]. Thus we opt to revert to v0.8 behavior of exclusively writing logs to executor.log, while we investigate alternative options. [1] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-plugin/client.go#L528-L535 [2] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-hclog/int.go#L320-L323	2019-04-23 09:52:46 -04:00
Danielle	198a838b61	Merge pull request #5512 from hashicorp/dani/f-alloc-stop alloc-lifecycle: nomad alloc stop	2019-04-23 13:05:08 +02:00
Danielle Lancashire	832f607433	allocs: Add nomad alloc stop This adds a `nomad alloc stop` command that can be used to stop and force migrate an allocation to a different node. This is built on top of the AllocUpdateDesiredTransitionRequest and explicitly limits the scope of access to that transition to expose it under the alloc-lifecycle ACL. The API returns the follow up eval that can be used as part of monitoring in the CLI or parsed and used in an external tool.	2019-04-23 12:50:23 +02:00
Michael Schurter	373748a327	Merge pull request #5486 from hashicorp/b-validate-migrate api: fix migrate stanza initialization	2019-04-15 09:44:59 -07:00

2315 commits