open-nomad

Author	SHA1	Message	Date
Nick Ethier	7de0bec8ab	client/cni: updated comments and simplified logic to auto download plugins	2019-07-31 01:04:10 -04:00
Nick Ethier	b16640c50d	Apply suggestions from code review Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>	2019-07-31 01:04:10 -04:00
Nick Ethier	321d10a041	client: remove debugging lines	2019-07-31 01:04:09 -04:00
Nick Ethier	af6b191963	client: add autofetch for CNI plugins	2019-07-31 01:04:09 -04:00
Nick Ethier	1e9dd1b193	remove unused file	2019-07-31 01:04:09 -04:00
Nick Ethier	09a4cfd8d7	fix failing tests	2019-07-31 01:04:07 -04:00
Nick Ethier	ef83f0831b	ar: plumb client config for networking into the network hook	2019-07-31 01:04:06 -04:00
Nick Ethier	af66a35924	networking: Add new bridge networking mode implementation	2019-07-31 01:04:06 -04:00
Michael Schurter	fb487358fb	connect: add group.service stanza support	2019-07-31 01:04:05 -04:00
Nick Ethier	63c5504d56	ar: fix lint errors	2019-07-31 01:03:19 -04:00
Nick Ethier	e312201d18	ar: rearrange network hook to support building on windows	2019-07-31 01:03:19 -04:00
Nick Ethier	370533c9c7	ar: fix test that failed due to error renaming	2019-07-31 01:03:19 -04:00
Nick Ethier	2d60ef64d9	plugins/driver: make DriverNetworkManager interface optional	2019-07-31 01:03:19 -04:00
Nick Ethier	f87e7e9c9a	ar: plumb error handling into alloc runner hook initialization	2019-07-31 01:03:18 -04:00
Nick Ethier	ef1795b344	ar: add tests for network hook	2019-07-31 01:03:18 -04:00
Nick Ethier	15989bba8e	ar: cleanup lint errors	2019-07-31 01:03:18 -04:00
Nick Ethier	220cba3e7e	ar: move linux specific code to it's own file and add tests	2019-07-31 01:03:18 -04:00
Nick Ethier	548f78ef15	ar: initial driver based network management	2019-07-31 01:03:17 -04:00
Nick Ethier	66c514a388	Add network lifecycle management Adds a new Prerun and Postrun hooks to manage set up of network namespaces on linux. Work still needs to be done to make the code platform agnostic and support Docker style network initalization.	2019-07-31 01:03:17 -04:00
Preetha Appan	d048029b5a	remove generated code and change version to 0.10.0	2019-07-30 15:56:05 -05:00
Nomad Release bot	e39fb11531	Generate files for 0.9.4 release	2019-07-30 19:05:18 +00:00
Preetha Appan	6b4c40f5a8	remove generated code	2019-07-23 12:07:49 -05:00
Nomad Release bot	04187c8b86	Generate files for 0.9.4-rc1 release	2019-07-22 21:42:36 +00:00
Michael Schurter	d90680021e	logmon: fix comment formattinglogmon: fix comment formattinglogmon: fix comment formattinglogmon: fix comment formattinglogmon: fix comment formatting	2019-07-22 13:05:01 -07:00
Michael Schurter	e37bc3513c	logmon: ensure errors are still handled properly ...and add a comment to switch back to the old error handling once we switch to Go 1.12.	2019-07-22 12:49:48 -07:00
Danielle Lancashire	1bcbbbfbe6	logmon: Workaround golang/go#29119 There's a bug in go1.11 that causes some io operations on windows to return incorrect errors for some cases when Stat-ing files. To avoid upgrading to go1.12 in a point release, here we loosen up the cases where we will attempt to create fifos, and add some logging of underlying stat errors to help with debugging.	2019-07-22 18:28:12 +02:00
Jasmine Dahilig	2157f6ddf1	add formatting for hcl parsing error messages (#5972 )	2019-07-19 10:04:39 -07:00
Mahmood Ali	cd6f1d3102	Update consul-template dependency to latest To pick up the fix in https://github.com/hashicorp/consul-template/pull/1231 .	2019-07-18 07:32:03 +07:00
Mahmood Ali	8a82260319	log unrecoverable errors	2019-07-17 11:01:59 +07:00
Mahmood Ali	1a299c7b28	client/taskrunner: fix stats stats retry logic Previously, if a channel is closed, we retry the Stats call. But, if that call fails, we go in a backoff loop without calling Stats ever again. Here, we use a utility function for calling driverHandle.Stats call that retries as one expects. I aimed to preserve the logging formats but made small improvements as I saw fit.	2019-07-11 13:58:07 +08:00
Preetha Appan	7d645c5ad9	Test file for detect content type that satisfies linter and encoding	2019-07-10 11:42:04 -05:00
Preetha Appan	ef9a71c68b	code review feedback	2019-07-10 10:41:06 -05:00
Preetha Appan	990e468edc	Populate task event struct with kill timeout This makes for a nicer task event message	2019-07-09 09:37:09 -05:00
Preetha Appan	108a292cc0	fix linting failure in test case file	2019-07-08 11:29:12 -05:00
Michael Lange	b2e9570075	Use consistent casing in the JSON representation of the AllocFileInfo struct	2019-07-02 17:27:31 -07:00
Preetha Appan	8495fb9055	Added additional test cases and fixed go test case	2019-07-02 13:25:29 -05:00
Mahmood Ali	a97d451ac7	Merge pull request #5905 from hashicorp/b-ar-failed-prestart Fail alloc if alloc runner prestart hooks fail	2019-07-02 20:25:53 +08:00
Danielle	c6872cdf12	Merge pull request #5864 from hashicorp/dani/win-pipe-cleaner windows: Fix restarts using the raw_exec driver	2019-07-02 13:58:56 +02:00
Danielle Lancashire	e20300313f	fifo: Safer access to Conn	2019-07-02 13:12:54 +02:00
Mahmood Ali	f10201c102	run post-run/post-stop task runner hooks Handle when prestart failed while restoring a task, to prevent accidentally leaking consul/logmon processes.	2019-07-02 18:38:32 +08:00
Mahmood Ali	4afd7835e3	Fail alloc if alloc runner prestart hooks fail When an alloc runner prestart hook fails, the task runners aren't invoked and they remain in a pending state. This leads to terrible results, some of which are: * Lockup in GC process as reported in https://github.com/hashicorp/nomad/pull/5861 * Lockup in shutdown process as TR.Shutdown() waits for WaitCh to be closed * Alloc not being restarted/rescheduled to another node (as it's still in pending state) * Unexpected restart of alloc on a client restart, potentially days/weeks after alloc expected start time! Here, we treat all tasks to have failed if alloc runner prestart hook fails. This fixes the lockups, and permits the alloc to be rescheduled on another node. While it's desirable to retry alloc runner in such failures, I opted to treat it out of scope. I'm afraid of some subtles about alloc and task runners and their idempotency that's better handled in a follow up PR. This might be one of the root causes for https://github.com/hashicorp/nomad/issues/5840 .	2019-07-02 18:35:47 +08:00
Mahmood Ali	7614b8f09e	Merge pull request #5890 from hashicorp/b-dont-start-completed-allocs-2 task runner to avoid running task if terminal	2019-07-02 15:31:17 +08:00
Mahmood Ali	7bfad051b9	address review comments	2019-07-02 14:53:50 +08:00
Mahmood Ali	c0c00ecc07	Merge pull request #5906 from hashicorp/b-alloc-stale-updates client: defensive against getting stale alloc updates	2019-07-02 12:40:17 +08:00
Preetha Appan	c09342903b	Improve test cases for detecting content type	2019-07-01 16:24:48 -05:00
Danielle Lancashire	688f82f07d	fifo: Close connections and cleanup lock handling	2019-07-01 14:14:29 +02:00
Danielle Lancashire	2c7d1f1b99	logmon: Add windows compatibility test	2019-07-01 14:14:06 +02:00
Mahmood Ali	c5f5a1fcb9	client: defensive against getting stale alloc updates When fetching node alloc assignments, be defensive against a stale read before killing local nodes allocs. The bug is when both client and servers are restarting and the client requests the node allocation for the node, it might get stale data as server hasn't finished applying all the restored raft transaction to store. Consequently, client would kill and destroy the alloc locally, just to fetch it again moments later when server store is up to date. The bug can be reproduced quite reliably with single node setup (configured with persistence). I suspect it's too edge-casey to occur in production cluster with multiple servers, but we may need to examine leader failover scenarios more closely. In this commit, we only remove and destroy allocs if the removal index is more recent than the alloc index. This seems like a cheap resiliency fix we already use for detecting alloc updates. A more proper fix would be to ensure that a nomad server only serves RPC calls when state store is fully restored or up to date in leadership transition cases.	2019-06-29 04:17:35 -05:00
Preetha Appan	3345ce3ba4	Infer content type in alloc fs stat endpoint	2019-06-28 20:31:28 -05:00
Danielle Lancashire	e1151f743b	appveyor: Run logmon tests	2019-06-28 16:01:41 +02:00

1 2 3 4 5 ...

3820 commits