open-nomad

Commit Graph

Author	SHA1	Message	Date
Michael Schurter	9e49ed3464	ar: AllocState should not mutate ar.state If ar.state.TaskStates has not been set, set it on the copy of ar.state. That keeps ar.state manipulations in one location and allows AllocState to only acquire read-locks.	2018-10-16 16:56:55 -07:00
Michael Schurter	f279b1d1b1	tests: test logs endpoint against pending task Although the really exciting change is making WaitForRunning return the allocations that it started. This should cut down test boilerplate significantly.	2018-10-16 16:56:55 -07:00
Michael Schurter	dd4227f84a	tests: make a test client/config easier to generate Sadly can't move the fingerprint timeout tweak into the helper due to circular imports.	2018-10-16 16:56:55 -07:00
Michael Schurter	1d747048ea	tests: ensure task state is initialized in NewAR Also expose NoopDB for use in tests.	2018-10-16 16:56:55 -07:00
Michael Schurter	6bcf772f3c	tests: test via ServeMux so http codes are set	2018-10-16 16:56:55 -07:00
Michael Schurter	960f3be76c	client: expose task state to client The interesting decision in this commit was to expose AR's state and not a fully materialized Allocation struct. AR.clientAlloc builds an Alloc that contains the task state, so I considered simply memoizing and exposing that method. However, that would lead to AR having two awkwardly similar methods: - Alloc() - which returns the server-sent alloc - ClientAlloc() - which returns the fully materialized client alloc Since ClientAlloc() could be memoized it would be just as cheap to call as Alloc(), so why not replace Alloc() entirely? Replacing Alloc() entirely would require Update() to immediately materialize the task states on server-sent Allocs as there may have been local task state changes since the server received an Alloc update. This quickly becomes difficult to reason about: should Update hooks use the TaskStates? Are state changes caused by TR Update hooks immediately reflected in the Alloc? Should AR persist its copy of the Alloc? If so, are its TaskStates canonical or the TaskStates on TR? So! Forget that. Let's separate the static Allocation from the dynamic AR & TR state! - AR.Alloc() is for static Allocation access (often for the Job) - AR.AllocState() is for the dynamic AR & TR runtime state (deployment status, task states, etc). If code needs to know the status of a task: AllocState() If code needs to know the names of tasks: Alloc() It should be very easy for a developer to reason about which method they should call and what they can do with the return values.	2018-10-16 16:56:55 -07:00
Michael Schurter	fb4aa74153	client: add comment	2018-10-16 16:56:55 -07:00
Michael Schurter	9a7e6be2b6	client: fix potentially dropped streaming errors	2018-10-16 16:56:55 -07:00
Michael Schurter	4b44b9039b	tr: remove unneeded lock; chan synchronizes access	2018-10-16 16:56:55 -07:00
Michael Schurter	1c9ccdeab5	tests: fix races caused by sharing a buffer httptest.ResponseRecorder exposes a bytes.Buffer which we were reading and writing concurrently to test streaming log APIs. This is a race, so I wrapped the struct in a lock with some helpers.	2018-10-16 16:56:55 -07:00
Michael Schurter	211b96bb5c	tr: fix shutdown/destroy/WaitResult handling Multiple receivers raced for the WaitResult when killing tasks which could lead to a deadlock if the "wrong" receiver won. Wrap handlers in an ugly little proxy to avoid this. At first I wanted to push this into drivers, but the result is tied to the TR's handle lifecycle -- not the lifecycle of an alloc or task.	2018-10-16 16:56:55 -07:00
Michael Schurter	951ed17436	client: do not inspect task state to follow logs "Ask forgiveness, not permission." Instead of peeking at TaskStates (which are no longer updated on the AR.Alloc() view of the world) to only read logs for running tasks, just try to read the logs and improve the error handling if they don't exist. This should make log streaming less dependent on AR/TR behavior. Also fixed a race where the log streamer could exit before reading an error. This caused no logs or errors to be displayed sometimes when an error occurred.	2018-10-16 16:56:55 -07:00
Michael Schurter	2325348053	mock_driver: close waitCh after exiting mock_driver wasn't behaving like other driver handles.	2018-10-16 16:56:55 -07:00
Michael Schurter	8d1419c62b	client: fix accessing alloc runners * GetClientAlloc() gains nothing from using allAllocs() * getAllocatedResources was calling getAllocRunners() twice	2018-10-16 16:56:55 -07:00
Michael Schurter	55ab491801	tr: remove wip comments	2018-10-16 16:56:55 -07:00
Michael Schurter	3ccc091a72	ar: lock around accessing tasks Specify that Alloc() does not return updated task states.	2018-10-16 16:56:55 -07:00
Alex Dadgar	84ce8c3487	extra logging	2018-10-16 16:56:55 -07:00
Alex Dadgar	6f0ed6184b	Fix client reloading and pass the plugin loaders to server and client	2018-10-16 16:56:55 -07:00
Alex Dadgar	183561cf82	Plugin loader initialization	2018-10-16 16:54:12 -07:00
Alex Dadgar	cc76555814	Internal plugin catalog	2018-10-16 16:53:31 -07:00
Nick Ethier	a95cbc38ba	drivers/raw_exec: sync access to task state	2018-10-16 16:53:31 -07:00
Nick Ethier	28e8e5852c	drivers/raw_exec: added unix specific tests	2018-10-16 16:53:31 -07:00
Nick Ethier	352c05cdf4	plugin/drivers: plumb in stdout/stderr paths	2018-10-16 16:53:31 -07:00
Nick Ethier	1f6873806e	raw_exec: move package outside of plugins dir	2018-10-16 16:53:31 -07:00
Nick Ethier	8b876e1cce	fix package references after drivers/base subpackage removed	2018-10-16 16:53:31 -07:00
Nick Ethier	0e3f85222a	driver/raw_exec: port existing raw_exec tests and add some testing utilities	2018-10-16 16:53:31 -07:00
Nick Ethier	8644e8508c	driver/raw_exec: export driver config fields so they are encoded	2018-10-16 16:53:31 -07:00
Nick Ethier	3c17f50b29	lint: remove unused code and fix spelling	2018-10-16 16:53:31 -07:00
Nick Ethier	d9628ff394	driver/raw_exec: more tests and bug fixes added wrapper struct for plugin.ReattachConfig to better handle serialization	2018-10-16 16:53:31 -07:00
Nick Ethier	5617f3615b	driver/raw_exec: initial raw_exec implementation	2018-10-16 16:53:31 -07:00
Nick Ethier	bcc5c4a8bd	clientv2: base driver plugin (#4671) Driver plugin framework to facilitate development of driver plugins. Implementing plugins only need to implement the DriverPlugin interface. The framework proxies this interface to the go-plugin GRPC interface generated from the driver.proto spec. A testing harness is provided to allow implementing drivers to test the full lifecycle of the driver plugin. An example use: func TestMyDriver(t *testing.T) { harness := NewDriverHarness(t, &MyDriverPlugin{}) // The harness implements the DriverPlugin interface and can be used as such taskHandle, err := harness.StartTask(...) }	2018-10-16 16:53:31 -07:00
Michael Schurter	62c1285afc	tr: add comments and cleanup call signature From review comments on #4649 left post-merge.	2018-10-16 16:53:31 -07:00
Nick Ethier	5dee1141d1	executor v2 (#4656) * client/executor: refactor client to remove interpolation * executor: POC libcontainer based executor * vendor: use hashicorp libcontainer fork * vendor: add libcontainer/nsenter dep * executor: updated executor interface to simplify operations * executor: implement logging pipe * logmon: new logmon plugin to manage task logs * driver/executor: use logmon for log management * executor: fix tests and windows build * executor: fix logging key names * executor: fix test failures * executor: add config field to toggle between using libcontainer and standard executors * logmon: use discover utility to discover nomad executable * executor: only call libcontainer-shim on main in linux * logmon: use separate path configs for stdout/stderr fifos * executor: windows fixes * executor: created reusable pid stats collection utility that can be used in an executor * executor: update fifo.Open calls * executor: fix build * remove executor from docker driver * executor: Shutdown func to kill and cleanup executor and its children * executor: move linux specific universal executor funcs to separate file * move logmon initialization to a task runner hook * client: doc fixes and renaming from code review * taskrunner: use shared config struct for logmon fifo fields * taskrunner: logmon only needs to be started once per task	2018-10-16 16:53:31 -07:00
Michael Schurter	e6e2930a00	tr: implement stats collection hook Tested except for the net/rpc specific error case which may need changing in the gRPC world.	2018-10-16 16:53:31 -07:00
Michael Schurter	86bd329539	fix build errors post merges	2018-10-16 16:53:31 -07:00
Michael Schurter	a977e22028	test: cleanup mock consul service client Updated to hclog. It exposed fields that required an unexported lock to access. Created a getter method instead. Only old allocrunner currently used this feature.	2018-10-16 16:53:31 -07:00
Michael Schurter	6f92b04226	health_hook: simplify locking; test thoroughly Use doneCh like @dadgar suggested in the original PR. Thoroughly test hook as concurrent Update calls make for a tricky concurrency problem.	2018-10-16 16:53:30 -07:00
Alex Dadgar	cebfead6bc	add logger back	2018-10-16 16:53:30 -07:00
Nick Ethier	03422aa529	fifo: add new fifo package for named pipes (#4665) * fifo: add new fifo package for named pipes	2018-10-16 16:53:30 -07:00
Alex Dadgar	8504505c0d	client uses passed logger and fix fingerprinters	2018-10-16 16:53:30 -07:00
Nick Ethier	66ff12e5f7	Update runc/libcontainer and friends (#4655) * vendor: bump libcontainer and docker to remove Sirupsen imports * vendor: fix bad vendoring of archive package * vendor: fix api changes to cgroups in executor * vendor: fix docker api changes * vendor: update github.com/Azure/go-ansiterm to use non capitalized logrus import	2018-10-16 16:53:30 -07:00
Michael Schurter	195b8127fb	health_hook: fix panic and add tests Still more testing to do, but I want to get this panic fixed ASAP. All new tests pass with -race	2018-10-16 16:53:30 -07:00
Michael Schurter	64efc3d301	Emit events before long operations Append when there's nothing blocking between appending and sending an update to the server.	2018-10-16 16:53:30 -07:00
Michael Schurter	a2b696c4cf	Use a semaphore to block until watcher exits	2018-10-16 16:53:30 -07:00
Michael Schurter	a73162c977	ar: use multierror in update hook loop Make it match TaskRunner update hook behavior	2018-10-16 16:53:30 -07:00
Michael Schurter	a7b427718c	tr: refactor EmitEvents into Emit+Append * UpdateState: set state, append event, persist, update servers * EmitEvent: append event, persist, update servers * AppendEvent: append event, persist AppendEvent may not even have to persist, but for the sake of correctness I'm going with that for now.	2018-10-16 16:53:30 -07:00
Michael Schurter	93f3ac9ed6	ar: create health setting shim for health watcher	2018-10-16 16:53:30 -07:00
Michael Schurter	4d5aaac6d2	fix detection of task transitioning to running	2018-10-16 16:53:30 -07:00
Michael Schurter	4136e59f79	arv2: implement alloc health watching Also remove initial alloc from broadcaster as it just caused useless extra processing.	2018-10-16 16:53:30 -07:00
Michael Schurter	5c5c6dc41b	refactor ar hooks into their own files minimize passed dependencies to ease testing	2018-10-16 16:53:30 -07:00


12812 Commits