open-nomad

Author	SHA1	Message	Date
Michael Schurter	cdcefd0908	Use the Service.Hash() method in agent service ids The allocID and taskName parameters are useless for agents, but it's still nice to reuse the same hash method for agent and task services. This brings in the lowercase mode for the agent hash as well.	2017-12-11 16:50:15 -08:00
Michael Schurter	4f1002c1a8	Be more defensive in port checks	2017-12-08 12:27:57 -08:00
Michael Schurter	d613e0aaf5	Move service hash logic to Service.Hash method	2017-12-08 12:03:43 -08:00
Michael Schurter	b71edf846f	Hash fields used in task service IDs Fixes #3620 Previously we concatenated tags into task service IDs. This could break deregistration of tag names that contained double //s like some Fabio tags. This change breaks service ID backward compatibility so on upgrade all users' services and checks will be removed and re-added with new IDs. This change has the side effect of including all service fields in the ID's hash, so we no longer have to track PortLabel and AddressMode changes independently.	2017-12-08 12:03:43 -08:00
Michael Schurter	91282315d1	Prevent using port 0 with address_mode=driver	2017-12-08 12:03:43 -08:00
Michael Schurter	4b20441eef	Validate port label for host address mode Also skip getting an address for script checks which don't use them. Fixed a weird invalid reserved port in a TaskRunner test helper as well as a problem with our mock Alloc/Job. Hopefully the latter doesn't cause other tests to fail, but we were referencing an invalid PortLabel and just not catching it before.	2017-12-08 12:03:43 -08:00
Michael Schurter	4347026f83	Test Consul from TaskRunner thoroughly Rely less on the mockConsulServiceClient because the real consul.ServiceClient needs all the testing it can get!	2017-12-08 12:03:00 -08:00
Michael Schurter	4ae115dc59	Allow custom ports for services and checks Fixes #3380 Adds address_mode to checks (but no auto) and allows services and checks to set literal port numbers when using address_mode=driver. This allows SDNs, overlays, etc to advertise internal and host addresses as well as do checks against either.	2017-12-08 12:03:00 -08:00
Jens Herrmann	5680fcccc2	Fix typos in metric names. #3610	2017-12-01 15:24:14 +01:00
Michael Schurter	0aace3d749	Don't set Interval on TTL health checks	2017-10-16 17:35:47 -07:00
Alex Dadgar	4173834231	Enable more linters	2017-09-26 15:26:33 -07:00
Michael Schurter	a844fba8d2	Fix comments: task -> check	2017-09-15 15:19:53 -07:00
Michael Schurter	0f2a3dcec9	Test check watch updates	2017-09-14 16:48:39 -07:00
Michael Schurter	847fe080f6	Rename unhealthy var and fix test indeterminism	2017-09-14 16:48:39 -07:00
Michael Schurter	573a0df03d	Watched -> TriggersRestart Watched was a silly name	2017-09-14 16:48:39 -07:00
Michael Schurter	4ea19baa52	Handle multiple failing checks on a single task Before this commit if a task had 2 checks cause restarts at the same time, both would trigger restarts of the task! This change removes all checks for a task whenever one of them is restarted.	2017-09-14 16:48:39 -07:00
Michael Schurter	73fb71ca10	RestartDelay isn't needed as checks are re-added on restarts @dadgar made the excellent observation in #3105 that TaskRunner removes and re-registers checks on restarts. This means checkWatcher doesn't need to do any internal restart tracking. Individual checks can just remove themselves and be re-added when the task restarts.	2017-09-14 16:48:39 -07:00
Michael Schurter	448ad3945f	Simplify from 2 select loops to one	2017-09-14 16:48:39 -07:00
Michael Schurter	550e631eea	Wrap check watch updates in a struct Reusing checkRestart for both adds/removes and the main check restarting logic was confusing.	2017-09-14 16:48:39 -07:00
Michael Schurter	72e5c0c0aa	Fix whitespace	2017-09-14 16:47:41 -07:00
Michael Schurter	ade29ecbed	Improve check watcher logging and add tests Also expose a mock Consul Agent to allow testing ServiceClient and checkWatcher from TaskRunner without actually talking to a real Consul.	2017-09-14 16:47:41 -07:00
Michael Schurter	a137676358	Add comments and move delay calc to TaskRunner	2017-09-14 16:46:54 -07:00
Michael Schurter	a180c00fc3	on_warning=false -> ignore_warnings=false Treat warnings as unhealthy by default	2017-09-14 16:46:54 -07:00
Michael Schurter	8a87475498	Use existing restart policy infrastructure	2017-09-14 16:46:54 -07:00
Michael Schurter	22690c5f4c	Add check watcher for restarting unhealthy tasks	2017-09-14 16:46:54 -07:00
Michael Schurter	7f6e1f3a9c	Initializing embedded structs is weird	2017-08-17 16:49:14 -07:00
Michael Schurter	0634eef12a	Test createCheckReg	2017-08-17 16:49:14 -07:00
Michael Schurter	bb8d5689d8	Add Header and Method support for HTTP checks	2017-08-17 16:44:21 -07:00
Alex Dadgar	43dff0a11d	Fix integration test	2017-08-14 10:52:49 -07:00
Alex Dadgar	6e20acb503	Merge pull request #2984 from hashicorp/b-tags Fix alloc health with checks using interpolation	2017-08-10 13:07:25 -07:00
Alex Dadgar	c8f74ac43b	Address comments	2017-08-10 13:07:08 -07:00
Alex Dadgar	d86b3977b9	Fix alloc health with checks using interpolation Fixes an issue in which the allocation health watcher was checking for allocations health based on un-interpolated services and checks. Change the interface for retrieving check information from Consul to retrieving all registered services and checks by allocation. In the future this will allow us to output nicer messages. Fixes https://github.com/hashicorp/nomad/issues/2969	2017-08-07 16:27:08 -07:00
Luke Farnell	f0ced87b95	fixed all spelling mistakes for goreport	2017-08-07 17:13:05 -04:00
Michael Schurter	5794e5ece7	Use int32 for atomic ops to avoid alignment issues From https://golang.org/pkg/sync/atomic/#pkg-note-BUG : On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a global variable or in an allocated struct or slice can be relied upon to be 64-bit aligned.	2017-08-04 10:14:16 -07:00
Michael Schurter	d2f8fdcad5	Fix comment	2017-07-25 12:13:05 -07:00
Michael Schurter	3e6231842d	Forgot to setcmdenv This would leak a consul agent	2017-07-25 12:09:57 -07:00
Michael Schurter	4b83eba599	Use seen more conservatively	2017-07-24 16:48:40 -07:00
Michael Schurter	cdf138eb27	Always increment failures... ...as it's used in calculating the backoff	2017-07-24 15:37:53 -07:00
Michael Schurter	809724ad8d	Track whether Consul has ever been seen Need a way to squelch Consul operation errors on shutdown. If it's never been seen don't log errors about deregs failing.	2017-07-24 12:12:02 -07:00
Michael Schurter	edbe62a879	Synchronously deregister agent on shutdown Fixes #2891 Previously the agent services and checks were being asynchronously deregistered on shutdown, so it was a race between the sync goroutine deregistering them and Nomad shutting down. This switches to synchronously deregister agent services and checks which doesn't really have a downside since the sync goroutine's retry behavior doesn't help on shutdown anyway.	2017-07-24 11:40:37 -07:00
Alex Dadgar	553bc91725	Parallel client tests (#2890) * alloc_runner * Random tests * parallel task_runner and no exec compatible check * Parallel client * Fail fast and use random ports * Fix docker port mapping * Make concurrent pull less timing dependent * up parallel * Fixes * don't build chroots in parallel on travis * Reduce parallelism on travis with lxc/rkt * make java test app not run forever * drop parallelism a little * use docker ports that are out of the os's ephemeral port range * Limit even more on travis * rkt deadline	2017-07-22 19:04:36 -07:00
Alex Dadgar	4dd5d943c7	remove root requirement on consul integration check	2017-07-21 19:32:41 -07:00
Michael Schurter	125a3fb2f9	Error -> Errof	2017-07-19 10:00:57 -07:00
Michael Schurter	99d1486f32	Never remove unknown agent services Fixes #2827 This is a tradeoff. The pro is that you can run separate client and server agents on the same node and advertise both. The con is that if a Nomad agent crashes and isn't restarted on that node in the same mode its entry will not be cleaned up. That con scenario seems far less likely to occur than the scenario on the pro side, and even if we do leak an agent entry the checks will be failing so nothing should attempt to use it.	2017-07-18 13:23:01 -07:00
Alex Dadgar	bf2dafb8e9	check id method name changed	2017-07-07 12:15:09 -07:00
Alex Dadgar	067ed86a47	Client watches for allocation health using task state and Consul checks This PR adds watching of allocation health at the client. The client can watch for health based on the tasks running on time and also based on the consul checks passing.	2017-07-07 12:10:04 -07:00
Michael Schurter	a863ead30e	Fix test error formats	2017-06-26 12:53:43 -07:00
Michael Schurter	9da78ae25f	Remove debug logging	2017-06-21 17:19:08 -07:00
Michael Schurter	c0eff81383	Fix Service.AddressMode changes during task updates	2017-06-21 17:19:08 -07:00
Michael Schurter	67d154a274	Test driver network advertisement and checks	2017-06-21 17:19:08 -07:00

140 commits