open-nomad

Commit Graph

Author	SHA1	Message	Date
Mahmood Ali	a9d5e4c510	scheduler: stopped-yet-running allocs are still running (#10446) * scheduler: stopped-yet-running allocs are still running * scheduler: test new stopped-but-running logic * test: assert nonoverlapping alloc behavior Also add a simpler Wait test helper to improve line numbers and save few lines of code. * docs: tried my best to describe #10446 it's not concise... feedback welcome * scheduler: fix test that allowed overlapping allocs * devices: only free devices when ClientStatus is terminal * test: output nicer failure message if err==nil Co-authored-by: Mahmood Ali <mahmood@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-09-13 12:52:47 -07:00
Luiz Aoqui	a8cc633156	vault: revert support for entity aliases (#12723) After a more detailed analysis of this feature, the approach taken in PR #12449 was found to be not ideal due to poor UX (users are responsible for setting the entity alias they would like to use) and issues around jobs potentially masquerading itself as another Vault entity.	2022-04-22 10:46:34 -04:00
Luiz Aoqui	ab7eb5de6e	Support Vault entity aliases (#12449) Move some common Vault API data struct decoding out of the Vault client so it can be reused in other situations. Make Vault job validation its own function so it's easier to expand it. Rename the `Job.VaultPolicies` method to just `Job.Vault` since it returns the full Vault block, not just their policies. Set `ChangeMode` on `Vault.Canonicalize`. Add some missing tests. Allows specifying an entity alias that will be used by Nomad when deriving the task Vault token. An entity alias assigns an identity to a token, allowing better control and management of Vault clients since all tokens with the same identity alias will now be considered the same client. This helps track Nomad activity in Vault's audit logs and better control over Vault billing. Add support for a new Nomad server configuration to define a default entity alias to be used when deriving Vault tokens. This default value will be used if the task doesn't have an entity alias defined.	2022-04-05 14:18:10 -04:00
Seth Hoenig	2631659551	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
Michael Schurter	e6eff95769	agent: validate reserved_ports are valid Goal is to fix at least one of the causes that can cause a node to be ineligible to receive work: https://github.com/hashicorp/nomad/issues/9506#issuecomment-1002880600	2022-01-12 14:21:47 -08:00
Mahmood Ali	0c2551270a	oversubscription: Add MemoryMaxMB to internal structs Start tracking a new MemoryMaxMB field that represents the maximum memory a task may use in the client. This allows tasks to specify a memory reservation (to be used by scheduler when placing the task) but use excess memory used on the client if the client has any. This commit adds the server tracking for the value, and ensures that allocations AllocatedResource fields include the value.	2021-03-30 16:55:58 -04:00
Nick Ethier	648ade63ad	scheduler: implement scheduling of reserved cores	2021-03-19 00:29:07 -04:00
Charlie Voiselle	0473f35003	Fixup uses of `sanity` (#10187) * Fixup uses of `sanity` * Remove unnecessary comments. These checks are better explained by earlier comments about the context of the test. Per @tgross, moved the tests together to better reinforce the overall shared context. * Update nomad/fsm_test.go	2021-03-16 18:05:08 -04:00
Kris Hicks	d71a90c8a4	Fix some errcheck errors (#9811) * Throw away result of multierror.Append When given a multierror.Error, it is mutated, therefore the return value is not needed. Simplify MergeMultierrorWarnings, use StringBuilder * Hash.Write() never returns an error * Remove error that was always nil * Remove error from Resources.Add signature When this was originally written it could return an error, but that was refactored away, and callers of it as of today never handle the error. * Throw away results of io.Copy during Bridge * Handle errors when computing node class in test	2021-01-14 12:46:35 -08:00
Nick Ethier	e0fb634309	ar: support opting into binding host ports to default network IP (#8321) * ar: support opting into binding host ports to default network IP * fix config plumbing * plumb node address into network resource * struct: only handle network resource upgrade path once	2020-07-06 18:51:46 -04:00
Nick Ethier	a87e91e971	test: fix up testing around host networks	2020-06-19 13:53:31 -04:00
Mahmood Ali	3da74068dd	changelog and fix typo	2020-05-01 13:14:20 -04:00
Mahmood Ali	b9e3cde865	tests and some clean up	2020-05-01 13:13:30 -04:00
Michael Schurter	4c5a0cae35	core: fix node reservation scoring The BinPackIter accounted for node reservations twice when scoring nodes which could bias scores toward nodes with reservations. Pseudo-code for previous algorithm: ``` proposed = reservedResources + sum(allocsResources) available = nodeResources - reservedResources score = 1 - (proposed / available) ``` The node's reserved resources are added to the total resources used by allocations, and then the node's reserved resources are later subtracted from the node's overall resources. The new algorithm is: ``` proposed = sum(allocResources) available = nodeResources - reservedResources score = 1 - (proposed / available) ``` The node's reserved resources are no longer added to the total resources used by allocations. My guess as to how this bug happened is that the resource utilization variable (`util`) is calculated and returned by the `AllocsFit` function which needs to take reserved resources into account as a basic feasibility check. To avoid re-calculating alloc resource usage (because there may be a large number of allocs), we reused `util` in the `ScoreFit` function. `ScoreFit` properly accounts for reserved resources by subtracting them from the node's overall resources. However since `util` _also_ took reserved resources into account the score would be incorrect. Prior to the fix the added test output: ``` Node: reserved Score: 1.0000 Node: reserved2 Score: 1.0000 Node: no-reserved Score: 0.9741 ``` The scores being 1.0 for both nodes with reserved resources is a good hint something is wrong as they should receive different scores. Upon further inspection the double accounting of reserved resources caused their scores to be >1.0 and clamped. After the fix the added test outputs: ``` Node: no-reserved Score: 0.9741 Node: reserved Score: 0.9480 Node: reserved2 Score: 0.8717 ```	2020-04-15 15:13:30 -07:00
Nick Ethier	6c160df689	fix tests from introducing new struct fields	2019-07-31 01:03:16 -04:00
Alex Dadgar	fbe4d67d1b	fix iops related tests	2018-12-12 14:32:22 -08:00
Alex Dadgar	1e3c3cb287	Deprecate IOPS IOPS have been modelled as a resource since Nomad 0.1 but has never actually been detected and there is no plan in the short term to add detection. This is because IOPS is a bit simplistic of a unit to define the performance requirements from the underlying storage system. In its current state it adds unnecessary confusion and can be removed without impacting any users. This PR leaves IOPS defined at the jobspec parsing level and in the api/ resources since these are the two public uses of the field. These should be considered deprecated and only exist to allow users to stop using them during the Nomad 0.9.x release. In the future, there should be no expectation that the field will exist.	2018-12-06 15:09:26 -08:00
Alex Dadgar	36abd3a3d8	review comments	2018-11-07 10:33:22 -08:00
Alex Dadgar	e3cbb2c82e	allocs fit checks if devices get oversubscribed	2018-11-07 10:33:22 -08:00
Alex Dadgar	01f8e5b95f	renames	2018-10-04 14:57:25 -07:00
Alex Dadgar	bac5cb1e8b	Scheduler uses allocated resources	2018-10-02 17:08:25 -07:00
Alex Dadgar	e1a102f58c	test allocs fit	2018-09-24 13:59:01 -07:00
Alex Dadgar	6dd1c9f49d	Refactor	2018-02-15 13:59:00 -08:00
Michael Schurter	a66c53d45a	Remove `structs` import from `api` Goes a step further and removes structs import from api's tests as well by moving GenerateUUID to its own package.	2017-09-29 10:36:08 -07:00
Alex Dadgar	4173834231	Enable more linters	2017-09-26 15:26:33 -07:00
Armon Dadgar	fc23a4e7e5	structs: sort policies to avoid order dependence for caching	2017-09-04 13:05:36 -07:00
Armon Dadgar	98e0f98f7e	structs: Adding ACL compilation helper	2017-09-04 13:05:35 -07:00
Armon Dadgar	583e654246	structs: cache key helper for policy list	2017-09-04 13:05:35 -07:00
Diptanu Choudhury	e927de02d2	Moved functions to helper from structs	2017-01-18 15:55:14 -08:00
Alex Dadgar	ffefb6d3c1	Fix flaky test	2016-10-27 11:48:00 -07:00
Alex Dadgar	aadc9e3017	Add implicit signal constraint and validate that a driver can handle the signal. Also fixes a bug with plan and implicit constraints by adding them to the job being planned	2016-10-20 13:55:35 -07:00
Diptanu Choudhury	1b3c5e98c8	Renaming LocalDisk to EphemeralDisk (#1710) Renaming LocalDisk to EphemeralDisk	2016-09-14 15:43:42 -07:00
Diptanu Choudhury	64c57d9136	Added a test	2016-08-31 13:40:43 -07:00
Diptanu Choudhury	52e9946da9	Implemented SetPrefferingNodes in stack	2016-08-30 16:17:50 -07:00
Diptanu Choudhury	3dec7cd2c9	Added LocalDisk to diff	2016-08-26 20:38:50 -07:00
Alex Dadgar	9bd9948c5b	Job Register endpoint validates token	2016-08-17 16:25:38 -07:00
Alex Dadgar	034bae90bb	Revert "Remove client status from allocation TerminalStatus" This reverts commit 819e1e4b3967c7029ee8221144666ff460fdd7ed.	2016-04-08 14:22:06 -07:00
Alex Dadgar	09f63fd3c0	Remove client status from allocation TerminalStatus	2016-03-25 12:53:37 -07:00
Alex Dadgar	94522e7bed	Successful allocations are marked as complete instead of dead	2016-03-23 18:08:19 -07:00
Alex Dadgar	92823b71a8	merge	2015-12-16 15:01:15 -08:00
Alex Dadgar	2218a79815	Add garbage collection to jobs	2015-12-16 15:00:45 -08:00
Diptanu Choudhury	0238e27d5d	Fixed the structs test	2015-11-16 13:10:57 -08:00
Armon Dadgar	b213462cb4	Change CPU from float64 to int	2015-09-23 11:14:32 -07:00
Armon Dadgar	c5cf345df7	nomad: fixing tests	2015-09-15 11:12:46 -07:00
Armon Dadgar	cbc9b6dae2	nomad: thread alloc fit failure reason through	2015-09-13 18:38:11 -07:00
Armon Dadgar	1884296ff8	nomad: remove PortsOvercommited in favor of NetworkIndex	2015-09-13 14:56:51 -07:00
Armon Dadgar	293e44474b	nomad: adding helper structs	2015-09-07 15:08:50 -07:00
Armon Dadgar	42f9d4c1b6	nomad: plan supports more than just evict	2015-08-25 16:52:56 -07:00
Armon Dadgar	a30295cef0	nomad: splitting alloc desired and client status	2015-08-25 16:18:37 -07:00
Armon Dadgar	5b2dc385ec	nomad: adding evict state for allocs	2015-08-22 18:27:51 -07:00


54 Commits
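
The pseudo-code in commit 4c5a0cae35 ("core: fix node reservation scoring") can be turned into a small runnable comparison. This is only a sketch of the simplified formulas quoted in that commit message, reduced to a single scalar resource; it is not Nomad's actual `ScoreFit` implementation, and the function names and sample numbers are hypothetical.

```python
def score_double_counted(node_total, reserved, alloc_usage):
    """Previous algorithm as described in the commit message.

    The node's reserved resources are added to the proposed usage AND
    subtracted from the available capacity, so they are counted twice.
    """
    proposed = reserved + sum(alloc_usage)
    available = node_total - reserved
    return 1 - proposed / available


def score_fixed(node_total, reserved, alloc_usage):
    """New algorithm as described in the commit message.

    Reserved resources only reduce the available capacity.
    """
    proposed = sum(alloc_usage)
    available = node_total - reserved
    return 1 - proposed / available


if __name__ == "__main__":
    # Hypothetical node: 1000 units total, 200 reserved, two allocs.
    node_total, reserved, allocs = 1000, 200, [100, 300]
    print("double counted:", score_double_counted(node_total, reserved, allocs))
    print("fixed:         ", score_fixed(node_total, reserved, allocs))
```

With these sample numbers the two functions differ only by `reserved / available`, which is the bias the commit removes.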