Commit graph

14646 commits

Author SHA1 Message Date
Mahmood Ali 902eed4bf9 clarify cryptic log line 2019-04-19 09:31:43 -04:00
Mahmood Ali f74d60439f client: log detected driver health state
Noticed that the `detected drivers` log line was misleading: when a driver
doesn't fingerprint before the timeout, its health status is the empty
string `""`, which we would mark as detected.

Now, we log all drivers along with their state to ease driver
fingerprint debugging.
2019-04-19 09:15:25 -04:00
Mahmood Ali 6bdc9860b7 client: avoid registering node twice right away
I noticed that `watchNodeUpdates()` calls `retryRegisterNode()` almost
immediately after `registerAndHeartbeat()` (well, after 5 seconds).

This call is unnecessary and made debugging a bit harder. So here, we
ensure that we only re-register the node for new node events, not for
the initial registration.
2019-04-19 09:12:50 -04:00
Mahmood Ali f82ea8824f client: wait for batched driver updates
Here we retain the 0.8.7 behavior of waiting for driver fingerprints before
registering a node, with some timeout. This is needed for system jobs,
as system job scheduling for a node occurs at node registration, and the
race might mean that a system job does not get placed on the node because
of missing drivers.

The timeout isn't strictly necessary, but this raises it to 1 minute, as
that is closer to blocking indefinitely than 1 second is. We need to keep
the value high enough to capture as many drivers/devices as possible, but
low enough that it doesn't risk blocking too long due to a misbehaving plugin.

Fixes https://github.com/hashicorp/nomad/issues/5579
2019-04-19 09:00:24 -04:00
Preetha 4fdd82c601
Merge pull request #5580 from hashicorp/f-api-preemption-info
Add preemption related fields to AllocationListStub
2019-04-18 18:38:25 -07:00
Preetha Appan 22109d1e20
Add preemption related fields to AllocationListStub 2019-04-18 10:36:44 -05:00
Danielle 72862db778
Merge pull request #5572 from hashicorp/dani/b-docker-volumes
Switch to pre-0.9 behaviour for handling volumes
2019-04-18 15:48:23 +02:00
Danielle be7daaaf15
Merge pull request #5573 from hashicorp/dani/update-vol-docs
docs: Clarify docker volume behaviour
2019-04-18 14:30:16 +02:00
Danielle Lancashire a096a7f112 Switch to pre-0.9 behaviour for handling volumes
In Nomad 0.9, we made volume driver handling the same for `""` and
`"local"` volumes. Prior to Nomad 0.9, however, these had slightly different
behaviour for relative paths and named volumes.

Prior to 0.9 the empty string would expand relative paths within the task
dir, and `"local"` volumes that are not absolute paths would be treated
as docker named volumes.

This commit reverts to the previous behaviour as follows:

| Nomad Version | Driver  | Volume Spec      | Behaviour                 |
|---------------|---------|------------------|---------------------------|
| all           | ""      | testing:/testing | allocdir/testing          |
| 0.8.7         | "local" | testing:/testing | "testing" as named volume |
| 0.9.0         | "local" | testing:/testing | allocdir/testing          |
| 0.9.1         | "local" | testing:/testing | "testing" as named volume |
2019-04-18 14:28:45 +02:00
Chris Baker 338d4e989d
Merge pull request #5559 from ArangoGutierrez/website_docs_singularity
list singularity as a community driver
2019-04-17 12:42:29 -04:00
Charlie Voiselle 7f01244ece
fixed header level 2019-04-17 10:12:43 -04:00
Danielle Lancashire 1e0d3ffe24 docs: Clarify docker volume behaviour 2019-04-17 11:31:55 +02:00
Mahmood Ali 12a9896a7e
Merge pull request #5568 from hashicorp/b-nomad-logger-restart
Fixes #5566 .

Fix a case where docker logging process may lock up nomad agent restart.

Looks like we have a case where the docker logger is started even though logmon isn't. In such a case, the fifo writer blocks indefinitely, and because the open operation happens in the main goroutine, the nomad agent blocks indefinitely.

This fixes the issue by performing the fifo open operation in a separate goroutine instead of the main goroutine.

We should follow up independently to ensure logmon <-> dockerlogger ordering and consider having task recovery happen in non-main goroutine with some sensible timeouts.
2019-04-16 19:34:37 -04:00
Eduardo Arango 40d0af5422
resolve merge conflicts
Signed-off-by: Eduardo Arango <eduardo@sylabs.io>
2019-04-16 17:01:22 -05:00
Eduardo Arango 6934b98313
address @cgbaker comments
Signed-off-by: Eduardo Arango <eduardo@sylabs.io>
2019-04-16 16:59:59 -05:00
Michael Schurter 3ba39e7c76
Merge pull request #5479 from hashicorp/b-vault-renewal
vault: fix renewal time
2019-04-16 12:20:26 -07:00
Michael Schurter 6421c55384 changelog: add #5479 2019-04-16 11:23:28 -07:00
Michael Schurter a85e7b7cc9 vault: fix data races 2019-04-16 11:22:44 -07:00
Michael Schurter 0aeb3dbd86 vault: fix renewal time
Renewal time was being calculated as 10s+Intn(lease-10s), so the renewal
time could be very rapid or within 1s of the deadline: [10s, lease)

This commit fixes the renewal time by calculating it as:

	(lease/2) +/- 10s

For a lease of 60s this means the renewal will occur in [20s, 40s).
2019-04-16 11:22:44 -07:00
Mahmood Ali 01a13a0947 locking and opening streams in goroutine comment 2019-04-16 11:02:19 -04:00
Mahmood Ali 357b86adc3 open fifo on background goroutine 2019-04-15 21:20:09 -04:00
Michael Schurter f7a7acc345
Merge pull request #5518 from hashicorp/f-simplify-kill
client: simplify kill logic
2019-04-15 14:11:58 -07:00
Michael Schurter 373748a327
Merge pull request #5486 from hashicorp/b-validate-migrate
api: fix migrate stanza initialization
2019-04-15 09:44:59 -07:00
Danielle a34b950a89
Merge pull request #5565 from hashicorp/dani/alloc-restart-docs
docs: Add docs for nomad-alloc-restart
2019-04-15 17:26:28 +02:00
Danielle Lancashire 3aef4343ae docs: Add docs for nomad-alloc-restart 2019-04-15 17:21:06 +02:00
Chris Baker a73d7e797b
Update singularity.html.md 2019-04-15 09:49:30 -04:00
Chris Baker 5b66a00689
Merge pull request #5560 from hashicorp/f-3251-cli-force-periodic
cli: add support for periodic force evaluation
2019-04-15 09:40:35 -04:00
Danielle Lancashire 60d7fc4bf5 Update CHANGELOG
Add `nomad alloc restart` and `nomad status -verbose`
2019-04-15 11:14:51 +02:00
Eduardo Arango c9bae637f2
Merge branch 'website_docs_singularity' of github.com:ArangoGutierrez/nomad into website_docs_singularity 2019-04-12 16:27:33 -05:00
Eduardo Arango 7ada6a2c4c
address requested changes, iteration 1
Signed-off-by: Eduardo Arango <eduardo@sylabs.io>
2019-04-12 16:26:52 -05:00
Chris Baker 3b9237de4a gofmt/goimport and test formatting 2019-04-12 20:55:55 +00:00
Chris Baker eca8a3d537 changes to appease gofmt 2019-04-12 19:12:42 +00:00
Chris Baker 32f02793cf
minor typographical changes 2019-04-12 15:05:56 -04:00
Chris Baker b52d1c9274 cli: add support for periodic force evaluation
resolves #3251
2019-04-12 18:56:35 +00:00
Michael Lange abccee5d98
Merge pull request #5558 from hashicorp/b-ui-make-tests-faster
UI: Make tests faster
2019-04-12 11:41:03 -07:00
Chris Baker 3ed7783c66
Merge pull request #5556 from hashicorp/nmd-1403-vault-namespace-task-env
vault namespaces: inject VAULT_NAMESPACE alongside VAULT_TOKEN
2019-04-12 14:21:47 -04:00
Eduardo Arango 6f234382c4
list singularity as a community driver
Signed-off-by: Eduardo Arango <eduardo@sylabs.io>
2019-04-12 12:59:31 -05:00
Preetha 3261d0b460
Merge pull request #5545 from hashicorp/f-preemption-scheduler-refactor
Refactor scheduler package to enable preemption for batch/service jobs
2019-04-12 12:37:59 -05:00
Chris Baker 87bc0ca0f6
Merge pull request #5557 from hashicorp/nmd-1409-cli-acl-token-list
cli: add `acl token list` command, documentation
2019-04-12 12:56:55 -04:00
Chris Baker 5a43f10aaf cli: add acl token list command, documentation
docs: fix some incorrect acl policy docs (typos, copy-paste errors)
2019-04-12 15:48:36 +00:00
Preetha Appan bcb5c8c70d
remove stray new line 2019-04-12 10:32:48 -05:00
Chris Baker 6848591914 vault namespaces: inject VAULT_NAMESPACE alongside VAULT_TOKEN + documentation 2019-04-12 15:06:34 +00:00
Michael Lange 9358713560 Speed up slow acceptance tests with shallow jobs and generally fewer models 2019-04-11 20:08:43 -07:00
Michael Lange 6988dc1b5c Introduce the concept of 'shallow' job models in Mirage 2019-04-11 20:08:09 -07:00
Michael Lange 243adeb165 Reduce the number of task groups and task events that are made 2019-04-11 18:15:35 -07:00
Nick Fagerlund a25364fca2
Merge pull request #5505 from nfagerlund/mar19_middleman_update
website: Update hashicorp-middleman container to v0.3.39
2019-04-11 16:36:49 -07:00
Michael Schurter 5e8e59eefb api: fix migrate stanza initialization
Fixes Migrate to be initialized like RescheduleStrategy.

Fixes #5477
2019-04-11 15:29:19 -07:00
Lang Martin 77920684c0
Merge pull request #5551 from hashicorp/b-revert-fingerprinter-manual-config
Revert accidental merge of pr #5482
2019-04-11 11:55:21 -04:00
Lang Martin a2a1e7829d Revert accidental merge of pr #5482
Revert "fingerprint Constraints and Affinities have Equals, as set"
This reverts commit 596f16fb5f1a4a6766a57b3311af806d22382609.

Revert "client tests assert the independent handling of interface and speed"
This reverts commit 7857ac5993a578474d0570819f99b7b6e027de40.

Revert "structs missed applying a style change from the review"
This reverts commit 658916e3274efa438beadc2535f47109d0c2f0f2.

Revert "client, structs comments"
This reverts commit be2838d6baa9d382a5013fa80ea016856f28ade2.

Revert "client fingerprint updateNetworks preserves the network configuration"
This reverts commit fc309cb430e62d8e66267a724f006ae9abe1c63c.

Revert "client_test cleanup comments from review"
This reverts commit bc0bf4efb9114e699bc662f50c8f12319b6b3445.

Revert "client Networks Equals is set equality"
This reverts commit f8d432345b54b1953a4a4c719b9269f845e3e573.

Revert "struct cleanup indentation in RequestedDevice Equals"
This reverts commit f4746411cab328215def6508955b160a53452da3.

Revert "struct Equals checks for identity before value checking"
This reverts commit 0767a4665ed30ab8d9586a59a74db75d51fd9226.

Revert "fix client-test, avoid hardwired platform dependecy on lo0"
This reverts commit e89dbb2ab182b6368507dbcd33c3342223eb0ae7.

Revert "refactor error in client fingerprint to include the offending data"
This reverts commit a7fed726c6e0264d42a58410d840adde780a30f5.

Revert "add client updateNodeResources to merge but preserve manual config"
This reverts commit 84bd433c7e1d030193e054ec23474380ff3b9032.

Revert "refactor struts.RequestedDevice to have its own Equals"
This reverts commit 689782524090e51183474516715aa2f34908b8e6.

Revert "refactor structs.Resource.Networks to have its own Equals"
This reverts commit 49e2e6c77bb3eaa4577772b36c62205061c92fa1.

Revert "refactor structs.Resource.Devices to have its own Equals"
This reverts commit 4ede9226bb971ae42cc203560ed0029897aec2c9.

Revert "add COMPAT(0.10): Remove in 0.10 notes to impl for structs.Resources"
This reverts commit 49fbaace5298d5ccf031eb7ebec93906e1d468b5.

Revert "add structs.Resources Equals"
This reverts commit 8528a2a2a6450e4462a1d02741571b5efcb45f0b.

Revert "test that fingerprint resources are updated, net not clobbered"
This reverts commit 8ee02ddd23bafc87b9fce52b60c6026335bb722d.
2019-04-11 10:29:40 -04:00
Chris Baker 14bfd8869b
Merge pull request #5550 from hashicorp/cgbaker/update-terraform
More terraform updates
2019-04-11 10:18:28 -04:00