Commit graph

85 commits

Author SHA1 Message Date
Lang Martin 7d483f93c0
csi: plugins track jobs in addition to allocations, and use job information to set expected counts (#8699)
* nomad/structs/csi: add explicit job support
* nomad/state/state_store: capture job updates directly
* api/nodes: CSIInfo needs the AllocID
* command/agent/csi_endpoint: AllocID was missing
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2020-08-27 17:20:00 -04:00
James Rasell edce647da1
api: add node purge SDK function. 2020-08-14 08:40:03 +01:00
Lang Martin 99841222ed csi: change the API paths to match CLI command layout (#7325)
* command/agent/csi_endpoint: support type filter in volumes & plugins

* command/agent/http: use /v1/volume/csi & /v1/plugin/csi

* api/csi: use /v1/volume/csi & /v1/plugin/csi

* api/nodes: use /v1/volume/csi & /v1/plugin/csi

* api/nodes: not /volumes/csi, just /volumes

* command/agent/csi_endpoint: fix ot parameter parsing
2020-03-23 13:58:30 -04:00
Lang Martin 80619137ab csi: volumes listed in nomad node status (#7318)
* api/allocations: GetTaskGroup finds the taskgroup struct

* command/node_status: display CSI volume names

* nomad/state/state_store: new CSIVolumesByNodeID

* nomad/state/iterator: new SliceIterator type implements memdb.ResultIterator

* nomad/csi_endpoint: deal with a slice of volumes

* nomad/state/state_store: CSIVolumesByNodeID return a SliceIterator

* nomad/structs/csi: CSIVolumeListRequest takes a NodeID

* nomad/csi_endpoint: use the return iterator

* command/agent/csi_endpoint: parse query params for CSIVolumes.List

* api/nodes: new CSIVolumes to list volumes by node

* command/node_status: use the new list endpoint to print volumes

* nomad/state/state_store: error messages consider the operator

* command/node_status: include the Provider
2020-03-23 13:58:30 -04:00
Danielle Lancashire 78b7784f2b api: Include CSI metadata on nodes 2020-03-23 13:58:29 -04:00
James Rasell 8a5acf7fd5
Merge pull request #5970 from jrasell/bug-gh-5506
Fix returned EOF error when calling Nodes GC/GcAlloc API
2020-03-12 10:04:17 +01:00
Michael Schurter ac3db90497 api: fix panic when displaying devices w/o stat
"<none>" mathces `node status -verbose` output
2020-02-26 21:24:31 -05:00
James Rasell e7eb49fe84 api: check response content length before decoding.
The API decodeBody function will now check the content length
before attempting to decode. If the length is zero, and the out
interface is nil then it is safe to assume the API call is not
returning any data to the user. This allows us to better handle
passing nil to API calls in a single place.
2020-02-20 10:07:44 +01:00
Luiz Aoqui 5bd7cdd5c3
api: add StartedAt in Node.DrainStrategy 2019-11-13 17:54:40 -05:00
Chris Raborg 763735d449 Update MonitorDrain comment to indicate channel is closed on errors (#6671)
Fixes #6645
2019-11-11 14:15:17 -05:00
James Rasell 4ee23df7ae Remove trailing dot on drain message to ensure better consistency. (#5956) 2019-11-05 16:53:38 -05:00
Danielle Lancashire 2e5f28029f
remove hidden field from host volumes
We're not shipping support for "hidden" volumes in 0.10 any more, I'll
convert this to an issue+mini RFC for future enhancement.
2019-08-22 08:48:05 +02:00
Danielle Lancashire 112b986736
api: Fix definition of HostVolumeInfo 2019-08-21 22:34:41 +02:00
Danielle Lancashire 6caac09743
api: Add HostVolumeInfo to response parsing 2019-08-12 15:39:09 +02:00
Mahmood Ali 7bdd43f3e0 api: avoid codegen for syncing
Given that the values will rarely change, specially considering that any
changes would be backward incompatible change.  As such, it's simpler to
keep syncing manually in the rare occasion and avoid the syncing code
overhead.
2019-01-18 18:52:31 -05:00
Mahmood Ali 7897dec9d9 api: move formatFloat function
`helpers.FormatFloat` function is only used in `api`.  Moving it and
marking it as private.  We can re-export it if we find value later.
2019-01-18 15:31:31 -05:00
Mahmood Ali 253532ec00 api: avoid import nomad/structs pkg
nomad/structs is an internal package and imports many libraries (e.g.
raft, codec) that are not relevant to api clients, and may cause
unnecessary dependency pain (e.g. `github.com/ugorji/go/codec`
version is very old now).

Here, we add a code generator that imports the relevant constants from
`nomad/structs`.

I considered using this approach for other structs, but didn't find a
quick viable way to reduce duplication.  `nomad/structs` use values as
struct fields (e.g. `string`), while `api` uses value pointer (e.g.
`*string`) instead.  Also, sometimes, `api` structs contain deprecated
fields or additional documentation, so simple copy-paste doesn't work.
For these reasons, I opt to keep the status quo.
2019-01-18 14:51:19 -05:00
Michael Schurter 8a6b1acaa6 drain: fix node drain monitoring
The whole approach to monitoring drains has ordering issues and lacks
state to output useful error messages.

AFAICT to get the tests passing reliably I needed to change the behavior
of monitoring.

Parts of these tests are skipped in CI, and they should be rewritten as
e2e tests.
2019-01-08 09:35:16 -08:00
Mahmood Ali fa9b9028a5 Use max 3 precision in displaying floats
When formating floats in `nomad node status`, use a maximum precision of
3.
2018-12-10 12:18:24 -05:00
Mahmood Ali 14668f48d1 device attributes in nomad node status -verbose
This reports device attributes like the following:

```
$ nomad node status -self -verbose
ID          = f7adb958-29e1-2a5a-2303-9d61ffaab33a
Name        = mars.local
Class       = <none>
DC          = dc1
Drain       = false
Eligibility = eligible
Status      = ready
Uptime      = 12h40m13s

Drivers
Driver       Detected  Healthy  Message                               Time
docker       true      true     healthy                               2018-12-10T11:47:19-05:00
...

Attributes
cpu.arch                      = amd64
cpu.frequency                 = 2200
cpu.modelname                 = Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
cpu.numcores                  = 12
...

Device Group Attributes
Device Group = nomad/file/mock
block_device = sda1
filesystem   = ext4
size         = 63.2 GB

Meta
```
2018-12-10 12:18:24 -05:00
Mahmood Ali 93e8fc53f9 device stats summary in node status
Sample output with a mock device:

```
Host Resource Utilization
CPU             Memory          Disk
2651/26400 MHz  9.6 GiB/16 GiB  98 GiB/234 GiB

Device Resource Utilization
nomad/file/mock[README.md]    511 bytes
nomad/file/mock[e2e.go]       239 bytes
nomad/file/mock[e2e_test.go]  128 bytes

Allocations
No allocations placed
```
2018-11-14 22:13:23 -05:00
Mahmood Ali 7d3e34fab5 Add NodeResource Device types in api package 2018-11-14 14:42:36 -05:00
Mahmood Ali 63acda956c Add Client Device Stats structs in api package 2018-11-14 14:41:19 -05:00
Alex Dadgar c176a83388 rename api TotalShares -> CpuShares 2018-10-16 17:25:55 -07:00
Alex Dadgar a78cefec18 use int64 2018-10-16 15:34:32 -07:00
Preetha Appan 7c0d8c646c
Change CPU/Disk/MemoryMB to int everywhere in new resource structs 2018-10-16 16:21:42 -05:00
Alex Dadgar 5c8697667e Node reserved resources 2018-09-29 18:44:55 -07:00
Alex Dadgar 3183153315 Node resources on client 2018-09-29 17:23:41 -07:00
Alex Dadgar 72effb8632 code review 2018-06-06 14:52:26 -07:00
Alex Dadgar f4fccd7ed2 Monitoring non-draining node exits 2018-06-05 17:58:44 -07:00
Preetha Appan 4f835790d7
Set node eligibility to true when old client calls disable 2018-05-30 16:54:07 -05:00
Nick Ethier 4b64db3a0f
api: emit different monitor message if node's drain strategy is never set 2018-05-24 06:39:09 -04:00
Chelsea Holland Komlo d51611040f Add driver health information to node list stub 2018-05-09 11:21:54 -04:00
James Rasell b7c2ce2991
Update node-drain logging message to clearer for operators.
This change updates the console log message when performing a node
drain and particulary when a node has marked all allocs for
migration. Previously it logged 'drain complete' which was a little
confusing to operators as the node is not drained at this point.

Closes #4183
2018-04-24 07:50:01 +01:00
Michael Schurter 7046db8818 cli: remove unreachable drain message 2018-03-30 14:15:12 -07:00
Michael Schurter f912cd4272 cli: log if a node goes down during draining 2018-03-30 14:02:42 -07:00
Michael Schurter 06874d8b3d drain: fix monitor node index handling 2018-03-30 12:43:53 -07:00
Michael Schurter 7199a2b960 cli: differentiate normal output vs info 2018-03-30 11:42:11 -07:00
Michael Schurter 0260bda046 cli: add color to drain output 2018-03-30 11:15:12 -07:00
Michael Schurter 6e10e0f84e drain: fix cli blocking when allocs already stopped 2018-03-30 10:18:14 -07:00
Michael Schurter ee3eddbac3 drain: block cli until all allocs stop
Before the drain CLI would block until the node was marked as completing
drain operations. While technically correct, it could lead operators (or
more likely: scripts) to shutdown drained nodes before all of its
allocations had *actually* terminated.

This change makes the CLI block until all allocations have terminated
(unless ignoring system jobs).
2018-03-29 10:56:09 -07:00
Alex Dadgar de4b3772f1 Create evals for system jobs when drain is unset
This PR creates evals for system jobs when:

* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Chelsea Holland Komlo 003bc209b9 use time.Time for node events for compatibility 2018-03-27 15:43:57 -04:00
Michael Schurter a854c7bdae docs: improve DrainRequest.MarkEligible comment 2018-03-21 16:55:22 -07:00
Alex Dadgar 92b636dd32 Fix deadline handling 2018-03-21 16:51:44 -07:00
Alex Dadgar 7b2bad8c5e Toggle Drain allows resetting eligibility
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` commmand will mark it
as eligible but the drainer will maintain in-eligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar a37329189a Improve DeadlineTime helper 2018-03-21 16:51:44 -07:00
Alex Dadgar d47c68f764 Add eligibility to node view 2018-03-21 16:51:44 -07:00
Alex Dadgar 8289cc3c6f HTTP and API 2018-03-21 16:51:44 -07:00
Alex Dadgar 010228577e Drain cli, api, http 2018-03-21 16:51:43 -07:00