Commit graph

995 commits

Author SHA1 Message Date
Chris Baker d43e0d10c0 appease the linter and fix an incorrect test 2021-01-08 19:38:25 +00:00
Mahmood Ali 050ad6b6f4
tests: deflake test-api job (#9742)
Deflake test-api job, currently failing at around 7.6% (44 out of 578
workflows), by ensuring that test nomad agent use a small dedicated port
range that doesn't conflict with the kernel ephemeral range.

The failures are disproportionatly related to port allocation, where a
nomad agent fails to start when the http port is already bound to
another process. The failures are intermitent and aren't specific to any
test in particular. The following is a representative failure:
https://app.circleci.com/pipelines/github/hashicorp/nomad/13995/workflows/6cf6eb38-f93c-46f8-8aa0-f61e62fe7694/jobs/128169
.

Upon investigation, the issue seems to be that the api freeport library
picks a port block within 10,000-14,500, but that overlaps with the
kernel ephemeral range 32,769-60,999! So, freeport may allocate a free
port to the nomad agent, just to be used by another process before the
nomad agent starts!

This happened for example in
https://app.circleci.com/pipelines/github/hashicorp/nomad/14111/workflows/e1fcd7ff-f0e0-4796-8719-f57f510b1ffa/jobs/129684
.  `freeport` allocated port 41662 to serf, but `google_accounts`
raced to use it to connect to the CirleCI vm metadata service.

We avoid such races by using a dedicated port range that's disjoint from
the kernel ephemeral port range.
2021-01-06 16:18:28 -05:00
Conor Mongey f54ef3843d
Revert "Headers -> Header"
This reverts commit 71396fa721945e55f51bc90ed02522936450209b.
2021-01-06 17:12:22 +00:00
Conor Mongey fed7e534a2
Only override headers if they're set 2021-01-06 17:12:21 +00:00
Conor Mongey b8ab91da00
Ensure set headers have lower precedence than basic auth headers 2021-01-06 17:12:21 +00:00
Conor Mongey d72c50f2c4
Headers -> Header 2021-01-06 17:12:21 +00:00
Conor Mongey a4f424389b
Prefer http.Header over map[string]string to allow for multi-valued headers 2021-01-06 17:12:20 +00:00
Conor Mongey e5bb65c62e
Allow setting of headers in api client 2021-01-06 17:03:58 +00:00
Seth Hoenig 9ec1af5310 command: remove use of flag impls from consul
In a few places Nomad was using flag implementations directly
from Consul, lending to Nomad's need to import consul. Replace
those uses with helpers already in Nomad, and copy over the bare
minimum needed to make the autopilot flags behave as they have.
2020-12-11 07:58:20 -06:00
Drew Bailey 17de8ebcb1
API: Event stream use full name instead of Eval/Alloc (#9509)
* use full name for events

use evaluation and allocation instead of short name

* update api event stream package and shortnames

* update docs

* make sync; fix typo

* backwards compat not from 1.0.0-beta event stream api changes

* use api types instead of string

* rm backwards compat note that only changed between prereleases

* remove backwards incompat that only existed in prereleases
2020-12-03 11:48:18 -05:00
Drew Bailey a0b7f05a7b
Remove Managed Sinks from Nomad (#9470)
* Remove Managed Sinks from Nomad

Managed Sinks were a beta feature in Nomad 1.0-beta2. During the beta
period it was determined that this was not a scalable approach to
support community and third party sinks.

* update comment

* changelog
2020-11-30 14:00:31 -05:00
Seth Hoenig e81e9223ef consul/connect: enable setting datacenter in connect upstream
Before, upstreams could only be defined using the default datacenter.
Now, the `datacenter` field can be set in a connect upstream definition,
informing consul of the desire for an instance of the upstream service
in the specified datacenter. The field is optional and continues to
default to the local datacenter.

Closes #8964
2020-11-30 10:38:30 -06:00
Tim Gross 4e79ddea45
csi/api: populate ReadAllocs/WriteAllocs fields (#9377)
The API is missing values for `ReadAllocs` and `WriteAllocs` fields, resulting
in allocation claims not being populated in the web UI. These fields mirror
the fields in `nomad/structs.CSIVolume`. Returning a separate list of stubs
for read and write would be ideal, but this can't be done without either
bloating the API response with repeated full `Allocation` data, or causing a
panic in previous versions of the CLI.

The `nomad/structs` fields are persisted with nil values and are populated
during RPC, so we'll do the same in the HTTP API and populate the `ReadAllocs`
and `WriteAllocs` fields with a map of allocation IDs, but with null
values. The web UI will then create its `ReadAllocations` and
`WriteAllocations` fields by mapping from those IDs to the values in
`Allocations`, instead of flattening the map into a list.
2020-11-25 16:44:06 -05:00
Seth Hoenig 3c17dc2ecc api: safely access legacy MBits field 2020-11-23 10:36:10 -06:00
Nick Ethier 3483f540b6 api: don't break public API 2020-11-23 10:36:10 -06:00
Nick Ethier f1ea79f5a8 remove references to default mbits 2020-11-23 10:32:13 -06:00
Chris Baker a659091fd9 api: Event().Stream() should use the index parameter 2020-11-21 16:49:52 +00:00
Seth Hoenig 4cc3c01d5b
Merge pull request #9352 from hashicorp/f-artifact-headers
jobspec: add support for headers in artifact stanza
2020-11-13 14:04:27 -06:00
Seth Hoenig bb8a5816a0 jobspec: add support for headers in artifact stanza
This PR adds the ability to set HTTP headers when downloading
an artifact from an `http` or `https` resource.

The implementation in `go-getter` is such that a new `HTTPGetter`
must be created for each artifact that sets headers (as opposed
to conveniently setting headers per-request). This PR maintains
the memoization of the default Getter objects, creating new ones
only for artifacts where headers are set.

Closes #9306
2020-11-13 12:03:54 -06:00
Jasmine Dahilig d6110cbed4
lifecycle: add poststop hook (#8194) 2020-11-12 08:01:42 -08:00
Chris Baker e3c0ea654d auto-complete for recommendations CLI, plus OSS components of recommendations prefix search 2020-11-11 11:13:43 +00:00
Kris Hicks ecde165d79
api/testutil: Use fast fingerprint timeout in dev mode tests (#9285) 2020-11-06 07:01:27 -08:00
Drew Bailey cb689a8382
Api/event stream payload values (#9277)
* Get concrete types out of dynamic payload

wip

pull out value setting to func

* Add TestEventSTream_SetPayloadValue

Add more assertions

use alias type in unmarshalJSON to handle payload rawmessage

shorten unmarshal and remove anonymous wrap struct

* use map structure and helper functions to return concrete types

* ensure times are properly handled

* update test name

* put all decode logic in a single function

Co-authored-by: Kris Hicks <khicks@hashicorp.com>
2020-11-05 13:04:18 -05:00
Kris Hicks 1da9e7fc67
Add event sink API and CLI commands (#9226)
Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-11-02 09:57:35 -08:00
Chris Baker 719077a26d added new policy capabilities for recommendations API
state store: call-out to generic update of job recommendations from job update method
recommendations API work, and http endpoint errors for OSS
support for scaling polices in task block of job spec
add query filters for ScalingPolicy list endpoint
command: nomad scaling policy list: added -job and -type
2020-10-28 14:32:16 +00:00
Mahmood Ali d3a17b5c82 address review feedback 2020-10-22 11:49:37 -04:00
Mahmood Ali 9c0a93a604 Don't parse the server-set fields of the job struct 2020-10-22 08:18:57 -04:00
Mahmood Ali f52bda4c30 api: update /render api to parse hclv2 2020-10-21 15:46:57 -04:00
Mahmood Ali 618388d1c3 api: parse service gateway name
Adding gateway name eases HCLv2 parsing. This field is only used for parsing the
job and is ignored for any other pruposes
2020-10-21 14:05:46 -04:00
Mahmood Ali 58df967c3a Tag Job spec with HCLv2 tags 2020-10-21 14:05:46 -04:00
Michael Schurter dd09fa1a4a
Merge pull request #9055 from hashicorp/f-9017-resources
api: add field filters to /v1/{allocations,nodes}
2020-10-14 14:49:39 -07:00
Dave May f37e90be18
Metrics gotemplate support, debug bundle features (#9067)
* add goroutine text profiles to nomad operator debug

* add server-id=all to nomad operator debug

* fix bug from changing metrics from string to []byte

* Add function to return MetricsSummary struct, metrics gotemplate support

* fix bug resolving 'server-id=all' when no servers are available

* add url to operator_debug tests

* removed test section which is used for future operator_debug.go changes

* separate metrics from operator, use only structs from go-metrics

* ensure parent directories are created as needed

* add suggested comments for text debug pprof

* move check down to where it is used

* add WaitForFiles helper function to wait for multiple files to exist

* compact metrics check

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>

* fix github's silly apply suggestion

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-10-14 15:16:10 -04:00
Drew Bailey c463479848
filter on additional filter keys, remove switch statement duplication
properly wire up durable event count

move newline responsibility

moves newline creation from NDJson to the http handler, json stream only encodes and sends now

ignore snapshot restore if broker is disabled

enable dev mode to access event steam without acl

use mapping instead of switch

use pointers for config sizes, remove unused ttl, simplify closed conn logic
2020-10-14 14:14:33 -04:00
Michael Schurter 8ccbd92cb6 api: add field filters to /v1/{allocations,nodes}
Fixes #9017

The ?resources=true query parameter includes resources in the object
stub listings. Specifically:

- For `/v1/nodes?resources=true` both the `NodeResources` and
  `ReservedResources` field are included.
- For `/v1/allocations?resources=true` the `AllocatedResources` field is
  included.

The ?task_states=false query parameter removes TaskStates from
/v1/allocations responses. (By default TaskStates are included.)
2020-10-14 10:35:22 -07:00
Drew Bailey df96b89958
Add EvictCallbackFn to handle removing entries from go-memdb when they
are removed from the event buffer.

Wire up event buffer size config, use pointers for structs.Events
instead of copying.
2020-10-14 12:44:42 -04:00
Drew Bailey 315f77a301
rehydrate event publisher on snapshot restore
address pr feedback
2020-10-14 12:44:41 -04:00
Drew Bailey 9a2b1bd63a
api comments 2020-10-14 12:44:38 -04:00
Drew Bailey 400455d302
Events/eval alloc events (#9012)
* generic eval update event

first pass at alloc client update events

* api/event client
2020-10-14 12:44:37 -04:00
Dave May 561637c063
Merge pull request #9034 from hashicorp/dmay-debug-metrics
Add metrics command / output to debug bundle
2020-10-06 11:47:09 -04:00
davemay99 67b4161411 added comment to operator metrics function 2020-10-06 11:22:10 -04:00
davemay99 18aa30c00f metrics return bytes instead of string for more flexibility 2020-10-06 10:49:15 -04:00
davemay99 603cc1776c Add metrics command / output to debug bundle 2020-10-05 22:30:01 -04:00
Chris Baker 5c68f53c24 updated api tests wrt backwards compat on null chars in IDs 2020-10-05 18:01:50 +00:00
Michael Schurter 765473e8b0 jobspec: lower min cpu resources from 10->1
Since CPU resources are usually a soft limit it is desirable to allow
setting it as low as possible to allow tasks to run only in "idle" time.

Setting it to 0 is still not allowed to avoid potential unintentional
side effects with allowing a zero value. While there may not be any side
effects this commit attempts to minimize risk by avoiding the issue.

This does *not* change the defaults.
2020-09-30 12:15:13 -07:00
Luiz Aoqui 88d4eecfd0
add scaling policy type 2020-09-29 17:57:46 -04:00
Mahmood Ali f5700611c0
api: target servers for allocation requests (#8897)
Allocation requests should target servers, which then can forward the
request to the appropriate clients.

Contacting clients directly is fragile and prune to failures: e.g.
clients maybe firewalled and not accessible from the API client, or have
some internal certificates not trusted by the API client.

FWIW, in contexts where we anticipate lots of traffic (e.g. logs, or
exec), the api package attempts contacting the client directly but then
fallsback to using the server. This approach seems excessive in these
simple GET/PUT requests.

Fixes #8894
2020-09-16 09:34:17 -04:00
Benjamin Buzbee 4c91f20c44
Add API support for cancelation contexts passed via QueryOptions and WriteOptions (#8836)
Copy Consul API's format: QueryOptions.WithContext(context) will now return
a new QueryOption whose HTTP requests will be canceled with the context
provided (and similar for WriteOptions)
2020-09-09 16:22:07 -04:00
Jasmine Dahilig 71a694f39c
Merge pull request #8390 from hashicorp/lifecycle-poststart-hook
task lifecycle poststart hook
2020-08-31 13:53:24 -07:00
Tim Gross b77fe023b5
MRD: move 'job stop -global' handling into RPC (#8776)
The initial implementation of global job stop for MRD looped over all the
regions in the CLI for expedience. This changeset includes the OSS parts of
moving this into the RPC layer so that API consumers don't have to implement
this logic themselves.
2020-08-28 14:28:13 -04:00
Lang Martin 7d483f93c0
csi: plugins track jobs in addition to allocations, and use job information to set expected counts (#8699)
* nomad/structs/csi: add explicit job support
* nomad/state/state_store: capture job updates directly
* api/nodes: CSIInfo needs the AllocID
* command/agent/csi_endpoint: AllocID was missing
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2020-08-27 17:20:00 -04:00
Tim Gross 4eba6be8b1
tests: allow API test server to respect NOMAD_TEST_LOG_LEVEL (#8760) 2020-08-27 14:36:17 -04:00
Seth Hoenig c4fa644315 consul/connect: remove envoy dns option from gateway proxy config 2020-08-24 09:11:55 -05:00
Seth Hoenig 5b072029f2 consul/connect: add initial support for ingress gateways
This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza.

```hcl
service {
  connect {
    gateway {
      proxy {
        // envoy proxy configuration
      }
      ingress {
        // ingress-gateway configuration entry
      }
    }
  }
}
```

A gateway can be run in `bridge` or `host` networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value.

Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver.

Aims to address #8294 and tangentially #8647
2020-08-21 16:21:54 -05:00
James Rasell b6a884a7cb
Merge pull request #8636 from hashicorp/f-gh-8142
api: add node purge SDK function.
2020-08-17 09:45:54 +02:00
James Rasell edce647da1
api: add node purge SDK function. 2020-08-14 08:40:03 +01:00
Lang Martin c82b2a2454
CSI: volume and plugin allocations in the API (#8590)
* command/agent/csi_endpoint: explicitly convert to API structs, and convert allocs for single object get endpoints
2020-08-11 12:24:41 -04:00
Tim Gross 443fdaa86b
csi: nomad volume detach command (#8584)
The soundness guarantees of the CSI specification leave a little to be desired
in our ability to provide a 100% reliable automated solution for managing
volumes. This changeset provides a new command to bridge this gap by providing
the operator the ability to intervene.

The command doesn't take an allocation ID so that the operator doesn't have to
keep track of alloc IDs that may have been GC'd. Handle this case in the
unpublish RPC by sending the client RPC for all the terminal/nil allocs on the
selected node.
2020-08-11 10:18:54 -04:00
Seth Hoenig fd4804bf26 consul: able to set pass/fail thresholds on consul service checks
This change adds the ability to set the fields `success_before_passing` and
`failures_before_critical` on Consul service check definitions. This is a
feature added to Consul v1.7.0 and later.
  https://www.consul.io/docs/agent/checks#success-failures-before-passing-critical

Nomad doesn't do much besides pass the fields through to Consul.

Fixes #6913
2020-08-10 14:08:09 -05:00
Drew Bailey bd421b6197
Merge pull request #8453 from hashicorp/oss-multi-vault-ns
oss compoments for multi-vault namespaces
2020-07-27 08:45:22 -04:00
Drew Bailey b296558b8e
oss compoments for multi-vault namespaces
adds in oss components to support enterprise multi-vault namespace feature

upgrade specific doc on vault multi-namespaces

vault docs

update test to reflect new error
2020-07-24 10:14:59 -04:00
James Rasell da91e1d0fc
api: add namespace to scaling status GET response object. 2020-07-24 11:19:25 +02:00
Mahmood Ali ec9a12e54e Fix pro tags 2020-07-17 11:02:00 -04:00
Jasmine Dahilig 9e27231953 add poststart hook to task hook coordinator & structs 2020-07-08 11:01:35 -07:00
Tim Gross b6d9e546aa
fix volume deregister -force params in API (#8380)
The CSI `volume deregister -force` flag was using the documented parameter
`force` everywhere except in the API, where it was incorrectly passing `purge`
as the query parameter.
2020-07-08 10:08:22 -04:00
Chris Baker bdbb066ec8 fixed api tests for changes 2020-07-04 19:23:58 +00:00
Chris Baker 9100b6b7c0 changes to make sure that Max is present and valid, to improve error messages
* made api.Scaling.Max a pointer, so we can detect (and complain) when it is neglected
* added checks to HCL parsing that it is present
* when Scaling.Max is absent/invalid, don't return extraneous error messages during validation
* tweak to multiregion handling to ensure that the count is valid on the interpolated regional jobs

resolves #8355
2020-07-04 19:05:50 +00:00
Lang Martin 6c22cd587d
api: nomad debug new /agent/host (#8325)
* command/agent/host: collect host data, multi platform

* nomad/structs/structs: new HostDataRequest/Response

* client/agent_endpoint: add RPC endpoint

* command/agent/agent_endpoint: add Host

* api/agent: add the Host endpoint

* nomad/client_agent_endpoint: add Agent Host with forwarding

* nomad/client_agent_endpoint: use findClientConn

This changes forwardMonitorClient and forwardProfileClient to use
findClientConn, which was cribbed from the common parts of those
funcs.

* command/debug: call agent hosts

* command/agent/host: eliminate calling external programs
2020-07-02 09:51:25 -04:00
Tim Gross 23be116da0
csi: add -force flag to volume deregister (#8295)
The `nomad volume deregister` command currently returns an error if the volume
has any claims, but in cases where the claims can't be dropped because of
plugin errors, providing a `-force` flag gives the operator an escape hatch.

If the volume has no allocations or if they are all terminal, this flag
deletes the volume from the state store, immediately and implicitly dropping
all claims without further CSI RPCs. Note that this will not also
unmount/detach the volume, which we'll make the responsibility of a separate
`nomad volume detach` command.
2020-07-01 12:17:51 -04:00
Nick Ethier 89118016fc
command: correctly show host IP in ports output /w multi-host networks (#8289) 2020-06-25 15:16:01 -04:00
Seth Hoenig 6c5ab7f45e consul/connect: split connect native flag and task in service 2020-06-23 10:22:22 -05:00
Seth Hoenig 4d71f22a11 consul/connect: add support for running connect native tasks
This PR adds the capability of running Connect Native Tasks on Nomad,
particularly when TLS and ACLs are enabled on Consul.

The `connect` stanza now includes a `native` parameter, which can be
set to the name of task that backs the Connect Native Consul service.

There is a new Client configuration parameter for the `consul` stanza
called `share_ssl`. Like `allow_unauthenticated` the default value is
true, but recommended to be disabled in production environments. When
enabled, the Nomad Client's Consul TLS information is shared with
Connect Native tasks through the normal Consul environment variables.
This does NOT include auth or token information.

If Consul ACLs are enabled, Service Identity Tokens are automatically
and injected into the Connect Native task through the CONSUL_HTTP_TOKEN
environment variable.

Any of the automatically set environment variables can be overridden by
the Connect Native task using the `env` stanza.

Fixes #6083
2020-06-22 14:07:44 -05:00
Michael Schurter 562704124d
Merge pull request #8208 from hashicorp/f-multi-network
multi-interface network support
2020-06-19 15:46:48 -07:00
Nick Ethier a87e91e971
test: fix up testing around host networks 2020-06-19 13:53:31 -04:00
Drew Bailey c2d7b61939
allow raw body instead of JSON encoded string (#8211) 2020-06-19 10:57:09 -04:00
Nick Ethier f0559a8162
multi-interface network support 2020-06-19 09:42:10 -04:00
Tim Gross 8a354f828f
store ACL Accessor ID from Job.Register with Job (#8204)
In multiregion deployments when ACLs are enabled, the deploymentwatcher needs
an appropriately scoped ACL token with the same `submit-job` rights as the
user who submitted it. The token will already be replicated, so store the
accessor ID so that it can be retrieved by the leader.
2020-06-19 07:53:29 -04:00
Mahmood Ali 38a01c050e
Merge pull request #8192 from hashicorp/f-status-allnamespaces-2
CLI Allow querying all namespaces for jobs and allocations - Try 2
2020-06-18 20:16:52 -04:00
Nick Ethier 0bc0403cc3 Task DNS Options (#7661)
Co-Authored-By: Tim Gross <tgross@hashicorp.com>
Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>
2020-06-18 11:01:31 -07:00
Mahmood Ali 7a33a75449 cli: jobs allow querying jobs in all namespaces 2020-06-17 16:31:01 -04:00
Mahmood Ali e784fe331a use '*' to indicate all namespaces
This reverts the introduction of AllNamespaces parameter that was merged
earlier but never got released.
2020-06-17 16:27:43 -04:00
Tim Gross 7b12445f29 multiregion: change AutoRevert to OnFailure 2020-06-17 11:05:45 -04:00
Tim Gross b09b7a2475 Multiregion job registration
Integration points for multiregion jobs to be registered in the enterprise
version of Nomad:
* hook in `Job.Register` for enterprise to send job to peer regions
* remove monitoring from `nomad job run` and `nomad job stop` for multiregion jobs
2020-06-17 11:04:58 -04:00
Tim Gross b93efc16d5 multiregion CLI: nomad deployment unblock 2020-06-17 11:03:44 -04:00
Drew Bailey 9263fcb0d3 Multiregion deploy status and job status CLI 2020-06-17 11:03:34 -04:00
Tim Gross 6851024925 Multiregion structs
Initial struct definitions, jobspec parsing, validation, and conversion
between Nomad structs and API structs for multi-region deployments.
2020-06-17 11:00:14 -04:00
Chris Baker 9fc66bc1aa support in API client and Job.Register RPC for PreserveCounts 2020-06-16 18:45:28 +00:00
Chris Baker 377f881fbd removed api.RegisterJobRequest in favor of api.JobRegisterRequest
modified `job inspect` and `job run -output` to use anonymous struct to keep previous behavior
2020-06-16 18:45:17 +00:00
Chris Baker b22b2bf968 wip: developmental test to preserve existing task group counts during job update 2020-06-16 18:45:17 +00:00
Chris Baker aeb3ed449e wip: added .PreviousCount to api.ScalingEvent and structs.ScalingEvent, with developmental tests 2020-06-15 19:40:21 +00:00
Mahmood Ali 63e048e972 clarify ccomments, esp related to leadership code 2020-06-09 12:01:31 -04:00
Mahmood Ali 5cf04b5762 api: add snapshot restore 2020-06-07 15:47:07 -04:00
Mahmood Ali 3bb2e4a447
Merge pull request #8083 from hashicorp/test-deflake-20200531
More Test deflaking - 2020-05-31 edition
2020-06-01 09:28:45 -04:00
Mahmood Ali de44d9641b
Merge pull request #8047 from hashicorp/f-snapshot-save
API for atomic snapshot backups
2020-06-01 07:55:16 -04:00
Mahmood Ali a73cd01a00
Merge pull request #8001 from hashicorp/f-jobs-list-across-nses
endpoint to expose all jobs across all namespaces
2020-05-31 21:28:03 -04:00
Mahmood Ali b102ee164b tests: terminate agent gracefully
Ensure that api test agent is terminated gracefully. This is desired for
two purposes:

First, to ensure that the logs are fully flished out.  If the agent is
killed mid log line and go test doesn't emit a new line before `---
PASS:` indicator, the test may be marked as failed, even if it passed.
Sample failure is https://circleci.com/gh/hashicorp/nomad/72360 .

Second, ensure that the agent terminates any auxiliary processes (e.g.
logmon, tasks).
2020-05-31 10:35:37 -04:00
Drew Bailey 5948c4f497
Revert "disable license cli commands" 2020-05-26 12:39:39 -04:00
Mahmood Ali b548f603dc Add api/ package function to save snapshot 2020-05-21 20:04:38 -04:00
Seth Hoenig f136afc04f api: canonicalize connect components
Add `Canonicalize` methods to the connect components of a service
definition in the `api` package. Without these, we have been relying
on good input for the connect stanza.

Fixes #7993
2020-05-19 11:47:22 -06:00
Mahmood Ali 63b87a9a89 update api/ JobListStub 2020-05-19 09:58:19 -04:00
Mahmood Ali 5ab2d52e27 endpoint to expose all jobs across all namespaces
Allow a `/v1/jobs?all_namespaces=true` to list all jobs across all
namespaces.  The returned list is to contain a `Namespace` field
indicating the job namespace.

If ACL is enabled, the request token needs to be a management token or
have `namespace:list-jobs` capability on all existing namespaces.
2020-05-18 13:50:46 -04:00
James Rasell 01236e842b
api: tidy Go module to remove unused modules. 2020-05-18 09:56:23 +02:00
Tim Gross 2082cf738a
csi: support for VolumeContext and VolumeParameters (#7957)
The MVP for CSI in the 0.11.0 release of Nomad did not include support
for opaque volume parameters or volume context. This changeset adds
support for both.

This also moves args for ControllerValidateCapabilities into a struct.
The CSI plugin `ControllerValidateCapabilities` struct that we turn
into a CSI RPC is accumulating arguments, so moving it into a request
struct will reduce the churn of this internal API, make the plugin
code more readable, and make this method consistent with the other
plugin methods in that package.
2020-05-15 08:16:01 -04:00
Chris Baker 11d8fb4b16 the api.ScalingEvent struct was missing the .Count field 2020-05-13 20:44:53 +00:00
Lang Martin d3c4700cd3
server: stop after client disconnect (#7939)
* jobspec, api: add stop_after_client_disconnect

* nomad/state/state_store: error message typo

* structs: alloc methods to support stop_after_client_disconnect

1. a global AllocStates to track status changes with timestamps. We
   need this to track the time at which the alloc became lost
   originally.

2. ShouldClientStop() and WaitClientStop() to actually do the math

* scheduler/reconcile_util: delayByStopAfterClientDisconnect

* scheduler/reconcile: use delayByStopAfterClientDisconnect

* scheduler/util: updateNonTerminalAllocsToLost comments

This was setup to only update allocs to lost if the DesiredStatus had
already been set by the scheduler. It seems like the intention was to
update the status from any non-terminal state, and not all lost allocs
have been marked stop or evict by now

* scheduler/testing: AssertEvalStatus just use require

* scheduler/generic_sched: don't create a blocked eval if delayed

* scheduler/generic_sched_test: several scheduling cases
2020-05-13 16:39:04 -04:00
Mahmood Ali 3b4116e0db
Merge pull request #7894 from hashicorp/b-cronexpr-dst-fix
Fix Daylight saving transition handling
2020-05-12 16:36:11 -04:00
Mahmood Ali 326793939e vendor: use tagged cronexpr, v1.1.0
Also, update to the version with modification notice
2020-05-12 16:20:00 -04:00
Chris Baker fce0e81d8c
Merge pull request #7928 from hashicorp/license-drop-tags
remove tags from api struct
2020-05-12 08:40:02 -05:00
Tim Gross 4374c1a837
csi: support Secrets parameter in CSI RPCs (#7923)
CSI plugins can require credentials for some publishing and
unpublishing workflow RPCs. Secrets are configured at the time of
volume registration, stored in the volume struct, and then passed
around as an opaque map by Nomad to the plugins.
2020-05-11 17:12:51 -04:00
Drew Bailey 64b58775ab
remove tags from api struct 2020-05-11 16:38:35 -04:00
Drew Bailey 466e8d5043
disable license cli commands 2020-05-11 13:49:29 -04:00
Mahmood Ali 57435950d7 Update current DST and some code style issues 2020-05-07 19:27:05 -04:00
Mahmood Ali c8fb132956 Update cronexpr to point to hashicorp/cronexpr 2020-05-07 17:50:45 -04:00
Michael Lange 1a9631dbfa Add ControllersExpected to the PluginListStub 2020-05-07 10:01:52 -07:00
Drew Bailey 48c451709e
update license command output to reflect api changes 2020-05-05 10:28:58 -04:00
Mahmood Ali b9e3cde865 tests and some clean up 2020-05-01 13:13:30 -04:00
Charlie Voiselle 663fb677cf Add SchedulerAlgorithm to SchedulerConfig 2020-05-01 13:13:29 -04:00
Drew Bailey acacecc67b
add license reset command to commands
help text formatting

remove reset

no signed option
2020-04-30 14:46:20 -04:00
Drew Bailey 74abe6ef48
license cli commands
cli changes, formatting
2020-04-30 14:46:17 -04:00
Yoan Blanc 417c2995c9
api: fix some documentation typos
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-04-27 10:25:29 +02:00
Yoan Blanc 790df29996
api: testify v1.5.1
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-04-11 13:55:10 +02:00
Tim Gross 966286fee5
fix encoding/decoding tags for api.Task (#7620)
When `nomad job inspect` encodes the response, if the decoded JSON
from the API doesn't exactly match the API struct, the field value
will be omitted even if it has a value. We only want the JSON struct
tag to `omitempty`.
2020-04-03 16:45:49 -04:00
Chris Baker 8ec252e627 added indices to the job scaling events, so we could properly do
blocking queries on the job scaling status
2020-04-01 17:28:19 +00:00
Chris Baker b2ab42afbb scaling api: more testing around the scaling events api 2020-04-01 16:39:23 +00:00
Chris Baker 40d6b3bbd1 adding raft and state_store support to track job scaling events
updated ScalingEvent API to record "message string,error bool" instead
of confusing "reason,error *string"
2020-04-01 16:15:14 +00:00
Seth Hoenig 14c7cebdea connect: enable automatic expose paths for individual group service checks
Part of #6120

Building on the support for enabling connect proxy paths in #7323, this change
adds the ability to configure the 'service.check.expose' flag on group-level
service check definitions for services that are connect-enabled. This is a slight
deviation from the "magic" that Consul provides. With Consul, the 'expose' flag
exists on the connect.proxy stanza, which will then auto-generate expose paths
for every HTTP and gRPC service check associated with that connect-enabled
service.

A first attempt at providing similar magic for Nomad's Consul Connect integration
followed that pattern exactly, as seen in #7396. However, on reviewing the PR
we realized having the `expose` flag on the proxy stanza inseperably ties together
the automatic path generation with every HTTP/gRPC defined on the service. This
makes sense in Consul's context, because a service definition is reasonably
associated with a single "task". With Nomad's group level service definitions
however, there is a reasonable expectation that a service definition is more
abstractly representative of multiple services within the task group. In this
case, one would want to define checks of that service which concretely make HTTP
or gRPC requests to different underlying tasks. Such a model is not possible
with the course `proxy.expose` flag.

Instead, we now have the flag made available within the check definitions themselves.
By making the expose feature resolute to each check, it is possible to have
some HTTP/gRPC checks which make use of the envoy exposed paths, as well as
some HTTP/gRPC checks which make use of some orthongonal port-mapping to do
checks on some other task (or even some other bound port of the same task)
within the task group.

Given this example,

group "server-group" {
  network {
    mode = "bridge"
    port "forchecks" {
      to = -1
    }
  }

  service {
    name = "myserver"
    port = 2000

    connect {
      sidecar_service {
      }
    }

    check {
      name     = "mycheck-myserver"
      type     = "http"
      port     = "forchecks"
      interval = "3s"
      timeout  = "2s"
      method   = "GET"
      path     = "/classic/responder/health"
      expose   = true
    }
  }
}

Nomad will automatically inject (via job endpoint mutator) the
extrapolated expose path configuration, i.e.

expose {
  path {
    path            = "/classic/responder/health"
    protocol        = "http"
    local_path_port = 2000
    listener_port   = "forchecks"
  }
}

Documentation is coming in #7440 (needs updating, doing next)

Modifications to the `countdash` examples in https://github.com/hashicorp/demo-consul-101/pull/6
which will make the examples in the documentation actually runnable.

Will add some e2e tests based on the above when it becomes available.
2020-03-31 17:15:50 -06:00
Seth Hoenig 41244c5857 jobspec: parse multi expose.path instead of explicit slice 2020-03-31 17:15:27 -06:00
Seth Hoenig 0266f056b8 connect: enable proxy.passthrough configuration
Enable configuration of HTTP and gRPC endpoints which should be exposed by
the Connect sidecar proxy. This changeset is the first "non-magical" pass
that lays the groundwork for enabling Consul service checks for tasks
running in a network namespace because they are Connect-enabled. The changes
here provide for full configuration of the

  connect {
    sidecar_service {
      proxy {
        expose {
          paths = [{
		path = <exposed endpoint>
                protocol = <http or grpc>
                local_path_port = <local endpoint port>
                listener_port = <inbound mesh port>
	  }, ... ]
       }
    }
  }

stanza. Everything from `expose` and below is new, and partially implements
the precedent set by Consul:
  https://www.consul.io/docs/connect/registration/service-registration.html#expose-paths-configuration-reference

Combined with a task-group level network port-mapping in the form:

  port "exposeExample" { to = -1 }

it is now possible to "punch a hole" through the network namespace
to a specific HTTP or gRPC path, with the anticipated use case of creating
Consul checks on Connect enabled services.

A future PR may introduce more automagic behavior, where we can do things like

1) auto-fill the 'expose.path.local_path_port' with the default value of the
   'service.port' value for task-group level connect-enabled services.

2) automatically generate a port-mapping

3) enable an 'expose.checks' flag which automatically creates exposed endpoints
   for every compatible consul service check (http/grpc checks on connect
   enabled services).
2020-03-31 17:15:27 -06:00
Tim Gross f849d2cb5e
api: prevent panic if volume has nil allocs (#7486) 2020-03-25 09:45:51 -04:00
Mahmood Ali ceed57b48f per-task restart policy 2020-03-24 17:00:41 -04:00
Chris Baker 5979d6a81e more testing for ScalingPolicy, mainly around parsing and canonicalization for Min/Max 2020-03-24 19:43:50 +00:00
Chris Baker aa5beafe64 Job.Scale should not result in job update or eval create if args.Count == nil
plus tests
2020-03-24 17:36:06 +00:00
Chris Baker bc13bfb433 bad conversion between api.ScalingPolicy and structs.ScalingPolicy meant
that we were throwing away .Min if provided
2020-03-24 14:39:06 +00:00
Chris Baker ab4d174319 added new int64ToPtr method to api/util to avoid pulling in other packages 2020-03-24 14:39:05 +00:00
Chris Baker f6ec5f9624 made count optional during job scaling actions
added ACL protection in Job.Scale
in Job.Scale, only perform a Job.Register if the Count was non-nil
2020-03-24 14:39:05 +00:00
Chris Baker 233db5258a changes to Canonicalize, Validate, and api->struct conversion so that tg.Count, tg.Scaling.Min/Max are well-defined with reasonable defaults.
- tg.Count defaults to tg.Scaling.Min if present (falls back on previous default of 1 if Scaling is absent)
- Validate() enforces tg.Scaling.Min <= tg.Count <= tg.Scaling.Max

modification in ApiScalingPolicyToStructs, api.TaskGroup.Validate so that defaults are handled for TaskGroup.Count and
2020-03-24 13:57:17 +00:00
Chris Baker 5373d503c7 scaling api: put api.* objects in agreement with structs.* objects 2020-03-24 13:57:16 +00:00
Chris Baker 00092a6c29 fixed http endpoints for job.register and job.scalestatus 2020-03-24 13:57:16 +00:00
Chris Baker 925b59e1d2 wip: scaling status return, almost done 2020-03-24 13:57:15 +00:00
Chris Baker 42270d862c wip: some tests still failing
updating job scaling endpoints to match RFC, cleaning up the API object as well
2020-03-24 13:57:14 +00:00
Chris Baker abc7a52f56 finished refactoring state store, schema, etc 2020-03-24 13:57:14 +00:00
Luiz Aoqui 47d35489d6 wip: use testify in job scaling tests 2020-03-24 13:57:13 +00:00
Luiz Aoqui d4b6e4b258 wip: add tests for job scale method 2020-03-24 13:57:12 +00:00
Luiz Aoqui c74c01a643 wip: add scaling policies methods to the client 2020-03-24 13:57:12 +00:00
Chris Baker 3d54f1feba wip: added Enabled to ScalingPolicyListStub, removed JobID from body of scaling request 2020-03-24 13:57:12 +00:00
Chris Baker 84953a1ed7 wip: remove PolicyOverride from scaling request 2020-03-24 13:57:11 +00:00
Chris Baker 024d203267 wip: added tests for client methods around group scaling 2020-03-24 13:57:11 +00:00
Chris Baker 1c5c2eb71b wip: add GET endpoint for job group scaling target 2020-03-24 13:57:10 +00:00
Luiz Aoqui 6699d5e536 wip: add job scale endpoint in client 2020-03-24 13:57:10 +00:00
Chris Baker 8453e667c2 wip: working on job group scaling endpoint 2020-03-24 13:55:20 +00:00
Chris Baker 65d92f1fbf WIP: adding ScalingPolicy to api/structs and state store 2020-03-24 13:55:18 +00:00