Commit Graph

3010 Commits

Author SHA1 Message Date
Tim Gross 75878f978e HTTP API support for 'nomad ui -login'
Endpoints for requesting and exchanging one-time tokens via the HTTP
API. Includes documentation updates.
2021-03-10 08:17:56 -05:00
James Rasell 782350bd19
agent: return req error if prometheus metrics are disabled.
If the user has disabled Prometheus metrics and a request is
sent to the metrics endpoint requesting Prometheus formatted
metrics, then the request should fail.
2021-03-09 15:28:58 +01:00
Dave May e93b49a119
debug: update defaults to commonly used values 2021-03-09 08:31:38 -05:00
Andre Ilhicas f45fc6c899
consul/connect: enable setting local_bind_address in upstream 2021-02-26 11:37:31 +00:00
Dave May cd506cb887
Handle Consul API URL protocol mismatch (#10082) 2021-02-25 08:22:44 -05:00
Drew Bailey 86d9e1ff90
Merge pull request #9955 from hashicorp/on-update-services
Service and Check on_update configuration option (readiness checks)
2021-02-24 10:11:05 -05:00
Kris Hicks 12069efc3e
cli: Update namespace helptext guide URL (#10074)
The old URL redirects to https://learn.hashicorp.com/nomad
2021-02-23 08:31:19 -08:00
Seth Hoenig d557d6bf94 consul/connect: Fix bug where connect sidecar services would be unnecessarily re-registered
This PR fixes a bug where sidecar services would be re-registered into Consul every ~30
seconds, caused by the parent service having its tags field set and the sidecar_service
tags unset. Nomad would directly compare the tags between its copy of the sidecar service
definition and the tags of the sidecar service reported by Consul. This does not work,
because Consul will under-the-hood set the sidecar service tags to inherit the parent
service tags if the sidecar service tags are unset. The comparison then done by Nomad
would not match, if the parent sidecar tags are set.

Fixes #10025
2021-02-22 12:02:58 -06:00
Karan Sharma e853345e8e
feat: Add -task to alloc logs command (#10026) 2021-02-16 11:21:06 -05:00
AndrewChubatiuk 157f8de903 pr comments 2021-02-13 02:42:14 +02:00
AndrewChubatiuk cb84d8ef11 fixed typo 2021-02-13 02:42:14 +02:00
AndrewChubatiuk 4f49614760 fixed typo 2021-02-13 02:42:14 +02:00
AndrewChubatiuk c4bece2b09 fixed typo 2021-02-13 02:42:14 +02:00
AndrewChubatiuk cb527c5d83 fixed tests 2021-02-13 02:42:14 +02:00
AndrewChubatiuk abb7fd1e36 fixed connect port label 2021-02-13 02:42:14 +02:00
AndrewChubatiuk 3e11860b6d fixed connect port label 2021-02-13 02:42:14 +02:00
AndrewChubatiuk cd152643fb fixed connect port label 2021-02-13 02:42:14 +02:00
AndrewChubatiuk af7a16d774 tests fix 2021-02-13 02:42:14 +02:00
AndrewChubatiuk 3d0aa2ef56 allocate sidecar task port on host_network interface 2021-02-13 02:42:13 +02:00
AndrewChubatiuk 47c690479c fixed tests 2021-02-13 02:42:13 +02:00
AndrewChubatiuk 46c639c5d6 fixed tests 2021-02-13 02:42:13 +02:00
AndrewChubatiuk 77f9b84a30 fixed tests 2021-02-13 02:42:13 +02:00
AndrewChubatiuk 99201412da removed proxy suffix 2021-02-13 02:42:13 +02:00
AndrewChubatiuk 844ac16900 fixed variable initialization 2021-02-13 02:42:13 +02:00
AndrewChubatiuk 78465bbd23 customized default sidecar checks 2021-02-13 02:42:13 +02:00
Buck Doyle c22d1114d8
Add handling for license requests in OSS (#9963)
This changes the license-fetching endpoint to respond with 204 in
OSS instead of 501. It closes #9827.
2021-02-08 12:53:06 -06:00
Drew Bailey 82f971f289
OnUpdate configuration for services and checks
Allow for readiness type checks by configuring nomad to ignore warnings
or errors reported by a service check. This allows the deployment to
progress and while Consul handles introducing the sercive into a
resource pool once the check passes.
2021-02-08 08:32:40 -05:00
Seth Hoenig 4bc6e5a215 drivers/exec+java: Add configuration to restore previous PID/IPC namespace behavior.
This PR adds default_pid_mode and default_ipc_mode options to the exec and java
task drivers. By default these will default to "private" mode, enabling PID and
IPC isolation for tasks. Setting them to "host" mode disables isolation. Doing
so is not recommended, but may be necessary to support legacy job configurations.

Closes #9969
2021-02-05 15:52:11 -06:00
Chris Baker 84f1902296 scaling policy -verbose flag, plus testing and other recommendations from review 2021-02-04 15:08:13 +00:00
Chris Baker 452b738db1 updated "scaling policy info" with prefix search and auto-complete 2021-02-03 21:29:44 +00:00
Chris Baker 02bef2df0a bad boolean logic for List-on-Info commands 2021-02-03 21:29:05 +00:00
Mahmood Ali 8dedef31d6 tests: update test with a real invalid hcl2 2021-02-01 11:12:22 -05:00
Chris Baker ce68ee164b Version 1.0.3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJgEuOKAAoJEFGFLYc0j/xMxF8H/3TTU6Tu+Xm0YvcsDaYDphZ/
 X7KQBV0aFiuL5VkTw4PzKEsgryIy9/sqEPyxxyKRowAmos9qhiusjNAIfqdP4TF8
 tdZmTedkfWir9uPD+hyv/LXpwbQ2T8kTwS3xHTYvaOmaCxZr710FEn+imnMk1AUn
 Xs5itkd/CYGr0nBLm+I5GutWSDPmL7Uw8J5Z30fFyoaxoCPAbCWQQNk793SCRUc5
 f/uo18V2tFInmQ+3sAdnM4gPewyStK/a5VvzWavL9fVDtYK83wlqWSchTXY5jpVz
 zNEzt/rYhbBzakPQQKb5zieblh2iGI8aHWpD5w4WduqO2Sg6B/5lAeNZIlW0UJg=
 =2g3c
 -----END PGP SIGNATURE-----

Merge tag 'v1.0.3' into post-release-1.0.3

Version 1.0.3
2021-01-29 19:30:08 +00:00
Seth Hoenig 8b05efcf88 consul/connect: Add support for Connect terminating gateways
This PR implements Nomad built-in support for running Consul Connect
terminating gateways. Such a gateway can be used by services running
inside the service mesh to access "legacy" services running outside
the service mesh while still making use of Consul's service identity
based networking and ACL policies.

https://www.consul.io/docs/connect/gateways/terminating-gateway

These gateways are declared as part of a task group level service
definition within the connect stanza.

service {
  connect {
    gateway {
      proxy {
        // envoy proxy configuration
      }
      terminating {
        // terminating-gateway configuration entry
      }
    }
  }
}

Currently Envoy is the only supported gateway implementation in
Consul. The gateay task can be customized by configuring the
connect.sidecar_task block.

When the gateway.terminating field is set, Nomad will write/update
the Configuration Entry into Consul on job submission. Because CEs
are global in scope and there may be more than one Nomad cluster
communicating with Consul, there is an assumption that any terminating
gateway defined in Nomad for a particular service will be the same
among Nomad clusters.

Gateways require Consul 1.8.0+, checked by a node constraint.

Closes #9445
2021-01-25 10:36:04 -06:00
Drew Bailey 630babb886
prevent double job status update (#9768)
* Prevent Job Statuses from being calculated twice

https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval
insertion iwth job (de-)registration. This change removes a now obsolete
guard which checked if the index was equal to the job.CreateIndex, which
would empty the status. Now that the job regisration eval insetion is
atomic with the registration this check is no longer necessary to set
the job statuses correctly.

* test to ensure only single job event for job register

* periodic e2e

* separate job update summary step

* fix updatejobstability to use copy instead of modified reference of job

* update envoygatewaybindaddresses copy to prevent job diff on null vs empty

* set ConsulGatewayBindAddress to empty map instead of nil

fix nil assertions for empty map

rm unnecessary guard
2021-01-22 09:18:17 -05:00
Dennis Schön 3eaf1432aa
validate connect block allowed only within group.service 2021-01-20 14:34:23 -05:00
Dennis Schön c376545f49
don't prefix json logging 2021-01-20 09:09:31 -05:00
Kris Hicks d71a90c8a4
Fix some errcheck errors (#9811)
* Throw away result of multierror.Append

When given a *multierror.Error, it is mutated, therefore the return
value is not needed.

* Simplify MergeMultierrorWarnings, use StringBuilder

* Hash.Write() never returns an error

* Remove error that was always nil

* Remove error from Resources.Add signature

When this was originally written it could return an error, but that was
refactored away, and callers of it as of today never handle the error.

* Throw away results of io.Copy during Bridge

* Handle errors when computing node class in test
2021-01-14 12:46:35 -08:00
Nomad Release bot f36d983863 Generate files for 1.0.2 release 2021-01-13 16:52:51 +00:00
Dave May 35d43c19ab
nomad agent-info: Add json/gotemplate formatting (#9788)
* nomad agent-info: Add json/gotemplate formatting
* Add CHANGELOG entry
* update docs
2021-01-13 09:42:46 -05:00
Seth Hoenig 36da162619 command: generate bindata assetfs 2021-01-10 17:09:08 -06:00
Seth Hoenig 456868c166 command: bump connect examples to v3
Nomad v1.0+ combined with Consul 1.9+ support launching Envoy v1.16+
which is the first version of envoy to support arm64 platforms out
of the box.

By rebuilding our example docker containers for connect to be multiplatform
between amd64 and arm64, Nomad can provide a nicer user experience for
those trying out Connect on arm64 machines (e.g. AWS Graviton instances
or Raspberry Pi 4's).

This has been done for the countdash examples at v3.

https://hub.docker.com/layers/hashicorpnomad/counter-dashboard/v3/images/sha256-94e323587bc372ba1b6ca5c58dc23e291e9d26787b50e71025f1c8967dfbcd07?context=repo
https://hub.docker.com/layers/hashicorpnomad/counter-api/v3/images/sha256-16a9e9e08082985a635c9edd0f258b084153c6c7831a9b41d34bde78c308b65c?context=repo

The connect-native examples are now also multiplatform at v5, but we
don't have them built into `job init`.
2021-01-10 16:54:31 -06:00
Nick Ethier 6705f845f2
Merge pull request #9739 from hashicorp/b-alloc-netmode-ports
Use port's to value when building service address under 'alloc' addr_mode
2021-01-07 09:16:27 -05:00
Nick Ethier bb060bd46b command/agent/consul: remove duplicated tests 2021-01-06 14:11:31 -05:00
Kris Hicks 868ba0cea5
consul: Refactor parts of UpdateWorkload (#9737)
This removes modification of ops in methods that UpdateWorkload calls, keeping
them local to UpdateWorkload. It also includes some rewrites of checkRegs for
clarity.
2021-01-06 11:11:28 -08:00
Nick Ethier ab01e19df3 command/agent/consul: use port's to value when building service address under 'alloc' addr_mode 2021-01-06 13:52:48 -05:00
Kris Hicks be6580bd28
Refactor alloc exec command (#9718)
This re-arranges the alloc exec Run implementation to have validation
hoisted as high as possible.
2021-01-05 09:33:04 -08:00
Seth Hoenig 6c9366986b consul/connect: avoid NPE from unset connect gateway proxy
Submitting a job with an ingress gateway in host networking mode
with an absent gateway.proxy block would cause the Nomad client
to panic on NPE.

The consul registration bits would assume the proxy stanza was
not nil, but it could be if the user does not supply any manually
configured envoy proxy settings.

Check the proxy field is not nil before using it.

Fixes #9669
2021-01-05 09:27:01 -06:00
Dave May 0dd2d8944f
Debug test refactor (#9637)
* debug: refactor test cases
* debug: remove unnecessary syncbuffer resets
* debug: cleaned up test code per suggestions
* debug: clarify note on parallel testing
2020-12-15 13:51:41 -05:00
Seth Hoenig f2cafa07f5
Merge pull request #9633 from hashicorp/b-undo-429-connlimit
agent: revert use of http connlimit
2020-12-15 11:02:14 -06:00
Seth Hoenig 40d36fc0ec agent: revert use of http connlimit
https://github.com/hashicorp/nomad/pull/9608 introduced the use of the
built-in HTTP 429 response handler provided by go-connlimit. There is
concern though around plausible DOS attacks that need to be addressed,
so this PR reverts that functionality.

It keeps a fix in the tests around the use of an HTTPS enabled client
for when the server is listening on HTTPS. Previously, the tests would
fail deterministically with io.EOF because that's how the TLS server
terminates invalid connections.

Now, the result is much less deterministic. The state of the client
connection and the server socket depends on when the connection is
closed and how far along the handshake was.
2020-12-14 14:40:14 -06:00
Dave May 5f50c1d0c1
debug: Fix node count bug from GH-9566 (#9625)
* debug: update test to identify bug in GH-9566
* debug: range tests need fresh cmd each iteration
* debug: fix node count bug in GH-9566
2020-12-14 15:02:48 -05:00
Seth Hoenig 288c838843
Merge pull request #9620 from hashicorp/f-remove-consul-flags
command: remove use of flag impls from consul
2020-12-14 10:33:35 -06:00
Seth Hoenig 0091325721 command: give flag-helpers a better name 2020-12-14 10:07:27 -06:00
Mahmood Ali 3d8fa799c7 actually pass env-vars to hcl2 parsing 2020-12-11 13:32:54 -05:00
Seth Hoenig 9ec1af5310 command: remove use of flag impls from consul
In a few places Nomad was using flag implementations directly
from Consul, lending to Nomad's need to import consul. Replace
those uses with helpers already in Nomad, and copy over the bare
minimum needed to make the autopilot flags behave as they have.
2020-12-11 07:58:20 -06:00
Seth Hoenig d3f1c3adcf
Merge pull request #9608 from hashicorp/f-go-connlimit
Use go-connlimit to ratelimit with 429 responses
2020-12-10 11:05:07 -06:00
Seth Hoenig a28cd45988 client: fix plumbing of testing object into helper 2020-12-10 11:04:38 -06:00
Kris Hicks 0cf9cae656
Apply some suggested fixes from staticcheck (#9598) 2020-12-10 07:29:18 -08:00
Seth Hoenig 2cc5787f97 client: fix https test cases in client rate limits 2020-12-10 09:20:28 -06:00
Ben Buzbee a8e4aa76c6 Use new go-connlimit with HTTP 429 response
This is essentially a port of Consul's similar fix
Changes are:
go get -u github.com/hashicorp/go-connlimit
go mod vendor
Use new HTTP429 handler

20d1ea7d2d
2020-12-09 17:57:16 -06:00
Mahmood Ali 9a3da14a57
actually honor var files (#9592)
Apparently, we missed passing VarFile argument, so var files were
ignored.

Fixes #9588
2020-12-09 14:12:48 -05:00
Kris Hicks 0a3a748053
Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
Kris Hicks 93155ba3da
Add gocritic to golangci-lint config (#9556) 2020-12-08 12:47:04 -08:00
Dennis Schön 038f2cce57
return 405 on non-GET requests to /v1/event/stream (fixes #9526) (#9564) 2020-12-08 13:09:20 -05:00
Dave May be0a14d70b
fix AgentHostRequest panic found in GH-9546 (#9554)
* debug: refactor nodeclass test
* debug: add case to track down SIGSEGV on client to server Agent.Host RPC
* verify server to avoid panic on AgentHostRequest RPC call, fixes GH-9546
* simplify Agent.Host RPC lookup logic
2020-12-07 17:34:40 -05:00
Dennis Schön a9c97d9257
use os.ErrDeadlineExceeded in tests 2020-12-07 10:40:28 -05:00
Dave May e045bd3a5e
nomad operator debug - add pprof duration / csi details (#9346)
* debug: add pprof duration CLI argument
* debug: add CSI plugin details
* update help text with ACL requirements
* debug: provide ACL hints upon permission failures
* debug: only write file when pprof retrieve is successful
* debug: add helper function to clean bad characters from dynamic filenames
* debug: ensure files are unable to escape the capture directory
2020-12-01 12:36:05 -05:00
Drew Bailey 9adca240f8
Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447)
* upsertaclpolicies

* delete acl policies msgtype

* upsert acl policies msgtype

* delete acl tokens msgtype

* acl bootstrap msgtype

wip unsubscribe on token delete

test that subscriptions are closed after an ACL token has been deleted

Start writing policyupdated test

* update test to use before/after policy

* add SubscribeWithACLCheck to run acl checks on subscribe

* update rpc endpoint to use broker acl check

* Add and use subscriptions.closeSubscriptionFunc

This fixes the issue of not being able to defer unlocking the mutex on
the event broker in the for loop.

handle acl policy updates

* rpc endpoint test for terminating acl change

* add comments

Co-authored-by: Kris Hicks <khicks@hashicorp.com>
2020-12-01 11:11:34 -05:00
Drew Bailey a0b7f05a7b
Remove Managed Sinks from Nomad (#9470)
* Remove Managed Sinks from Nomad

Managed Sinks were a beta feature in Nomad 1.0-beta2. During the beta
period it was determined that this was not a scalable approach to
support community and third party sinks.

* update comment

* changelog
2020-11-30 14:00:31 -05:00
Seth Hoenig fa6789a087
Merge pull request #9472 from hashicorp/f-connect-upstream-datacenter
consul/connect: enable setting datacenter in upstream
2020-11-30 12:29:05 -06:00
Seth Hoenig e81e9223ef consul/connect: enable setting datacenter in connect upstream
Before, upstreams could only be defined using the default datacenter.
Now, the `datacenter` field can be set in a connect upstream definition,
informing consul of the desire for an instance of the upstream service
in the specified datacenter. The field is optional and continues to
default to the local datacenter.

Closes #8964
2020-11-30 10:38:30 -06:00
Chris Piwarski 955bf23e8e command: Fix node help error 2020-11-30 11:12:06 -05:00
Tim Gross 4e79ddea45
csi/api: populate ReadAllocs/WriteAllocs fields (#9377)
The API is missing values for `ReadAllocs` and `WriteAllocs` fields, resulting
in allocation claims not being populated in the web UI. These fields mirror
the fields in `nomad/structs.CSIVolume`. Returning a separate list of stubs
for read and write would be ideal, but this can't be done without either
bloating the API response with repeated full `Allocation` data, or causing a
panic in previous versions of the CLI.

The `nomad/structs` fields are persisted with nil values and are populated
during RPC, so we'll do the same in the HTTP API and populate the `ReadAllocs`
and `WriteAllocs` fields with a map of allocation IDs, but with null
values. The web UI will then create its `ReadAllocations` and
`WriteAllocations` fields by mapping from those IDs to the values in
`Allocations`, instead of flattening the map into a list.
2020-11-25 16:44:06 -05:00
Seth Hoenig 3c17dc2ecc api: safely access legacy MBits field 2020-11-23 10:36:10 -06:00
Nick Ethier c9bd7e89ca command: use correct port mapping syntax in examples 2020-11-23 10:25:30 -06:00
Nick Ethier d21cbeb30f command: remove task network usage from init examples 2020-11-23 10:25:11 -06:00
Nick Ethier 9471892df4 mock: add default host network 2020-11-23 10:11:00 -06:00
Tim Gross 38120123c5
docs: add missing command documentation (#9415)
* `nomad operator keyring` was missing the general options section
* `nomad operator metrics` was missing a page in the docs entirely

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2020-11-23 08:10:10 -05:00
Tim Gross f1ad512986 docs: describe required ACLs for all commands 2020-11-20 13:38:29 -05:00
Tim Gross de6b023af2 command: remove -namespace from help options when not applicable 2020-11-19 16:28:39 -05:00
Tim Gross d67afa2e21 docs/help: -no-color does not apply to alloc logs content
The `nomad alloc logs` command does not remove terminal escape sequences for
color from the log outputs of a task. Clarify that the standard `-no-color`
flag, which does apply to Nomad's error responses from `nomad alloc logs`,
does not apply to the log output.
2020-11-19 15:29:12 -05:00
Chris Baker afe6fa82c2
Merge pull request #9368 from hashicorp/b-9367-duped-ports
fix duplication of ports in `AllocatedResources`
2020-11-16 11:41:34 -06:00
Tim Gross a788557579
csi: alloc status -verbose should query volume request 'source' (#9354)
The `nomad alloc status -verbose` command returns a 404 from CSI volumes
because the volume mount block in the task points back to the
`job.group.volume` block. So using the `Name` field to query is the "name" as
seen in the jobspec, and not the name of the volume that we need for querying.

Show both the job-specific name and the volume ID in the resulting output,
which clarifies the difference between the two fields and is more consistent
with the web UI.
2020-11-16 08:28:52 -05:00
Chris Baker 7c6071e6c4 updated alloc_endpoint to mutate a copy of the returned allocation, instead of the instance in the state store 2020-11-15 17:52:50 +00:00
Chris Baker b244d5e949 documenting test for #9367 2020-11-15 17:47:50 +00:00
Seth Hoenig 4cc3c01d5b
Merge pull request #9352 from hashicorp/f-artifact-headers
jobspec: add support for headers in artifact stanza
2020-11-13 14:04:27 -06:00
Seth Hoenig bb8a5816a0 jobspec: add support for headers in artifact stanza
This PR adds the ability to set HTTP headers when downloading
an artifact from an `http` or `https` resource.

The implementation in `go-getter` is such that a new `HTTPGetter`
must be created for each artifact that sets headers (as opposed
to conveniently setting headers per-request). This PR maintains
the memoization of the default Getter objects, creating new ones
only for artifacts where headers are set.

Closes #9306
2020-11-13 12:03:54 -06:00
Dave May e89302aa4b
nomad operator debug - add client node filtering arguments (#9331)
* operator debug - add client node filtering arguments

* add WaitForClient helper function

* use RPC in WaitForClient to avoid unnecessary imports

* guard against nil values

* move initialization up and shorten test duration

* cleanup nodeLookupFailCount logic

* only display max node notice if we actually tried to capture nodes
2020-11-12 11:25:28 -05:00
Seth Hoenig b19bc6be2b consul: prevent re-registration churn by correctly comparing sidecar tags
Previously, connect sidecars would be re-registered with consul every cycle
of Nomad's reconciliation loop around Consul service registrations. This is
because part of the comparison used `reflect.DeepEqual` on []string objects,
which returns false when one object is `[]string{}` and the other is `[]string{}(nil)`.

Unforunately, this was always the case, and every Connect sidecar service
would be re-registered on every iteration, which happens every 30 seconds.
2020-11-11 18:01:17 -06:00
Chris Baker 611502dd5f
Merge pull request #9327 from hashicorp/b-fix-rec-cli
recs cli: fixed bad composition
2020-11-11 15:54:14 -06:00
Nick Ethier 5e1634eda1
structs: canonicalize allocatedtaskresources to populate shared ports (#9309) 2020-11-11 16:21:47 -05:00
Chris Baker bc82312eb4 recs cli: fixed bad composition 2020-11-11 20:52:57 +00:00
Tim Gross 60874ebe25
csi: Postrun hook should not change mode (#9323)
The unpublish workflow requires that we know the mode (RW vs RO) if we want to
unpublish the node. Update the hook and the Unpublish RPC so that we mark the
claim for release in a new state but leave the mode alone. This fixes a bug
where RO claims were failing node unpublish.

The core job GC doesn't know the mode, but we don't need it for that workflow,
so add a mode specifically for GC; the volumewatcher uses this as a sentinel
to check whether claims (with their specific RW vs RO modes) need to be claimed.
2020-11-11 13:06:30 -05:00
Chris Baker 2deb77dcf5 cli: updated recommendation commands and test to remove duplication of autocompletion code 2020-11-11 11:39:26 +00:00
Chris Baker e3c0ea654d auto-complete for recommendations CLI, plus OSS components of recommendations prefix search 2020-11-11 11:13:43 +00:00
Mahmood Ali f61d284fe6 Custom message when job file is HCL2 incompatible
Use a custom message when the job file is a valid HCL1 but no longer
valid under HCL 2 syntax.
2020-11-10 14:56:20 -05:00
Mahmood Ali 69849a42a5
Merge pull request #9298 from hashicorp/f-hcl2-localsvars
HCL2: Variables and Locals
2020-11-09 16:44:37 -05:00
Luiz Aoqui ea81ac5d3d
Merge pull request #9296 from hashicorp/b-remove-namespace-from-scale-request
Remove Namespace field from JobScaleRequest
2020-11-09 15:13:33 -05:00
Mahmood Ali 7fcda0f01f Start accepting input variable files as CLI arguments 2020-11-09 15:02:58 -05:00
Mahmood Ali 1ae3e8a8eb Start using the new jobspec2 API 2020-11-09 15:01:31 -05:00
Luiz Aoqui c536286c7a
remove Namespace field from JobScaleRequest 2020-11-09 13:02:05 -05:00
James Rasell 473fb4bb17
Merge pull request #9272 from hashicorp/f-recommendation-cli
cli: add recommendation commands.
2020-11-06 16:29:57 +01:00
James Rasell fe92d8b3cb
cli: add recommendation commands.
Adds new CLI commands for applying, dismissing and detailing Nomad
recommendations under a new top level recommendation command.
2020-11-06 11:16:24 +01:00
Nick Ethier 04f5c4ee5f
ar/groupservice: remove drivernetwork (#9233)
* ar/groupservice: remove drivernetwork

* consul: allow host address_mode to accept raw port numbers

* consul: fix logic for blank address
2020-11-05 15:00:22 -05:00
Charlie Voiselle 0e373f70c1
Merge pull request #9255 from hashicorp/d-missing-example-comma
Add missing comma in help-text example.
2020-11-04 09:51:09 -05:00
James Rasell b147ec1e57
cli: update scaling policy list help to include job flag. 2020-11-04 13:35:29 +01:00
Charlie Voiselle 443a93be11
Add missing comma in help-text example.
@krishicks spotted this while playing with command.
2020-11-02 18:00:53 -05:00
Kris Hicks 48a260fc33
Update monitor func not to take a prefix (#9251)
The only user of monitor(evalID, true) was command/eval_status, and
eval_status had a duplicate of the prefix-handling code inside it, so in
all cases the complete evalID was being passed to monitor.

Given that, we can remove the prefix code from command/monitor, and
remove the boolean arg.
2020-11-02 10:24:49 -08:00
Kris Hicks 1da9e7fc67
Add event sink API and CLI commands (#9226)
Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-11-02 09:57:35 -08:00
Chris Baker 719077a26d added new policy capabilities for recommendations API
state store: call-out to generic update of job recommendations from job update method
recommendations API work, and http endpoint errors for OSS
support for scaling polices in task block of job spec
add query filters for ScalingPolicy list endpoint
command: nomad scaling policy list: added -job and -type
2020-10-28 14:32:16 +00:00
Drew Bailey 86080e25a9
Send events to EventSinks (#9171)
* Process to send events to configured sinks

This PR adds a SinkManager to a server which is responsible for managing
managed sinks. Managed sinks subscribe to the event broker and send
events to a sink writer (webhook). When changes to the eventstore are
made the sinkmanager and managed sink are responsible for reloading or
starting a new managed sink.

* periodically check in sink progress to raft

Save progress on the last successfully sent index to raft. This allows a
managed sink to resume close to where it left off in the event of a lost
server or leadership change

dereference eventsink so we can accurately use the watchch

When using a pointer to eventsink struct it was updated immediately and our reload logic would not trigger
2020-10-26 17:27:54 -04:00
Drew Bailey 1ae39a9ed9
event sink crud operation api (#9155)
* network sink rpc/api plumbing

state store methods and restore

upsert sink test

get sink

delete sink

event sink list and tests

go generate new msg types

validate sink on upsert

* go generate
2020-10-23 14:23:00 -04:00
Michael Schurter c2dd9bc996 core: open source namespaces 2020-10-22 15:26:32 -07:00
Mahmood Ali 059e87c862
Merge pull request #9142 from hashicorp/f-hclv2-2.3
Support HCLv2 for Nomad jobs
2020-10-22 12:26:28 -05:00
Drew Bailey f3dcefe5a9
remove event durability (#9147)
* remove event durability

temporarily removing go-memdb event durability until a new strategy is developed on how to best handled increased durability needs

* drop events table schema and state store methods

* fix neweventbuffer invocations
2020-10-22 12:21:03 -04:00
Mahmood Ali d3a17b5c82 address review feedback 2020-10-22 11:49:37 -04:00
Mahmood Ali f52bda4c30 api: update /render api to parse hclv2 2020-10-21 15:46:57 -04:00
Mahmood Ali 84ec0b38e8 cli: use HCLv2 parser
Also, fallback to using HCLv1.
2020-10-21 15:46:57 -04:00
Mahmood Ali 1ae924973e hclv1: tweak HCLv1 tests
This ensures that gatway ReadOnly key is tested.  Also, update the hclv1
test-fixtures to be hclv1 compliant.
2020-10-21 14:05:46 -04:00
Drew Bailey 6c788fdccd
Events/msgtype cleanup (#9117)
* use msgtype in upsert node

adds message type to signature for upsert node, update tests, remove placeholder method

* UpsertAllocs msg type test setup

* use upsertallocs with msg type in signature

update test usage of delete node

delete placeholder msgtype method

* add msgtype to upsert evals signature, update test call sites with test setup msg type

handle snapshot upsert eval outside of FSM and ignore eval event

remove placeholder upsertevalsmsgtype

handle job plan rpc and prevent event creation for plan

msgtype cleanup upsertnodeevents

updatenodedrain msgtype

msg type 0 is a node registration event, so set the default  to the ignore type

* fix named import

* fix signature ordering on upsertnode to match
2020-10-19 09:30:15 -04:00
Drew Bailey fba0d6dc6a
event buffer size and durable count must be non negative 2020-10-15 16:34:33 -04:00
Nick Ethier 4903e5b114
Consul with CNI and host_network addresses (#9095)
* consul: advertise cni and multi host interface addresses

* structs: add service/check address_mode validation

* ar/groupservices: fetch networkstatus at hook runtime

* ar/groupservice: nil check network status getter before calling

* consul: comment network status can be nil
2020-10-15 15:32:21 -04:00
Michael Schurter ea55c497b7
Merge pull request #9094 from hashicorp/f-1.0
s/0.13/1.0/g
2020-10-15 08:53:33 -07:00
James Rasell 42a6e7140f
Merge pull request #9083 from hashicorp/b-fix-enterprise-config-merge
agent: fix enterprise config overlay merging.
2020-10-15 08:40:49 +02:00
Michael Schurter 9c3972937b s/0.13/1.0/g
1.0 here we come!
2020-10-14 15:17:47 -07:00
Michael Schurter dd09fa1a4a
Merge pull request #9055 from hashicorp/f-9017-resources
api: add field filters to /v1/{allocations,nodes}
2020-10-14 14:49:39 -07:00
Michael Schurter 6890cffd7a unify boolean parameter parsing 2020-10-14 12:23:25 -07:00
Dave May f37e90be18
Metrics gotemplate support, debug bundle features (#9067)
* add goroutine text profiles to nomad operator debug

* add server-id=all to nomad operator debug

* fix bug from changing metrics from string to []byte

* Add function to return MetricsSummary struct, metrics gotemplate support

* fix bug resolving 'server-id=all' when no servers are available

* add url to operator_debug tests

* removed test section which is used for future operator_debug.go changes

* separate metrics from operator, use only structs from go-metrics

* ensure parent directories are created as needed

* add suggested comments for text debug pprof

* move check down to where it is used

* add WaitForFiles helper function to wait for multiple files to exist

* compact metrics check

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>

* fix github's silly apply suggestion

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-10-14 15:16:10 -04:00
Drew Bailey c463479848
filter on additional filter keys, remove switch statement duplication
properly wire up durable event count

move newline responsibility

moves newline creation from NDJson to the http handler, json stream only encodes and sends now

ignore snapshot restore if broker is disabled

enable dev mode to access event steam without acl

use mapping instead of switch

use pointers for config sizes, remove unused ttl, simplify closed conn logic
2020-10-14 14:14:33 -04:00
Michael Schurter 8ccbd92cb6 api: add field filters to /v1/{allocations,nodes}
Fixes #9017

The ?resources=true query parameter includes resources in the object
stub listings. Specifically:

- For `/v1/nodes?resources=true` both the `NodeResources` and
  `ReservedResources` field are included.
- For `/v1/allocations?resources=true` the `AllocatedResources` field is
  included.

The ?task_states=false query parameter removes TaskStates from
/v1/allocations responses. (By default TaskStates are included.)
2020-10-14 10:35:22 -07:00
Drew Bailey 684807bddb
namespace filtering 2020-10-14 12:44:43 -04:00
Drew Bailey df96b89958
Add EvictCallbackFn to handle removing entries from go-memdb when they
are removed from the event buffer.

Wire up event buffer size config, use pointers for structs.Events
instead of copying.
2020-10-14 12:44:42 -04:00
Drew Bailey 315f77a301
rehydrate event publisher on snapshot restore
address pr feedback
2020-10-14 12:44:41 -04:00
Drew Bailey d793529d61
event durability count and cfg 2020-10-14 12:44:40 -04:00
Drew Bailey b4c135358d
use Events to wrap index and events, store in events table 2020-10-14 12:44:39 -04:00
Drew Bailey 559517455a
wire up enable_event_publisher 2020-10-14 12:44:38 -04:00
Drew Bailey 4793bb4e01
Events/deployment events (#9004)
* Node Drain events and Node Events (#8980)

Deployment status updates

handle deployment status updates (paused, failed, resume)

deployment alloc health

generate events from apply plan result

txn err check, slim down deployment event

one ndjson line per index

* consolidate down to node event + type

* fix UpdateDeploymentAllocHealth test invocations

* fix test
2020-10-14 12:44:37 -04:00
Drew Bailey a4a2975edf
Event Stream API/RPC (#8947)
This Commit adds an /v1/events/stream endpoint to stream events from.

The stream framer has been updated to include a SendFull method which
does not fragment the data between multiple frames. This essentially
treats the stream framer as a envelope to adhere to the stream framer
interface in the UI.

If the `encode` query parameter is omitted events will be streamed as
newline delimted JSON.
2020-10-14 12:44:36 -04:00
James Rasell e0734bed77
agent: fix enterprise config overlay merging. 2020-10-14 09:35:16 +02:00
Chris Baker 1d35578bed removed backwards-compatible/untagged metrics deprecated in 0.7 2020-10-13 20:18:39 +00:00
Seth Hoenig ed13e5723f consul/connect: dynamically select envoy sidecar at runtime
As newer versions of Consul are released, the minimum version of Envoy
it supports as a sidecar proxy also gets bumped. Starting with the upcoming
Consul v1.9.X series, Envoy v1.11.X will no longer be supported. Current
versions of Nomad hardcode a version of Envoy v1.11.2 to be used as the
default implementation of Connect sidecar proxy.

This PR introduces a change such that each Nomad Client will query its
local Consul for a list of Envoy proxies that it supports (https://github.com/hashicorp/consul/pull/8545)
and then launch the Connect sidecar proxy task using the latest supported version
of Envoy. If the `SupportedProxies` API component is not available from
Consul, Nomad will fallback to the old version of Envoy supported by old
versions of Consul.

Setting the meta configuration option `meta.connect.sidecar_image` or
setting the `connect.sidecar_task` stanza will take precedence as is
the current behavior for sidecar proxies.

Setting the meta configuration option `meta.connect.gateway_image`
will take precedence as is the current behavior for connect gateways.

`meta.connect.sidecar_image` and `meta.connect.gateway_image` may make
use of the special `${NOMAD_envoy_version}` variable interpolation, which
resolves to the newest version of Envoy supported by the Consul agent.

Addresses #8585 #7665
2020-10-13 09:14:12 -05:00
Yoan Blanc 891accb89a
use allow/deny instead of the colored alternatives (#9019)
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-10-12 08:47:05 -04:00
Tim Gross ecec432653
csi: allow for volume detach to work with gc'd nodes (#9057)
When we try to prefix match the `nomad volume detach` node ID argument, the
node may have been already GC'd. The volume unpublish workflow gracefully
handles this case so that we can free the claim. So make a best effort to find
a node ID among the volume's claimed allocations, or otherwise just use the
node ID we've been given by the user as-is.
2020-10-09 09:45:03 -04:00
Tim Gross 82749bd6a6
csi: allow volume detach to take a node ID prefix (#9041)
Fixes a bug where the `nomad volume detach` command would not accept a node ID
prefix instead of a full node ID. The volume ID is already prefix matched
server-side.
2020-10-07 11:14:57 -04:00
James Rasell d2fe895216
Merge pull request #9023 from hashicorp/f-gh-8648
cli: add scale and scaling-events commands to job cmd.
2020-10-06 18:03:41 +02:00
Dave May 561637c063
Merge pull request #9034 from hashicorp/dmay-debug-metrics
Add metrics command / output to debug bundle
2020-10-06 11:47:09 -04:00
Seth Hoenig 6cffbecb3a
Merge pull request #9033 from pierreca/verify-remove-checks
Do not double-remove checks removed by Consul
2020-10-06 10:16:13 -05:00
davemay99 18aa30c00f metrics return bytes instead of string for more flexibility 2020-10-06 10:49:15 -04:00
davemay99 19a075cf47 update deprecated syntax per GH-9027 2020-10-06 09:47:16 -04:00