Commit graph

21401 commits

Author SHA1 Message Date
Seth Hoenig 87be8c4c4b consul: correctly check consul acl token namespace when using consul oss
This PR fixes the Nomad Object Namespace <-> Consul ACL Token relationship
check when using Consul OSS (or Consul ENT without namespace support).

Nomad v1.1.0 introduced a regression where Nomad would fail the validation
when submitting Connect jobs and allow_unauthenticated set to true, with
Consul OSS - because it would do the namespace check against the Consul ACL
token assuming the "default" namespace, which does not work because Consul OSS
does not have namespaces.

Instead of making the bad assumption, expand the namespace check to handle
each special case explicitly.

Fixes #10718
2021-06-08 13:55:57 -05:00
Florian Apolloner ad472e8079 Fixed global-search keyboard shortcut for non-english keyboard layouts.
Closes #10646
2021-06-07 13:32:38 -04:00
James Rasell 15bb40db77
Merge pull request #10712 from hashicorp/b-gh-10711
cmd: validate the type flag when querying plugin status.
2021-06-07 18:14:20 +02:00
Mahmood Ali 9de37cf1d8
update changelog for GH-10710 (#10713)
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-06-07 10:27:53 -04:00
Mahmood Ali 2c73552b4d
pool: track usage of incoming streams (#10710)
Track usage of incoming streams on a connection. Connections without
reference counts get marked as unused and reaped in a periodic job.

This fixes a bug where `alloc exec` and `alloc fs` sessions get terminated
unexpectedly. Previously, when a client heartbeats switches between
servers, the pool connection reaper eventually identifies the connection
as unused and closes it even if it has an active exec/fs sessions.

Fixes #10579
2021-06-07 10:22:37 -04:00
James Rasell 888371a012
cmd: validate the type flag when querying plugin status. 2021-06-07 13:53:28 +02:00
Jasmine Dahilig ca4be6857e
deployment query rate limit (#10706) 2021-06-04 12:38:46 -07:00
Mahmood Ali 5ed3643148
Merge pull request #10704 from hashicorp/e2e-terraform-tweaks-20210604
e2e terraform tweaks: 2021-06 edition
2021-06-04 11:51:09 -04:00
Mahmood Ali 5258ae480b remove unused Spark security group rules 2021-06-04 11:49:43 -04:00
Seth Hoenig d9b71636e6
Merge pull request #10658 from hashicorp/f-cc-mesh-gw
consul/connect: add support for connect mesh gateways
2021-06-04 09:50:08 -05:00
Mahmood Ali b852dc5eb8 e2e: pass nomad_url variable 2021-06-04 10:32:51 -04:00
Seth Hoenig 2f99dff21b consul/connect: fix tests for mesh gateway mode 2021-06-04 09:31:38 -05:00
Mahmood Ali 71936e1b27 e2e: NOMAD_VERSION is not set when installing url 2021-06-04 10:31:37 -04:00
Mahmood Ali d0768bb999 restrict ingress ip 2021-06-04 10:31:35 -04:00
Seth Hoenig 40dccde1df
consul/connect: use range on upstream canonicalize
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-06-04 08:55:05 -05:00
Seth Hoenig 839c0cc360 consul/connect: fix upstream mesh gateway default mode setting
This PR fixes the API to _not_ set the default mesh gateway mode. Before,
the mode would be set to "none" in Canonicalize, which is incorrect. We
should pass through the empty string so that folks can make use of Consul
service-defaults Config entries to configure the default mode.
2021-06-04 08:53:12 -05:00
Seth Hoenig d026ff1f66 consul/connect: add support for connect mesh gateways
This PR implements first-class support for Nomad running Consul
Connect Mesh Gateways. Mesh gateways enable services in the Connect
mesh to make cross-DC connections via gateways, where each datacenter
may not have full node interconnectivity.

Consul docs with more information:
https://www.consul.io/docs/connect/gateways/mesh-gateway

The following group level service block can be used to establish
a Connect mesh gateway.

service {
  connect {
    gateway {
      mesh {
        // no configuration
      }
    }
  }
}

Services can make use of a mesh gateway by configuring so in their
upstream blocks, e.g.

service {
  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "<service>"
          local_bind_port  = <port>
          datacenter       = "<datacenter>"
          mesh_gateway {
            mode = "<mode>"
          }
        }
      }
    }
  }
}

Typical use of a mesh gateway is to create a bridge between datacenters.
A mesh gateway should then be configured with a service port that is
mapped from a host_network configured on a WAN interface in Nomad agent
config, e.g.

client {
  host_network "public" {
    interface = "eth1"
  }
}

Create a port mapping in the group.network block for use by the mesh
gateway service from the public host_network, e.g.

network {
  mode = "bridge"
  port "mesh_wan" {
    host_network = "public"
  }
}

Use this port label for the service.port of the mesh gateway, e.g.

service {
  name = "mesh-gateway"
  port = "mesh_wan"
  connect {
    gateway {
      mesh {}
    }
  }
}

Currently Envoy is the only supported gateway implementation in Consul.
By default Nomad client will run the latest official Envoy docker image
supported by the local Consul agent. The Envoy task can be customized
by setting `meta.connect.gateway_image` in agent config or by setting
the `connect.sidecar_task` block.

Gateways require Consul 1.8.0+, enforced by the Nomad scheduler.

Closes #9446
2021-06-04 08:24:49 -05:00
Seth Hoenig 4c087efd59
Merge pull request #10702 from hashicorp/f-cc-constraints
consul/connect: use additional constraints in scheduling connect tasks
2021-06-04 08:11:21 -05:00
Tim Gross 8b2ecde5b4 csi: accept list of caps during validation in volume register
When `nomad volume create` was introduced in Nomad 1.1.0, we changed the
volume spec to take a list of capabilities rather than a single capability, to
meet the requirements of the CSI spec. When a volume is registered via `nomad
volume register`, we should be using the same fields to validate the volume
with the controller plugin.
2021-06-04 07:57:26 -04:00
Seth Hoenig d359eb6f3a consul/connect: use additional constraints in scheduling connect tasks
This PR adds two additional constraints on Connect sidecar and gateway tasks,
making sure Nomad schedules them only onto nodes where Connect is actually
enabled on the Consul agent.

Consul requires `connect.enabled = true` and `ports.grpc = <number>` to be
explicitly set on agent configuration before Connect APIs will work. Until
now, Nomad would only validate a minimum version of Consul, which would cause
confusion for users who try to run Connect tasks on nodes where Consul is not
yet sufficiently configured. These contstraints prevent job scheduling on nodes
where Connect is not actually use-able.

Closes #10700
2021-06-03 15:43:34 -05:00
Seth Hoenig 4228bc7e9d
Merge pull request #10699 from hashicorp/f-consul-fp
fingerprint: update consul fingerprinter with additional attributes
2021-06-03 15:14:25 -05:00
Seth Hoenig 77702945e2
Merge branch 'main' into f-consul-fp 2021-06-03 15:14:02 -05:00
Seth Hoenig 549c68a922
Apply suggestions from code review
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-06-03 15:12:23 -05:00
Tim Gross c01d661c98 csi: validate volume block has attachment_mode and access_mode
The `attachment_mode` and `access_mode` fields are required for CSI
volumes. The `mount_options` block is only allowed for CSI volumes.
2021-06-03 16:07:19 -04:00
Mahmood Ali 3226f70d7e
update changelog (#10701) 2021-06-03 14:54:39 -04:00
Mahmood Ali 0ac126fa78
drivers/exec: Don't inherit Nomad oom_score_adj value (#10698)
Explicitly set the `oom_score_adj` value for `exec` and `java` tasks.

We recommend that the Nomad service to have oom_score_adj of a low value
(e.g. -1000) to avoid having nomad agent OOM Killed if the node is
oversubscriped.

However, Nomad's workloads should not inherit Nomad's process, which is
the default behavior.

Fixes #10663
2021-06-03 14:15:50 -04:00
Seth Hoenig b9f04c0e88 docs: update cl 2021-06-03 12:58:16 -05:00
Seth Hoenig 3346432d58 client/fingerprint/consul: add new attributes to consul fingerprinter
This PR adds new probes for detecting these new Consul related attributes:

Consul namespaces are a Consul enterprise feature that may be disabled depending
on the enterprise license associated with the Consul servers. Having this attribute
available will enable Nomad to properly decide whether to query the Consul Namespace
API.

Consul connect must be explicitly enabled before Connect APIs will work. Currently
Nomad only checks for a minimum Consul version. Having this attribute available will
enable Nomad to properly schedule Connect tasks only on nodes with a Consul agent that
has Connect enabled.

Consul connect requires the grpc port to be explicitly set before Connect APIs will work.
Currently Nomad only checks for a minimal Consul version. Having this attribute available
will enable Nomad to schedule Connect tasks only on nodes with a Consul agent that has
the grpc listener enabled.
2021-06-03 12:49:22 -05:00
Seth Hoenig b548cf6816 client/fingerprint/consul: refactor the consul fingerprinter to test individual attributes
This PR refactors the ConsulFingerprint implementation, breaking individual attributes
into individual functions to make testing them easier. This is in preparation for
additional extractors about to be added. Behavior should be otherwise unchanged.

It adds the attribute consul.sku, which can be used to differentiate between Consul
OSS vs Consul ENT.
2021-06-03 12:48:39 -05:00
Tim Gross bc6278ca08 docs: fix broken links in nomad csi snapshot commands 2021-06-03 11:25:30 -04:00
dependabot[bot] 8a8728dbc2
build(deps): bump ws from 6.2.1 to 6.2.2 in /website (#10691)
Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2.
- [Release notes](https://github.com/websockets/ws/releases)
- [Commits](https://github.com/websockets/ws/commits)

---
updated-dependencies:
- dependency-name: ws
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-03 10:23:19 -04:00
Jeff Escalante 57d0387e27
rotate algolia api key (#10662) 2021-06-03 10:22:16 -04:00
Tim Gross 99380aa3f0 docs: clarify default check.initial_status behavior 2021-06-03 10:02:25 -04:00
Tim Gross 37fa6850d2 scheduler: test for reconciler's in-place rollback behavior
The reconciler has some complicated behavior when there are already running
allocations from a previous version of the job that we want to keep, as
happens during a rollback. Document this behavior with a test.
2021-06-03 10:02:19 -04:00
Tim Gross dbd493d26f docs: changelog entries for 1.1.1 and backports 2021-06-03 08:50:06 -04:00
Kendall Strautman abf02f4fcb
chore: updates text-split-with-logo-grid package (#10690) 2021-06-02 17:00:55 -07:00
Tim Gross e9777a88ce plan applier: add trace-level log of plan
The plans generated by the scheduler produce high-level output of counts on each
evaluation, but when debugging scheduler issues it'd be nice to have a more
detailed view of the resulting plan. Emitting this log at trace minimizes the
overhead, and producing it in the plan applyer makes it easier to find as it
will always be on the leader.
2021-06-02 10:25:23 -04:00
Tim Gross 7a55a6af16 leader: call eval log formatting lazily
Arguments to our logger's various write methods are evaluated eagerly, so
method calls in log parameters will always be called, regardless of log
level. Move some logger messages to the logger's `Fmt` method so that
`GoString` is evaluated lazily instead.
2021-06-02 09:59:55 -04:00
Luiz Aoqui 139c5e8df9
e2e: fix terraform output environment command instruction (#10674) 2021-06-01 10:10:12 -04:00
mrspanishviking b73a848ec3
docs: added license faq 2021-05-27 13:30:17 -04:00
Grant Griffiths 3f41150fbb CSI snapshot list: do not shorten snapshot ID
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-05-27 13:28:18 -04:00
Mahmood Ali d8de4e62bb
Merge pull request #10657 from hashicorp/b-alloc-exec-closing
Handle `nomad exec` termination events in order
2021-05-25 14:50:58 -04:00
Mahmood Ali dfb7874da5 add a note about node connection failure and fallback 2021-05-25 14:24:24 -04:00
Mahmood Ali 0853d48927
e2e: Spin clusters with custom url binaries (#10656)
Ease spinning up a cluster, where binaries are fetched from arbitrary
urls.  These could be CircleCI `build-binaries` job artifacts, or
presigned S3 urls.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-05-25 13:47:39 -04:00
Mahmood Ali 2ebbffad12 exec: api: handle closing errors differently
refactor the api handling of `nomad exec`, and ensure that we process
all received events before handling websocket closing.

The exit code should be the last message received, and we ought to
ignore any websocket close error we receive afterwards.

Previously, we used two channels: one for websocket frames and another
for handling errors. This raised the possibility that we processed the
error before processing the frames, resulting into an "unexpected EOF"
error.
2021-05-25 11:19:42 -04:00
Tim Gross 9e23884c16 docs: changelog entry for 10539 2021-05-25 09:57:22 -04:00
Mahmood Ali 0f5539c382 exec: http: close websocket connection gracefully
In this loop, we ought to close the websocket connection gracefully when
the StreamErrWrapper reaches EOF.

Previously, it's possible that that we drop the last few events or skip sending
the websocket closure. If `handler(handlerPipe)` returns and `cancel` is called,
before the loop here completes processing streaming events, the loop exits
prematurely without propagating the last few events.

Instead here, the loop continues until we hit `httpPipe` EOF (through
`decoder.Decode`), to ensure we process the events to completion.
2021-05-24 13:37:23 -04:00
Mahmood Ali 3b7c5ff46e e2e: stop suppressing unexpected EOF errors 2021-05-24 13:35:08 -04:00
Luiz Aoqui c1ef539fa3
Display confirmation message on 'nomad volume delete' and 'nomad volume deregister' 2021-05-24 12:02:55 -04:00
Tim Gross 258352957a changelog: add missing GH link 2021-05-24 11:52:18 -04:00