Commit graph

131 commits

Author SHA1 Message Date
Tim Gross 3bf948dc00
docs: clarify restart inheritance and add examples (#12275)
Clarify the behavior of `restart` inheritance with respect to Connect
sidecar tasks. Remove incorrect language about the scheduler being
involved in restart decisions. Try to make the `delay` mode
documentation more clear, and provide examples of delay vs fail.
2022-03-14 15:49:08 -04:00
Merlin Scholz 68457be72c
docs: elaborate on networking issues with firewalld (#12214) 2022-03-08 09:49:29 -05:00
Ignacio Torres Masdeu 2793054147
docs: fix examples for set_contains_all and set_contains_any (#12093) 2022-03-07 13:55:57 -05:00
James Rasell 6aa741dd16
docs: add note regarding HCLv2 func and interpolation. 2022-03-04 12:06:25 +01:00
Tiernan c30b4617aa
interpolate network.dns block on client (#12021) 2022-02-16 08:39:44 -05:00
Marc-Aurèle Brothier fb80dc57a1
small typo in advertised example 2022-02-10 13:53:05 +01:00
Derek Strickland 0a8e03f0f7
Expose Consul template configuration parameters (#11606)
This PR exposes the following existing`consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza.

- `wait`

It also exposes the following`consul-template` configuration to Nomad operators in the `{client.template}` stanza.

- `max_stale`
- `block_query_wait`
- `consul_retry`
- `vault_retry` 
- `wait` 

Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure.

- `wait_bounds`

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-01-10 10:19:07 -05:00
Tim Gross b0c3b99b03
scheduler: fix quadratic performance with spread blocks (#11712)
When the scheduler picks a node for each evaluation, the
`LimitIterator` provides at most 2 eligible nodes for the
`MaxScoreIterator` to choose from. This keeps scheduling fast while
producing acceptable results because the results are binpacked.

Jobs with a `spread` block (or node affinity) remove this limit in
order to produce correct spread scoring. This means that every
allocation within a job with a `spread` block is evaluated against
_all_ eligible nodes. Operators of large clusters have reported that
jobs with `spread` blocks that are eligible on a large number of nodes
can take longer than the nack timeout to evaluate (60s). Typical
evaluations are processed in milliseconds.

In practice, it's not necessary to evaluate every eligible node for
every allocation on large clusters, because the `RandomIterator` at
the base of the scheduler stack produces enough variation in each pass
that the likelihood of an uneven spread is negligible. Note that
feasibility is checked before the limit, so this only impacts the
number of _eligible_ nodes available for scoring, not the total number
of nodes.

This changeset sets the iterator limit for "large" `spread` block and
node affinity jobs to be equal to the number of desired
allocations. This brings an example problematic job evaluation down
from ~3min to ~10s. The included tests ensure that we have acceptable
spread results across a variety of large cluster topologies.
2021-12-21 10:10:01 -05:00
Andy Assareh 8ba4e063e2
Mesh Gateway doc enhancements (#11354)
* Mesh Gateway doc enhancements

1. I believe this line should be corrected to add mesh as one of the choices
2. I found that we are not setting this meta, and it is a required element for wan federation. I believe it would be helpful and potentially time saving to note that right here.
2021-12-20 17:10:44 -05:00
Luiz Aoqui 4b39494cd1
docs: add more references and examples to the template block (#11691) 2021-12-16 14:14:01 -05:00
Kevin Wang 3e6757f211
feat(website): extract /plugins /tools docs (#11584)
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
Co-authored-by: Mike Nomitch <mnomitch@hashicorp.com>
2021-12-09 14:25:18 -05:00
Shantanu Gadgil 0838678609
mention sysbatch in addition to batch (#11587) 2021-12-06 19:12:03 -05:00
Tim Gross ba038a1ebc
docs: mount_flags takes a slice of strings (#11583)
The `mount_flags` option takes a slice of strings, not a
comma-separated string like the flags passed to `mount(8)`.
2021-11-29 10:07:34 -05:00
Andy Assareh 8c638217ac
exactly one of ingress, terminating, or mesh must be configured
i believe mesh should be included in this statement was omitted.
2021-10-15 14:15:02 -07:00
Luiz Aoqui 63d1ac8939
docs: document that network mode is only supported on Linux (#11192)
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2021-10-01 23:17:20 -04:00
Luiz Aoqui a7872f0ba5
docs: add Nomad version requirement note for sysbatch (#11231) 2021-09-29 15:14:51 -04:00
Luiz Aoqui 8d19831247
docs: add some extra documentation around client host environment variables (#11208)
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2021-09-21 17:23:30 -04:00
James Rasell b5039c96a4
docs: add network.hostname job specification website entry. 2021-09-15 11:43:47 +02:00
James Rasell 4dd5c47a47
Merge pull request #11091 from hashicorp/consolidate-cni-plugins-to-1.0.0
cni: consolidate cni plugins within test install and docs to use v1.0.0
2021-08-30 09:39:39 +02:00
Mahmood Ali 53f11e0080
docs: note env and meta map assignment syntax (#11095) 2021-08-29 14:35:09 -04:00
James Rasell ec221ab792
docs: update website to detail cni plugins v1.0.0 2021-08-27 11:15:25 +02:00
Luiz Aoqui be2389c8ad
Update scaling and policy blocks documentation (#11071)
* website: update `scaling` and `policy` blocks documentation

* website: hclfmt examples in scaling block docs
2021-08-25 09:10:18 -04:00
James Rasell 75be5acb08
Merge pull request #11042 from hashicorp/docs-remove-ingress-host-port-callout
docs: Remove note on ingress gateway hosts field needing a port number
2021-08-25 12:29:59 +02:00
Mahmood Ali c37339a8c8
Merge pull request #9160 from hashicorp/f-sysbatch
core: implement system batch scheduler
2021-08-16 09:30:24 -04:00
Blake Covarrubias 0778ffab8c docs: Remove note on ingress gateway hosts field needing a port number
Update the ingress gateway documentation to remove the note stating
that a port must be specified for values in the `hosts` field when
the ingress gateway is listening on a non-standard HTTP port.

Specifying a port was required in Consul 1.8.0, but that requirement
was removed in 1.8.1 with hashicorp/consul#8190 which made Consul
include the port number when constructing the Envoy configuration.

Related Consul docs PR: hashicorp/consul#10827
2021-08-11 14:55:05 -07:00
Tim Gross de957b48ff docs: note CNI requirement for bridge networking
Using `bridge` networking requires that you have CNI plugins installed
on the client, but this isn't in the jobspec `network` docs which are
the first place someone will look when trying to configure task
networking.
2021-08-11 10:18:35 -04:00
Seth Hoenig 3371214431 core: implement system batch scheduler
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.

Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.

As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).

Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is sill
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.

Closes #2527
2021-08-03 10:30:47 -04:00
Tim Gross 5e6aca18e4
docs: unset port to field maps to dynamic port (#10828) 2021-06-28 15:55:24 -04:00
Boris Shomodjvarac 64b1cafa57
docs: update csi_plugin example (#10821)
Current efs driver does not support telling it if its a `node` or a `controller`, and it will not print any error it will just ignore all other parameters then:(
So this will result in endpoint being `/tmp/csi.sock` and not `/csi/csi.sock` which will in turn break nomad/csi integration.

Also I changed the latest image tag to v1.3.2 to make sure anybody copy pasting this example is sure that it will work.

Tested on nomad 1.1.2
2021-06-28 08:28:03 -04:00
Tim Gross ad8eb33cd7
docs: improve CSI deployment recommendations (#10798)
* add some more context to the recommendations
* add recommendations around per-AZ `plugin_id`
2021-06-22 10:23:09 -04:00
Tim Gross 74947d6591
docs: host_network does support Docker task port mapping (#10774) 2021-06-17 09:11:10 -04:00
Seth Hoenig 839c0cc360 consul/connect: fix upstream mesh gateway default mode setting
This PR fixes the API to _not_ set the default mesh gateway mode. Before,
the mode would be set to "none" in Canonicalize, which is incorrect. We
should pass through the empty string so that folks can make use of Consul
service-defaults Config entries to configure the default mode.
2021-06-04 08:53:12 -05:00
Seth Hoenig d026ff1f66 consul/connect: add support for connect mesh gateways
This PR implements first-class support for Nomad running Consul
Connect Mesh Gateways. Mesh gateways enable services in the Connect
mesh to make cross-DC connections via gateways, where each datacenter
may not have full node interconnectivity.

Consul docs with more information:
https://www.consul.io/docs/connect/gateways/mesh-gateway

The following group level service block can be used to establish
a Connect mesh gateway.

service {
  connect {
    gateway {
      mesh {
        // no configuration
      }
    }
  }
}

Services can make use of a mesh gateway by configuring so in their
upstream blocks, e.g.

service {
  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "<service>"
          local_bind_port  = <port>
          datacenter       = "<datacenter>"
          mesh_gateway {
            mode = "<mode>"
          }
        }
      }
    }
  }
}

Typical use of a mesh gateway is to create a bridge between datacenters.
A mesh gateway should then be configured with a service port that is
mapped from a host_network configured on a WAN interface in Nomad agent
config, e.g.

client {
  host_network "public" {
    interface = "eth1"
  }
}

Create a port mapping in the group.network block for use by the mesh
gateway service from the public host_network, e.g.

network {
  mode = "bridge"
  port "mesh_wan" {
    host_network = "public"
  }
}

Use this port label for the service.port of the mesh gateway, e.g.

service {
  name = "mesh-gateway"
  port = "mesh_wan"
  connect {
    gateway {
      mesh {}
    }
  }
}

Currently Envoy is the only supported gateway implementation in Consul.
By default Nomad client will run the latest official Envoy docker image
supported by the local Consul agent. The Envoy task can be customized
by setting `meta.connect.gateway_image` in agent config or by setting
the `connect.sidecar_task` block.

Gateways require Consul 1.8.0+, enforced by the Nomad scheduler.

Closes #9446
2021-06-04 08:24:49 -05:00
Tim Gross 99380aa3f0 docs: clarify default check.initial_status behavior 2021-06-03 10:02:25 -04:00
James Rasell 99128e8601
docs: fix jobspec hcl2 locals example. 2021-05-21 15:20:46 +02:00
Ahmed 8d41e22405 Update service.mdx 2021-05-17 15:41:50 -04:00
Mahmood Ali abf6418976
add a section about memory oversubscription (#10573)
add a section about memory oversubscription

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-05-13 13:35:51 -04:00
Mike Noordermeer 2445bece66
docs: clarify that a default update strategy is used when update strategy is omitted 2021-05-10 08:27:22 -04:00
Tim Gross 1fdb4c1511 documentation for disable_default_tcp_check 2021-05-07 13:16:39 -04:00
Mahmood Ali 488cd1e336 annotate 1.1 beta fields 2021-05-07 10:21:16 -04:00
Nick Ethier 42bbe6978d website: reserved cores docs 2021-05-05 08:11:41 -04:00
Andy Assareh 1616f80211 git example - suggest providing real repo
when troubleshooting it is better if this command will actually work (pointing to a real repository)
2021-05-03 08:12:10 -04:00
Mahmood Ali ba49661198
Docs memory oversubscription (#10478)
* update docs

* document memory_oversubscription_enabled scheduler config
2021-04-30 14:07:56 -04:00
Nick Spain 04e3132b4b Document usage of 'body' field 2021-04-13 09:15:35 -04:00
Tim Gross fd3d443863 docs: clean up explanation of volume per_alloc 2021-04-09 11:32:00 -04:00
Tim Gross 0892d34ff9 CSI: capability block is required for volume registration 2021-04-08 13:02:24 -04:00
Tim Gross 70f5363a89 docs: update CSI create/register fields
Add new `access_mode`/`attachment_mode` fields. Make it more clear which set
of fields belong to create vs register. Update the example spec that's
generated by `volume init`.
2021-04-07 11:24:09 -04:00
Seth Hoenig f17ba33f61 consul: plubming for specifying consul namespace in job/group
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.
2021-04-05 10:03:19 -06:00
Bryce Kalow a6ca40fa4e
feat(website): migrates to new nav data format (#10264) 2021-03-31 08:43:17 -05:00
Tim Gross fa25e048b2
CSI: unique volume per allocation
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.

Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
2021-03-18 15:35:11 -04:00
Tim Gross 0b467a1e44 docs: clarify HCL is parsed in CLI 2021-03-15 15:41:43 -04:00
Luiz Aoqui 944887a684
docs: add ports to jobspec overview example 2021-03-12 13:36:39 -05:00
Tim Gross 4c4ea23e6f docs/artifact: clarify git:: prefix usage for private repos 2021-03-08 10:55:32 -05:00
Andre Ilhicas f45fc6c899
consul/connect: enable setting local_bind_address in upstream 2021-02-26 11:37:31 +00:00
Drew Bailey 86d9e1ff90
Merge pull request #9955 from hashicorp/on-update-services
Service and Check on_update configuration option (readiness checks)
2021-02-24 10:11:05 -05:00
Tim Gross b764f52ab9
deploymentwatcher: reset progress deadline on promotion (#10042)
In a deployment with two groups (ex. A and B), if group A's canary becomes
healthy before group B's, the deadline for the overall deployment will be set
to that of group A. When the deployment is promoted, if group A is done it
will not contribute to the next deadline cutoff. Group B's old deadline will
be used instead, which will be in the past and immediately trigger a
deployment progress failure. Reset the progress deadline when the job is
promotion to avoid this bug, and to better conform with implicit user
expectations around how the progress deadline should interact with promotions.
2021-02-22 16:44:03 -05:00
Tim Gross a90586f3e5 docs: clarify type constraints for collections in locals
HCL2 locals don't have type constraints, and the map syntax is interpreted as
an object. So users may need to explictly convert to map for some functions.

The `convert` function documentation was missing the required type constructor
parameter for collections.
2021-02-17 14:45:05 -05:00
Tim Gross 83088ec781 docs: clarify that block labels can't be interpolated in HCL2
Block labels are not expressions so they can't be interpolated without hacks
like `dynamic` blocks. Clarify this so that we don't confuse users who
shouldn't need to dig into the subtle nuances between expressions and blocks.
2021-02-17 11:53:06 -05:00
Tim Gross 25d5fe61d5 docs: remove artifact destination interpolation warning 2021-02-17 09:52:27 -05:00
Seth Hoenig eb8812dc3d docs: update ingress gateway runnable demo
Using the environment variable stopped working here a while back,
should be using the port label. Also upgrade to uuid-api:v5 which
supports linux/arm64.
2021-02-12 10:10:16 -06:00
Karan Sharma 26199289fc fix: docs/job-specification change mounts to mount
Since [mounts](https://www.nomadproject.io/docs/drivers/docker#mounts) is deprecated,
switch to newer `mount` in `task.config` for `docker` driver.
2021-02-10 08:24:58 -05:00
Seth Hoenig 45e0e70a50 consul/connect: enable custom sidecars to use expose checks
This PR enables jobs configured with a custom sidecar_task to make
use of the `service.expose` feature for creating checks on services
in the service mesh. Before we would check that sidecar_task had not
been set (indicating that something other than envoy may be in use,
which would not support envoy's expose feature). However Consul has
not added support for anything other than envoy and probably never
will, so having the restriction in place seems like an unnecessary
hindrance. If Consul ever does support something other than Envoy,
they will likely find a way to provide the expose feature anyway.

Fixes #9854
2021-02-09 10:49:37 -06:00
Drew Bailey b5585882e4
address pr comments 2021-02-08 13:43:05 -05:00
Drew Bailey 8507d54e3b
e2e test for on_update service checks
check_restart not compatible with on_update=ignore

reword caveat
2021-02-08 08:32:40 -05:00
Drew Bailey 82f971f289
OnUpdate configuration for services and checks
Allow for readiness type checks by configuring nomad to ignore warnings
or errors reported by a service check. This allows the deployment to
progress and while Consul handles introducing the sercive into a
resource pool once the check passes.
2021-02-08 08:32:40 -05:00
Alex Chan 768c02eaff
Correct the spelling of heirarchical/hierarchical (#9980) 2021-02-05 09:23:30 -06:00
Tim Gross 9701d292ce docs: remove mbits examples from documentation 2021-02-02 10:10:44 -05:00
James Rasell 9c0c75b226
docs: clarify where variables can be placed with HCLv2. 2021-01-26 12:29:58 +01:00
Seth Hoenig 8b05efcf88 consul/connect: Add support for Connect terminating gateways
This PR implements Nomad built-in support for running Consul Connect
terminating gateways. Such a gateway can be used by services running
inside the service mesh to access "legacy" services running outside
the service mesh while still making use of Consul's service identity
based networking and ACL policies.

https://www.consul.io/docs/connect/gateways/terminating-gateway

These gateways are declared as part of a task group level service
definition within the connect stanza.

service {
  connect {
    gateway {
      proxy {
        // envoy proxy configuration
      }
      terminating {
        // terminating-gateway configuration entry
      }
    }
  }
}

Currently Envoy is the only supported gateway implementation in
Consul. The gateay task can be customized by configuring the
connect.sidecar_task block.

When the gateway.terminating field is set, Nomad will write/update
the Configuration Entry into Consul on job submission. Because CEs
are global in scope and there may be more than one Nomad cluster
communicating with Consul, there is an assumption that any terminating
gateway defined in Nomad for a particular service will be the same
among Nomad clusters.

Gateways require Consul 1.8.0+, checked by a node constraint.

Closes #9445
2021-01-25 10:36:04 -06:00
zzhai 899334f2f0 Update syntax.mdx
"one label"
should be the singular form.
2021-01-25 08:42:59 -05:00
Tim Gross 555d031283 docs: check_restart is now supported for group services 2021-01-22 10:55:40 -05:00
Mahmood Ali f03e67712a
docs: remove timestamp hcl2 function (#9867)
timestamp isn't actually implemented
2021-01-21 10:29:50 -05:00
Luiz Aoqui 6667d4c734
docs: fix broken link 2021-01-13 11:25:48 -05:00
Luiz Aoqui 226e442b32
docs: fix HCL2 doc page code block 2021-01-13 11:10:45 -05:00
Seth Hoenig 207fe378ce docs: update countdash examples to v3 2021-01-10 17:19:39 -06:00
Chulki Lee b7b23e9955 Fix HCL2 link 2021-01-08 08:19:06 -05:00
Nick Ethier 6705f845f2
Merge pull request #9739 from hashicorp/b-alloc-netmode-ports
Use port's to value when building service address under 'alloc' addr_mode
2021-01-07 09:16:27 -05:00
Kdu Bonalume 425ad5892d Fix missing link for Consul integration
Add a link back to configuration/consul in the `service` parameter section of the `group` stanza.
2021-01-07 09:02:43 -05:00
Nick Ethier 7a6aab10bb
Apply suggestions from code review
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-01-07 08:53:54 -05:00
Nick Ethier ab01e19df3 command/agent/consul: use port's to value when building service address under 'alloc' addr_mode 2021-01-06 13:52:48 -05:00
Jeff Escalante eaaafd9dd4
implement mdx remote 2021-01-05 19:02:39 -05:00