This PR adds the capability of running Connect Native Tasks on Nomad,
particularly when TLS and ACLs are enabled on Consul.
The `connect` stanza now includes a `native` parameter, which can be
set to the name of task that backs the Connect Native Consul service.
There is a new Client configuration parameter for the `consul` stanza
called `share_ssl`. Like `allow_unauthenticated` the default value is
true, but recommended to be disabled in production environments. When
enabled, the Nomad Client's Consul TLS information is shared with
Connect Native tasks through the normal Consul environment variables.
This does NOT include auth or token information.
If Consul ACLs are enabled, Service Identity Tokens are automatically
and injected into the Connect Native task through the CONSUL_HTTP_TOKEN
environment variable.
Any of the automatically set environment variables can be overridden by
the Connect Native task using the `env` stanza.
Fixes#6083
> If you do not run Nomad as root, make sure you add the Nomad user to the Docker group so Nomad can communicate with the Docker daemon.
Changing the username in the example from `vagrant` to `nomad`. Vagrant isn't addressed in the entire document, so I guess that this was a mistake.
The tasklet passes the timeout for the script check into the task
driver's `Exec`, and its up to the task driver to enforce that via a
golang `context.WithDeadline`. In practice, this deadline is started
before the task driver starts setting up the execution
environment (because we need it to do things like timeout Docker API
calls).
Under even moderate load, the time it takes to set up the execution
context for the script check regularly exceeds a full second or
two. This can cause script checks to unexpected timeout or even never
execute if the context expires before the task driver ever gets a
chance to `execve`.
This changeset adds a notice to operators about setting script check
timeouts with plenty of padding and what to monitor for problems.
Creating a FAQ question to provide a home for additional context around
bootstrapping. Linking from API page to `default_server_config`
attribute. Added sample API response to to discuss "Updated: false"
The MVP for CSI in the 0.11.0 release of Nomad did not include support
for opaque volume parameters or volume context. This changeset adds
support for both.
This also moves args for ControllerValidateCapabilities into a struct.
The CSI plugin `ControllerValidateCapabilities` struct that we turn
into a CSI RPC is accumulating arguments, so moving it into a request
struct will reduce the churn of this internal API, make the plugin
code more readable, and make this method consistent with the other
plugin methods in that package.
CSI plugins can require credentials for some publishing and
unpublishing workflow RPCs. Secrets are configured at the time of
volume registration, stored in the volume struct, and then passed
around as an opaque map by Nomad to the plugins.
This changeset implements a periodic garbage collection of CSI volumes
with missing allocations. This can happen in a scenario where a node
update fails partially and the allocation updates are written to raft
but the evaluations to GC the volumes are dropped. This feature will
cover this edge case and ensure that upgrades from 0.11.0 and 0.11.1
get any stray claims cleaned up.
A few connect examples reference a fake 'test/test' image.
By replacing those with `hashicorpnomad/counter-api` we can
turn them into runnable examples.
Promote the Connect ACLs guide on the jobspec connect stanza docs
page. This was suggested in a ticket after someone got stuck not
realizing they needed to enable Consul Intentions for their connect
enabled services, which is covered in the guide.
Replace the existing top example with something that is directly
runnable on a `-dev-connect` nomad setup.
Add the _complete_ `countdash` example at the bottom in the
examples section, so that people do not need to go guide-hunting
to find a complete example. The hope is people will see more
runnable examples and be less afraid of connect.
This change replaces the top example for expose path configuration with
two new runnable examples. Users should be able to copy and paste those
jobs into a job file and run them against a basic connect enabled nomad
setup.
The example presented first demonstrates use of the service check expose
parameter with no dynamic port explicitly defined (new to 0.11.2).
This is expected to be the "90%" use case of users, and so we should
try to emphasise this pattern as best practice.
The example presented second demonstrates achieving the same goal as the
first exmaple, but utilizing the full plumbing available through the
`connect.proxy.expose` stanza. This should help readers comprehend what
is happening "under the hood".
This changeset implements a periodic garbage collection of unused CSI
plugins. Plugins are self-cleaning when the last allocation for a
plugin is stopped, but this feature will cover any missing edge cases
and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins
cleaned up.
The Datadog agent was rewritten in Go from version 6. This means
the codebase resides in a new GitHub repository. This change
updates the Nomad telemetry configuration documentation to point
to the latest repo.
This page has not been updated (yet) to reflect that support for all 3 job types (service, batch, system) which shipped in 0.9.2.
The current page implies that preemption is only available for system jobs.
This is early preparation for Nomad 0.12, where we plan to move Preemption from Enterprise feature suite to OSS for all.
* update `restart` documentation
#7288 added support for task-specific `restart` policy. this PR updates the docs to reflect that.
* added an explicit example of task-specific restart policy
* Update website/pages/docs/job-specification/restart.mdx
This changeset adds documentation changes for the new
`proxy.expose` stanza as well as the `check.expose` parameter.
The examples are centered around proposed changes for the "countdash"
`dashboard-service` in [demo-consul-101](github.com/hashicorp/demo-consul-101/pull/6).
The dashboard service will now serve two additonal endpoints
- `/health/api`
- `/metrics`
which should serve nicely as expose-able paths.
docs for audit logging
enterprise only note
enterprise page audit blurb
add audit to sidebar
add general options audit info
fix name for docs-navigation
link to docs
more docs
add audit log entry examples
fix audit link
clean up audit log docs
Includes:
* changes `nomad alloc status` and `nomad node status`
* changes to `volume` block in jobspec
* new `csi_plugin` block in jobspec
* new `nomad volume register/deregister/status` commands
* new `nomad plugin status` command
Previously the S3 artifact documentation only referenced using
options populated with access credentails when pulling from a
bucket that required authentication. This change upadtes the doc
to also detail using IAM instance profiles.
We mistakenly reference `CONSUL_TOKEN` where we should be using `CONSUL_HTTP_TOKEN`.
There is one case where `CONSUL_TOKEN` is correct - client.mdx there is a list of
default filtered environment variables, which for legacy reasons `CONSUL_TOKEN` will
always be one.